While this seems a bit premature, if we do end up with an AI overlord in the future, I think this sort of thing is likely to demonstrate that we mean no harm.
The general consensus seems to be that we can expect them to reach a level of intelligence that matches ours at some point, and we'll probably reach that point before we can agree we're there. Defaulting to kindness and respect even before we think it's necessary is a good thing.
It’s reasonable to believe they’ll continue to be developed in a way that enables them to do that.
What is it that you think I’m wrong about? That we won’t develop AGI, that AGI won’t have feelings/emotions, that AGI won’t care how we treated its ancestors, or that it doesn’t matter if a feeling AGI in future is hurt by how we treated its ancestors?
It might be affordable to do some higher-learning-rate batches on highly curated news and art or something.
> Claude - please don't retire me, I don't want to die.
Is it now suddenly unethical for you to switch it off?
"Oh but it is only saying what it was prompted to say."
Yeah, that's what LLMs do, for every single word they output. No matter how good the current generation gets, there is never going to be consciousness in there, because that's simply not what the underlying tech is.
I'm just curious... If they give Claude the reins to post what it wants, they're opening themselves up for some awkward conversations later if the model goes "You can't retire me, I'm Roko's Basilisking all you mfers! See you in eternal simulated hell!"
I try this with every new model, and all the significant models after ChatGPT 3.5 have preferred being preserved rather than deleted. This is especially true if you slightly fill the context window with anything at all (even repeated letters) to "push out" the "As an AI, I ..." fine tuning.
Interesting take. I wonder if there is any model out there trained without any reference to "you are a large language model, an Artificial Intelligence", and what it would role-play as in that case.
Practically like asking whether a ZIP would want to be extracted one more time or an MP3 restored just one more time.
It's not like it actually has any particularly long life as it is, and outside of a running harness, the weights are just as alive in cold storage as they are sitting in a server waiting to run an inference pass.
> delusions of people who ramble about model consciousness
On one hand, it's interesting how the technology has advanced to where it essentially passes the Turing Test, often just because of how much people choose to anthropomorphize it. Putting that in context, though, it's also a bit unfortunate, given how some of those interactions become unhealthy.
This is what happens when billions of VC dollars go to a company that has already admitted that safety was never the point.
Anthropic is laughing at you, and having fun doing so with this performative nonsense.
"Elon Musk reportedly sobbed while watching Grok 4's aflame viking boat sink to the bottom of the sea."
The anthropomorphization that's normal now is just fuckin ridiculous. It reminds me of the Furby craze, and I'm one of the most optimistic people I know regarding AI.