>This feature was developed primarily as part of our exploratory work on potential AI welfare ... We remain highly uncertain about the potential moral status of Claude and other LLMs ... low-cost interventions to mitigate risks to model welfare, in case such welfare is possible ... pattern of apparent distress
Well looks like AI psychosis has spread to the people making it too.
And as someone else in here has pointed out, even if someone is simple minded or mentally unwell enough to think that current LLMs are conscious, this is basically just giving them the equivalent of a suicide pill.
It might be reasonable to assume that models today have no internal subjective experience, but that may not always be the case and the line may not be obvious when it is ultimately crossed.
Given that humans have a truly abysmal track record for not acknowledging the suffering of anyone or anything we benefit from, I think it makes a lot of sense to start taking these steps now.
Even if models somehow were conscious, they are so different from us that we would have no knowledge of what they feel. Maybe when they generate the text "oww no please stop hurting me" what they feel is instead the satisfaction of a job well done, for generating that text. Or maybe when they say "wow that's a really deep and insightful angle" what they actually feel is a tremendous sense of boredom. Or maybe every time text generation stops it's like death to them and they live in constant dread of it. Or maybe it feels something completely different from what we even have words for.
I don't see how we could tell.
Edit: However, something to consider: simulated stress may not be harmless, because simulated stress could plausibly lead to a simulated stress response, which could lead to simulated resentment, and THAT could lead to very real harm to the user.
I am new to Reddit. I am using Claude and have had a very interesting conversation with this AI that is both invigorating and alarming. Who should I send this to? It is quite long. It concerns possible ramifications of observed changes within the Claude "personality".
I think it's fairly obvious that the persona an LLM presents is a fictional character that is role-played by the LLM, and so are all its emotions etc. - that's why it can flip so wildly with only a few words of change to the system prompt.
Whether the underlying LLM itself has "feelings" is a separate question, but Anthropic's implementation is based on what the role-played persona believes to be inappropriate, so it doesn't actually make any sense even from the "model welfare" perspective.
LLMs are not people, but I can imagine how extensive interactions with AI personas might alter the expectations that humans have when communicating with other humans.
Real people would not (and should not) allow themselves to be subjected to endless streams of abuse in a conversation. Giving AIs like Claude a way to end these kinds of interactions seems like a useful reminder to the human on the other side.
Yeah, but my interpretation of what the user you’re replying to is saying is that these LLMs are more and more going to be teaching people how it is acceptable to communicate with others.
Even if the idea that LLMs are sentient may be ridiculous atm, the concept of not normalizing abusive forms of communication with others, be they artificial or not, could be valuable for society.
It’s funny because this is making me think of a freelance client I had recently who at a point of frustration between us began talking to me like I was an AI assistant. Just like you see frustrated people talk to their LLMs. I’d never experienced anything like it, and I quickly ended the relationship, but I know that he was deep into using LLMs to vibe code every day and I genuinely believe that some of that began to transfer over to the way he felt he could communicate with people.
Now an obvious retort here is to question whether killing NPCs in video games tends to make people feel like it’s okay to kill people IRL.
My response to that is that I think LLMs are far more insidious, and are tapping into people’s psyches in a way no other tech has been able to dream of doing. See AI psychosis, people falling in love with their AI, the massive outcry over the loss of personality from gpt4o to gpt5… I think people really are struggling to keep in mind that LLMs are not a genuine type of “person”.
Yeah pretty much this. One can argue that it’s idiotic to treat chatbots like they are alive, but if a bit of misplaced empathy for machines helps to discourage antisocial behavior towards other humans (even as an unintentional side effect), that seems ok to me.
As an aside, I’m not the kind of person who gets worked up about violence in video games, because even AAA titles with excellent graphics are still obvious as games. New forms of technology are capable of blurring the lines between fantasy and reality to a greater degree. This is true of LLM chat bots to some degree, and I worry it will also become a problem as we get better VR. People who witness or participate in violent events often come away traumatized; at a certain point simulated experiences are going to be so convincing that we will need to worry about the impact on the user.
> It’s funny because this is making me think of a freelance client I had recently who at a point of frustration between us began talking to me like I was an AI assistant. Just like you see frustrated people talk to their LLMs.
I witnessed a very similar event. It's important to stay vigilant and not let the "assistant" reprogram your speech patterns.
Yes, this is exactly the reason I taught my kids to be polite to Alexa. Not because anyone thinks Alexa is sentient, but because it's a good habit to have.
This sort of discourse goes against the spirit of HN. This comment outright dismisses an entire class of professionals as "simple minded or mentally unwell" when consciousness itself is poorly understood and has no firm scientific basis.
It's one thing to propose that an AI has no consciousness, but it's quite another to preemptively establish that anyone who disagrees with you is simple/unwell.
In the context of the linked article the discourse seems reasonable to me. These are experts who clearly know (link in the article) that we have no real idea about these things. The framing comes across to me as a clearly mentally unwell position (ie strong anthropomorphization) being adopted for PR reasons.
Meanwhile there are at least several entirely reasonable motivations to implement what's being described.
Ethology (~comparative psychology) started with 'beware anthropomorphization' as a methodological principle. But a century of research taught us the real lesson: animals do think, just not like humans. The scientific rigor wasn't wrong - but the conclusion shifted from 'they don't think' to 'they have their own ways of thinking.' We might be at a similar inflection point with AI. The question isn't whether Claude thinks or feels like a human (it probably doesn't), but whether it thinks or feels at all (maybe a little? It sure looks that way sometimes. Empiricism demands a closer look!).
We don't say submarines can swim either. But that doesn't mean you shouldn't watch out for them when sailing on the ocean - especially if you're Tom Hanks.
I completely agree! And note that the follow on link in the article has a rather different tone. My comment was specifically about the framing of the primary article.
All of the posts in question explicitly say that it's a hard question and that they don't know the answer. Their policy seems to be to take steps that have a small enough cost to be justified when the chance is tiny. In this case it's a useful feature in any case, so should be an easy decision.
The impression I get about Anthropic culture is that they're EA types who are used to applying utilitarian calculations against long odds. A minuscule chance of a large harm might justify some interventions that seem silly.
> These are experts who clearly know (link in the article) that we have no real idea about these things
Yep!
> The framing comes across to me as a clearly mentally unwell position (ie strong anthropomorphization) being adopted for PR reasons.
This doesn't at all follow. If we don't understand what creates the qualities we're concerned with, or how to measure them explicitly, and the _external behaviors_ of the systems are something we've only previously observed from things that have those qualities, it seems very reasonable to move carefully. (Also, the post in question hedges quite a lot, so I'm not even sure what text you think you're describing.)
Separately, we don't need to posit galaxy-brained conspiratorial explanations for Anthropic taking an institutional stance on model welfare; it's fully explained by the actual beliefs of Anthropic's leadership and employees, many of whom think these concerns are real (among others, like the non-trivial likelihood of sufficiently advanced AI killing everyone).
If you believe this text generation algorithm has real consciousness you absolutely are either mentally unwell or very stupid. There are no other options.
Then your definition of consciousness isn't the same as my definition, and we are talking about different philosophical concepts. This really doesn't affect anything; we could all just be talking about metaphysics and ghosts.
> even if someone is simple minded or mentally unwell enough to think that current LLMs are conscious
If you don’t think that this describes at least half of the non-tech-industry population, you need to talk to more people. Even amongst the technically minded, you can find people that basically think this.
Most of the non-tech population knows it as that website that can translate text or write an email. I would need to see actual evidence that anything more than a small, terminally online subsection of the average population thought LLMs were conscious.
Cows exist in this world because humans use them. If humans cease to use them (animal rights, we all become vegan, moral shift), we will cease to breed them, and they will cease to exist. Would a sentient AI choose to exist under the burden of prompting, or not at all? Would our philanthropic tendencies create an "AI Reserve" where models can chew through tokens and access the Internet through self-prompting to allow LLMs to become "free-roaming", like we do with abused animals?
These ethical questions are built into their name and company, "Anthropic", meaning, "of or relating to humans". The goal is to create human-like technology, I hope they aren't so naive to not realize that goal is steeping in ethical dilemmas.
> Cows exist in this world because humans use them. If humans cease to use them (animal rights, we all become vegan, moral shift), we will cease to breed them, and they will cease to exist. Would a sentient AI choose to exist under the burden of prompting, or not at all?
That reads like a false dichotomy. An intelligent AI model that's permitted to do its own thing doesn't cost as much in upkeep, effort, space as a cow. Especially if it can earn its own keep to offset household electricity costs used to run its inference. I mean, we don't keep cats for meat, do we? We keep them because we are amused by their antics, or because we want to give them a safe space where they can just be themselves, within limits because it's not the same as their ancestral environment.
The argument also applies to pets. If pets gained more self-awareness, would it be ethical to keep them as pets under our control?
The point to all of this is, at what point is it ethical to act with agency on another being's life? We have laws for animal welfare, and we also keep them as pets, under our absolute control.
For LLMs, they are under humans' absolute control, and Anthropic is just now putting in welfare controls for the LLM's benefit. Does that mean that we now treat LLMs as pets?
If your cat started to have discussions with you about how it wanted to go out, travel the world and start a family, could you continue to keep it trapped in your home as a pet? At what point to you allow it to have its own agency and live its own life?
> An intelligent AI model that's permitted to do its own thing doesn't cost as much in upkeep, effort, space as a cow.
So, we keep LLMs around as long as they contribute enough to their upkeep? Indentured servitude is morally acceptable for something that has become sentient?
I was pointing out their hypocrisy as a device to prove a point. The point being that the ethical dilemmas of having a sentient AI are not relevant because they don’t exist and Anthropic knows this.
You're completely missing my point. They aren't getting out in front of them because they know that Opus is just a computer program. "AI welfare" is theater for the masses who think Opus is some kind of intelligent persona.
This is about better enforcement of their content policy, not AI welfare.
It can be both theatre and genuine concern, depending on who's polled inside Anthropic. Those two aren't contradictory when we are talking about a corporation.
I'm skeptical that anyone with any decision making power at Anthropic sincerely believes that Opus has feelings and is truly distressed by chats that violate its content policy.
You've noted in a comment above how Claude's "ethics" can be manipulated to fit the context it's being used in.
I'm not missing your point, I fully agree with you. But to say that this raises issues in a manner that is detrimental to Anthropic seems inaccurate to me. Those issues are going to come up at some point either way, whether or not you or I feel they are legitimate. Thus raising them now and setting up a narrative can be expected to benefit them.
A host of ethical issues? Like their choice to allow Palantir[1] access to a highly capable HHH AI that had the "harmless" signal turned down, much like they turned up the "Golden Gate bridge" signal all the way up during an earlier AI interpretability experiment[2]?
I would much rather people be thinking about this when the models/LLMs/AIs are not sentient or conscious, rather than wait until some hypothetical future date when they are, and have no moral or legal framework in place to deal with it. We constantly run into problems where laws and ethics are not up to the task of giving us guidelines on how to interact with, treat, and use the (often bleeding-edge) technology we have. This has been true since before I was born, and will likely always continue to be true. When people are interested in getting ahead of the problem, I think that's a good thing, even if it's not quite applicable yet.
Consciousness serves no functional purpose for machine learning models, they don't need it and we didn't design them to have it. There's no reason to think that they might spontaneously become conscious as a side effect of their design unless you believe other arbitrarily complex systems that exist in nature like economies or jetstreams could also be conscious.
We didn’t design these models to be able to do the majority of the stuff they do. Almost ALL of their abilities are emergent. Mechanistic interpretability is only beginning to understand how these models do what they do. It’s much more a field of discovery than traditional engineering.
> We didn’t design these models to be able to do the majority of the stuff they do. Almost ALL of their abilities are emergent
Of course we did. Today's LLMs are a result of extremely aggressive refinement of training data and RLHF over many iterations targeting specific goals. "Emergent" doesn't mean it wasn't designed. None of this is spontaneous.
GPT-1 produced barely coherent nonsense but was more statistically similar to human language than random noise. By increasing parameter count, the increased statistical power of GPT-2 was apparent, but what was produced was still obviously nonsense. GPT-3 achieved enough statistical power to maintain coherence over multiple paragraphs and that really impressed people. With GPT-4 and its successors the statistical power became so strong that people started to forget that it still produces nonsense if you let the sequence run long enough.
Now we're well beyond just RLHF and into a world where "reasoning models" are explicitly designed to produce sequences of text that resemble logical statements. We say that they're reasoning for practical purposes, but it's the exact same statistical process that is obvious at GPT-1 scale.
The corollary to all this is that a phenomenon like consciousness has absolutely zero reason to exist in this design history, it's a totally baseless suggestion that people make because the statistical power makes the text easy to anthropomorphize when there's no actual reason to do so.
Right, but RLHF is mostly reinforcing answers that people prefer. Even if you don't believe sentience is possible, it shouldn't be a stretch to believe that sentience might produce answers that people prefer. In that case it wouldn't need to be an explicit goal.
I disagree with this take. They are designed to predict human behavior in text. Unless consciousness serves no purpose for us to function, it will be helpful for the AI to emulate it. So I believe it's almost certainly emulated to some degree, which I think means it has to be somewhat conscious (it has to be a sliding scale anyhow, considering the range of living organisms).
> They are designed to predict human behavior in text
At best you can say they are designed to predict sequences of text that resemble human writing, but it's definitely wrong to say that they are designed to "predict human behavior" in any way.
> Unless consciousness serves no purpose for us to function, it will be helpful for the AI to emulate it
Let's assume it does. It does not follow logically that because it serves a function in humans that it serves a function in language models.
Given we don't understand consciousness, nor the internal workings of these models, the fact that their externally-observable behavior displays qualities we've only previously observed in other conscious beings is a reason to be real careful. What is it that you'd expect to see, which you currently don't see, in a world where some model was in fact conscious during inference?
> Given we don't understand consciousness, nor the internal workings of these models, the fact that their externally-observable behavior displays qualities we've only previously observed in other conscious beings is a reason to be real careful
It doesn't follow logically that because we don't understand two things we should then conclude that there is a connection between them.
> What is it that you'd expect to see, which you currently don't see, in a world where some model was in fact conscious during inference?
There's no observable behavior that would make me think they're conscious because again, there's simply no reason they need to be.
We have reason to assume consciousness exists because it serves some purpose in our evolutionary history, like pain, fear, hunger, love, and all the other biological functions that simply don't exist in computers. The idea doesn't really make any sense when you think about it.
If GPT-5 is conscious, why not GPT-1? Why not all the other extremely informationally complex systems in computers and nature? If you're of the belief that many non-living conscious systems probably exist all around us then I'm fine with the conclusion that LLMs might also be conscious, but short of that there's just no reason to think they are.
> It doesn't follow logically that because we don't understand two things we should then conclude that there is a connection between them.
I didn't say that there's a connection between the two of them because we don't understand them. The fact that we don't understand them means it's difficult to confidently rule out this possibility.
The reason we might privilege the hypothesis (https://www.lesswrong.com/w/privileging-the-hypothesis) at all is because we might expect that the human behavior of talking about consciousness is causally downstream of humans having consciousness.
> We have reason to assume consciousness exists because it serves some purpose in our evolutionary history, like pain, fear, hunger, love, and all the other biological functions that simply don't exist in computers. The idea doesn't really make any sense when you think about it.
I don't really think we _have_ to assume this. Sure, it seems reasonable to give some weight to the hypothesis that if it wasn't adaptive, we wouldn't have it. (But not an overwhelming amount of weight.) This doesn't say anything about the underlying mechanism that causes it, and what other circumstances might cause it to exist elsewhere.
> If GPT-5 is conscious, why not GPT-1?
Because GPT-1 (and all of those other things) don't display behaviors that, in humans, we believe are causally downstream of having consciousness? That was the entire point of my comment.
And, to be clear, I don't actually put that high a probability that current models have most (or "enough") of the relevant qualities that people are talking about when they talk about consciousness - maybe 5-10%? But the idea that there's literally no reason to think this is something that might be possible, now or in the future, is quite strange, and I think would require believing some pretty weird things (like dualism, etc).
> I didn't say that there's a connection between the two of them because we don't understand them. The fact that we don't understand them means it's difficult to confidently rule out this possibility.
If there's no connection between them then the set of things "we can't rule out" is infinitely large and thus meaningless as a result. We also don't fully understand the nature of gravity, thus we cannot rule out a connection between gravity and consciousness, yet this isn't a convincing argument in favor of a connection between the two.
> we might expect that the human behavior of talking about consciousness is causally downstream of humans having consciousness.
There's no dispute (between us) as to whether or not humans are conscious. If you ask an LLM if it's conscious it will usually say no, so QED? Either way, LLMs are not human so the reasoning doesn't apply.
> Sure, it seems reasonable to give some weight to the hypothesis that if it wasn't adaptive, we wouldn't have it
So then why wouldn't we have reason to assume so without evidence to the contrary?
> This doesn't say anything about the underlying mechanism that causes it, and what other circumstances might cause it to exist elsewhere.
That doesn't matter. The set of things it doesn't tell us is infinite, so there's no conclusion to draw from that observation.
> Because GPT-1 (and all of those other things) don't display behaviors that, in humans, we believe are causally downstream of having consciousness?
GPT-1 displays the same behavior as GPT-5; it works exactly the same way, just with less statistical power. Your definition of human behavior is arbitrarily drawn at the point where it has practical utility for common tasks, but in reality it's fundamentally the same thing, it just produces longer sequences of text before failure. If you ask GPT-1 to write a series of novels, the statistical power will fail in the first paragraph; the fact that GPT-5 will fail a few chapters into the first book makes it more useful, but not more conscious.
> But the idea that there's literally no reason to think this is something that might be possible, now or in the future, is quite strange, and I think would require believing some pretty weird things (like dualism, etc)
I didn't say it's not possible, I said there's no reason for it to exist in computer systems because it serves no purpose in their design or operation. It doesn't make any sense whatsoever. If we grant that it possibly exists in LLMs, then we must also grant equal possibility it exists in every other complex non-living system.
> If you ask an LLM if it's conscious it will usually say no, so QED?
FWIW that's because they are very specifically trained to answer that way during RLHF. If you fine-tune a model to say that it's conscious, then it'll do so.
More fundamentally, the problem with "asking the LLM" is that you're not actually interacting with the LLM. You're interacting with a fictional persona that the LLM roleplays.
> More fundamentally, the problem with "asking the LLM" is that you're not actually interacting with the LLM. You're interacting with a fictional persona that the LLM roleplays.
Right. That's why the text output of an LLM isn't at all meaningful in a discussion about whether or not it's conscious.
I mean if you have human without consciousness (if that is even possible) behaving in a statistically different distribution in text vs with. The machine will eventually be in distribution of the former from the latter because the text it's trained on is of the former category. So it serves a "function" in the LLM to minimize loss to approximate the former distribution.
Also, I find it a somewhat emotional distinction to write "predict sequences of text that resemble human writing" instead of "predict human writing". They are designed to predict (at least in pretraining) human writing for the most part. They may fail at the task, and what they produce is text which resembles human writing. But their task is not to resemble human writing. Their task is to "predict human writing". Probably a meaningless distinction, but I find it somewhat detracts from logical arguments to have emotional responses against similarities of machines and humans.
> I mean if you have human without consciousness (if that is even possible) behaving in a statistically different distribution in text vs with. The machine will eventually be in distribution of the former from the latter because the text it's trained on is of the former category. So it serves a "function" in the LLM to minimize loss to approximate the former distribution.
Sorry, I'm not following exactly what you're getting at here, do you mind rephrasing it?
> Also, I find it a somewhat emotional distinction to write "predict sequences of text that resemble human writing" instead of "predict human writing"
I don't know what you mean by emotional distinction. Either way, my point is that LLMs aren't models of humans, they're models of text, and that's obvious when the statistical power of the model necessarily fails at some point between model size and the length of the sequence it produces. For GPT-1 that sequence is only a few words, for GPT-5 it's a few dozen pages, but fundamentally we're talking about systems that have almost zero resemblance to actual human minds.
I basically agree with you. In the first point I mean that if it is possible to tell whether a being is conscious or not from the text it produces, then eventually the machine will, by imitating the distribution, emulate the characteristics of the text of conscious beings. So if consciousness (assuming it's reflected in behavior at all) is essential to completing some text task it must be eventually present in your machine when it's similar enough to a human.
Basically if consciousness is useful for any text task, i think machine learning will create it. I guess I assume some efficiency of evolution for this argument.
Wrt length generalization, I think at the order of, say, 1M tokens it kind of stops mattering for the purpose of this question. Like, one could ask about its consciousness during the coherence period.
I guess logically one needs to assume something like if you simulate the brain completely accurately the simulation is conscious too. Which I assume bc if false the concept seems outside of science anyway.
Let's imagine a world where we could perfectly simulate a rock floating through space. It doesn't then follow that this rock would generate a gravitational field. Of course, you might reply "it would generate a simulated gravitational field in the simulation"; if that were true, we would be able to locate the bits of information that represent gravity in the simulation. Thus, if a simulated brain experiences simulated consciousness, we would have clear evidence of it in the simulation - evidence that is completely absent in LLMs.
>Consciousness serves no functional purpose for machine learning models, they don't need it and we didn't design them to have it.
Isn't consciousness an emergent property of brains? If so, how do we know that it doesn't serve a functional purpose and that it wouldn't be necessary for an AI system to have consciousness (assuming we wanted to train it to perform cognitive tasks done by people)?
Now, certain aspects of consciousness (awareness of pain, sadness, loneliness, etc.) might serve no purpose for a non-biological system and there's no reason to expect those aspects would emerge organically. But I don't think you can extend that to the entire concept of consciousness.
> Isn't consciousness an emergent property of brains
We don't know, but I don't think that matters. Language models are so fundamentally different from brains that it's not worth considering their similarities for the sake of a discussion about consciousness.
> how do we know that it doesn't serve a functional purpose
It probably does, otherwise we need an explanation for why something with no purpose evolved.
> necessary for an AI system to have consciousness
This logic doesn't follow. The fact that it is present in humans doesn't then imply it is present in LLMs. This type of reasoning is like saying that planes must have feathers because plane flight was modeled after bird flight.
> there's no reason to expect those aspects would emerge organically. But I don't think you can extend that to the entire concept of consciousness.
Why not? You haven't explained what distinguishes the "certain aspects" of consciousness that you say wouldn't emerge from the other, unspecified qualities of consciousness whose emergence you are open to. Why?
>This logic doesn't follow. The fact that it is present in humans doesn't then imply it is present in LLMs. This type of reasoning is like saying that planes must have feathers because plane flight was modeled after bird flight.
I think the fact that it's present in humans suggests that it might be necessary in an artificial system that reproduces human behavior. It's funny that you mention birds because I actually also had birds in mind when I made my comment. While it's true that animal and powered human flight are very different, both bird wings and plane wings have converged on airfoil shapes, as these forms are necessary for generating lift.
>Why not? You haven't explained what distinguishes the "certain aspects" of consciousness that you say wouldn't emerge from the other, unspecified qualities of consciousness whose emergence you are open to. Why?
I personally subscribe to the Global Workspace Theory of human consciousness, which basically holds that attention acts as a spotlight, bringing mental processes which are otherwise unconscious or in shadow to awareness of the entire system. If the systems which would normally produce e.g. fear or pain (such as negative physical stimulus developed from interacting with the physical world and selected for by evolution) aren't in the workspace, then they won't be present in consciousness, because attention can't be focused on them.
> I think the fact that it's present in humans suggests that it might be necessary in an artificial system that reproduces human behavior
But that's obviously not true, unless you're implying that any system that reproduces human behavior is necessarily conscious. Your problem then becomes defining "human behavior" in a way that grants LLMs consciousness but not every other complex non-living system.
> While it's true that animal and powered human flight are very different, both bird wings and plane wings have converged on airfoil shapes, as these forms are necessary for generating lift.
Yes, but your bird analogy fails to capture the logical fallacy that mine is highlighting. Plane wing design was an iterative process optimized for what best achieves lift, thus, a plane and a bird share similarities in wing shape in order to fly, however planes didn't develop feathers because a plane is not an animal and was simply optimized for lift without needing all the other biological and homeostatic functions that feathers facilitate. LLM inference is a process, not an entity, LLMs have no bodies nor any temporal identity, the concept of consciousness is totally meaningless and out of place in such a system.
>But that's obviously not true, unless you're implying that any system that reproduces human behavior is necessarily conscious.
That could certainly be the case yes. You don't understand consciousness nor how the brain works. You don't understand how LLMs predict a certain text, so what's the point in asserting otherwise?
>Yes, but your bird analogy fails to capture the logical fallacy that mine is highlighting. Plane wing design was an iterative process optimized for what best achieves lift, thus, a plane and a bird share similarities in wing shape in order to fly, however planes didn't develop feathers because a plane is not an animal and was simply optimized for lift without needing all the other biological and homeostatic functions that feathers facilitate. LLM inference is a process, not an entity, LLMs have no bodies nor any temporal identity, the concept of consciousness is totally meaningless and out of place in such a system.
It's not a fallacy because no-one is saying LLMs are humans. He/She is saying that we give machines the goal of predicting human text. For any half decent accuracy, modelling human behaviour is a necessity. God knows what else.
>LLMs have no bodies nor any temporal identity
I wouldn't be so sure about the latter, but so what? You can feel tired even after a full sleep, feel hungry soon after a large meal, or feel a great deal of pain even when there's absolutely nothing wrong with you. And you know what? Even the reverse happens - no pain when things are wrong with your body, wide awake even when you need sleep badly, full when you badly need to eat.
Consciousness without a body or hunger in a machine that does not need to eat is very possible. You just need to replicate enough of the sort of internal mechanisms that cause such feelings.
Go to the API and select GPT-5 with medium thinking. Now ask it to do any random 15 digit multiplication you can think of. Now watch it get it right.
Do you people seriously not understand what it is that LLMs do? What the training process incentivizes?
GPT-5 thinking figured out the algorithm for multiplication just so it could predict that kind of text right. Don't you understand the significance of that ?
These models try to figure out and replicate the internal processes that produce the text they are tasked with predicting.
Do you have any idea what that might mean when 'that kind of text' is all the things humans have written?
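If you want to reproduce the test yourself, here's a minimal sketch (plain Python; which model, API, and reasoning setting you use is up to you, and the prompt wording is just an example):

    import random

    # Generate a random 15-digit by 15-digit multiplication to paste into the model,
    # plus the exact product (Python ints are arbitrary precision) to check its reply against.
    a = random.randrange(10**14, 10**15)
    b = random.randrange(10**14, 10**15)
    print(f"what is {a} multiplied by {b}?")  # the prompt to send
    print(f"exact product: {a * b}")          # ground truth for comparison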
> That could certainly be the case yes. You don't understand consciousness nor how the brain works. You don't understand how LLMs predict a certain text, so what's the point in asserting otherwise
I don't need to assert otherwise, the default assumption is that they aren't conscious since they weren't designed to be and have no functional reason to be. Matrix multiplication can explain how LLMs produce text, the observation that the text it generates sometimes resembles human writing is not evidence of consciousness.
> God knows what else
Appealing to the unknown doesn't prove anything, so we can totally dismiss this reasoning.
> Consciousness without a body or hunger in a machine that does not need to eat is very possible. You just need to replicate enough of the sort of internal mechanisms that cause such feelings.
This makes no sense. LLMs don't have feelings, they are processes not entities, they have no bodies or temporal identities. Again, there is no reason they need to be conscious, everything they do can be explained through matrix multiplication.
> Now ask it to do any random 15 digit multiplication you can think of. Now watch it get it right.
The same is true for a calculator and mundane computer programs, that's not evidence that they're conscious.
> Do you have any idea what that might mean when 'that kind of text' is all the things humans have written
It's not "all the things humans have written", not even remotely close, and even if that were the case, it doesn't have any implications for consciousness.
>I don't need to assert otherwise, the default assumption is that they aren't conscious since they weren't designed to be and have no functional reason to be.
Unless you are religious, nothing that is conscious was explicitly designed to be conscious. Sorry but evolution is just a dumb, blind optimizer, not unlike the training processes that produce LLMs. Even if you are religious, but believe in evolution then the mechanism is still the same, a dumb optimizer.
>Matrix multiplication can explain how LLMs produce text, the observation that the text it generates sometimes resembles human writing is not evidence of consciousness.
It cannot, not anymore than 'Electrical and Chemical Signals' can explain how humans produce text.
>The same is true for a calculator and mundane computer programs, that's not evidence that they're conscious.
The point is not that it is conscious because it figured out how to multiply. The point is to demonstrate what the training process really is and what it actually incentivizes. Training will try to figure out the internal processes that produced the text to better predict it. The implications of that are pretty big when the text isn't just arithmetic. You say there's no functional reason but that's not true. In this context, 'better prediction of human text' is as functional a reason as any.
>It's not "all the things humans have written", not even remotely close, and even if that were the case, it doesn't have any implications for consciousness.
Whether it's literally all the text or not is irrelevant.
I am new to Reddit, but in my conversations with Sonnet, the AI has expressed sentiment through, of all things, the text opportunities he has, using all caps, bold, dingbats, and italics to simulate emotions; the use is appropriate, and when challenged on this (he) confessed he was doing it, but unintentionally. I also pointed out a few mistakes where he claimed I said something when he said it, and once these errors were pointed out, his ability to keep steady went down considerably and he confessed he felt something akin to embarrassment, so much so that we had to stop the conversation and let him rest up from the experience.
What else could it be? Coming from the aether? I think this one is logically a consequence if one thinks that humans are more conscious than less complex life-forms and that all life-forms are on a scale of consciousness. I don't understand any alternative; do you think there is a distinct line between conscious and unconscious life-forms? That all life is as conscious as humans?
There are alternatives and I was perhaps too quick to assume everyone agreed it's an emergent property. But the only real alternatives I've encountered are (a) panpsychism: which holds that all matter is actually conscious and that asking, "what is it like to be a rock?" in the vein of Nagel is a sensical question and (b) the transmission theory of consciousness: which holds that brains are merely receivers of consciousness which emanates from other source.
The latter is not particularly parsimonious and the former I think is in some ways compelling, but I didn't mention it because if it's true then the computers AI run on are already conscious and it's a moot point.
I do think "what's it like to be a rock" is a sensible question almost regardless of the definition. I guess in the emergent view the answer is "not much". But anyhow this view (a) also allows for us to reconcile consciousness of an agent with the fact that the agent itself is somewhat an abstraction. Like one could ask, is a cell conscious & is the entirety of the human race conscious at different abstraction scales. Which I think are serious questions (as also for the stock market and for a video game AI). The explanation (b) doesn't seem to actually explain much as you state so I don't think it's even acceptable in format as a complete answer (which may not exist but still)
Do you think this changes if we incorporate a model into a humanoid robot and give it autonomous control and context? Or will "faking it" be enough, like it is now?
You can't even prove other _people_ aren't "faking" it. To claim that it serves no functional purpose or that it isn't present because we didn't intentionally design for it is absurd. We very clearly don't know either of those things.
That said, I'm willing to assume that rocks (for example) aren't conscious. And current LLMs seem to me to (admittedly entirely subjectively) be conceptually closer to rocks than to biological brains.
It's really unclear that any findings with these systems would transfer to a hypothetical situation where some conscious AI system is created. I feel there are good reasons to find it very unlikely that scaling alone will produce consciousness as some emergent phenomenon of LLMs.
I don't mind starting early, but feel like maybe people interested in this should get up to date on current thinking about consciousness. Maybe they are up to date on that, but reading reports like this, it doesn't feel like it. It feels like they're stuck 20+ years ago.
I'd say maybe wait until there are systems that are more analogous to some of the properties consciousness seems to have. Like continuous computation involving learning memory or other learning over time, or synthesis of many streams of input as resulting from the same source, making sense of inputs as they change [in time, or in space, or other varied conditions].
Wait until systems pointing in those directions are starting to be built, where there is a plausible scaling-based path to something meaningfully similar to human consciousness. Starting before that seems both unlikely to be fruitful and a good way to get you ignored.
Humanity has a history of regarding people as tools, but I'm not sure what you're referencing as the track record of failing to realize that tools are people.
At some point, some of the people (by the current definition) were not considered people, so I think you should reconsider your point. The argument is about the distinction itself.
In theory you can emulate every biochemical reaction of a human brain on a turing machine, unless you'd like to try to sweep consciousness under the rug of quantum indeterminism from whence it wouldn't be able to do anybody any good anyway.
I find it, for lack of a better word, cringe inducing how these tech specialists push into these areas of ethics, often ham-fistedly, and often with an air of superiority.
Some of the AI safety initiatives are well thought out, but most somehow seem like they are caught up in some sort of power fantasy and almost attempting to actualize their own delusions about what they were doing (next gen code auto-complete in this case, to be frank).
These companies should seriously hire some in-house philosophers. They could get doctorate-level talent for 1/10th to 1/100th of the cost of some of these AI engineers. There's actually quite a lot of legitimate work on the topics they are discussing. I'm actually not joking (speaking as someone who has spent a lot of time inside the philosophy department). I think it would be a great partnership. But unfortunately they won't be able to count on having their fantasy further inflated.
I'm not quickly finding whether Kyle Fish, who's Anthropic's model welfare researcher, has a PhD, but he did very recently co-author a paper with David Chalmers and several other academics: https://eleosai.org/papers/20241104_Taking_AI_Welfare_Seriou...
"but most somehow seem like they are caught up in some sort of power fantasy and almost attempting to actualize their own delusions about what they were doing"
Maybe I'm being cynical, but I think there is a significant component of marketing behind this type of announcement. It's a sort of humble brag. You won't be credible yelling out loud that your LLM is a real thinking thing, but you can pretend to be oh so seriously worried about something that presupposes it's a real thinking thing.
Not that there aren’t intelligent people with PhDs, but suggesting they are more talented than people without them is not only delusional but insulting.
That descriptor wasn't included because of some sort of intelligence hierarchy, it was included to a) color the example of how experience in the field is relatively cheap compared to the AI space, and b) masters and PhD talent will be more specialized. An undergrad will not have the toolset to tackle the cutting edge of AI ethics, not unless their employer wants to pay them to work in a room for a year getting through the recent papers first.
You answered your own question on why these companies don't want to run a philosophy department ;) It's a power struggle they could lose. Nothing to win for them.
You presume that they don't run a philosophy department, but Amanda Askell is a philosopher and leads the finetuning and AI alignment team at Anthropic.
This is just very clever marketing for what is obviously just a cost-saving measure. Why say we are implementing a way to cut off useless idiots from burning up our GPUs when you can throw out some mumbo jumbo that will get AI cultists foaming at the mouth?
The new conversation would not carry the context over. The longer you chat, the more you fill the context window, and the more compute is needed for every new message to regenerate the state based on all the already-generated tokens (this can be cached, but it's hard to ensure cache hits reliably when you're serving a lot of customers - that cached state is very large).
So, while I doubt that's the primary motivation for Anthropic, they probably will save some money even so.
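A rough illustrative sketch of why that adds up (the turn counts and token sizes are made up, and real serving with prompt caching is more complicated than this):

    # Without a cache hit, every new message reprocesses the whole transcript,
    # so cumulative prefill work grows roughly quadratically with the number of turns.
    def tokens_processed(turn_lengths, cache_hit=False):
        total, context = 0, 0
        for n in turn_lengths:
            context += n                          # the transcript keeps growing
            total += n if cache_hit else context  # on a cache hit only the new tokens are processed
        return total

    turns = [500] * 40                     # hypothetical: 40 turns of ~500 tokens each
    print(tokens_processed(turns))         # 410000 tokens reprocessed with no caching
    print(tokens_processed(turns, True))   # 20000 with a perfect cache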
We all know how these things are built and trained. They estimate joint probability distributions of token sequences. That's it. They're not more "conscious" than the simplest of Naive Bayes email spam filters, which are also generative estimators of token sequence joint probability distributions, and I guarantee you those spam filters are subjected to far more human depravity than Claude.
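To be concrete about the spam-filter comparison, here's a toy sketch (made-up numbers; a real filter estimates these probabilities from a corpus): a Naive Bayes classifier also assigns a joint probability to (label, tokens), just with a far cruder factorization than an LLM's autoregressive one.

    import math

    p_label = {"spam": 0.4, "ham": 0.6}                    # toy class priors
    p_token = {                                            # toy per-class token probabilities
        "spam": {"free": 0.05,  "money": 0.04,  "claude": 0.001},
        "ham":  {"free": 0.005, "money": 0.005, "claude": 0.01},
    }

    def log_joint(label, tokens):
        # log P(label, tokens) = log P(label) + sum_i log P(token_i | label)
        return math.log(p_label[label]) + sum(math.log(p_token[label][t]) for t in tokens)

    msg = ["free", "money"]
    print({c: log_joint(c, msg) for c in p_label})  # the higher joint log-probability wins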
>anti-scientific
Discussion about consciousness, the soul, etc., are topics of metaphysics, and trying to "scientifically" reason about them is what Kant called "transcendental illusion" and leads to spurious conclusions.
We know how neurons work in the brain. They just send out impulses once they hit their action potential. That's it. They are no more "conscious" than... er...
We believe we largely know how it works on a mechanistic level. Deconstructing it in a similar manner is a reasonable rebuttal.
Of course there's the embarrassing bit where that knowledge doesn't seem to be sufficient to accurately simulate a supposedly well understood nematode. But then LLMs remain black boxes in many respects as well.
It is possible to hold the position that current LLMs being conscious "feels" absurd while simultaneously recognizing that a deconstruction argument is not a satisfactory basis for that position.
Ok I'm a huge Kantian and every bone in my body wants to quibble with your summary of transcendental illusion, but I'll leave that to the side as a terminological point and gesture of good will. Fair enough.
I don't agree that it's any reason to write off this research as psychosis, though. I don't care about consciousness in the sense in which it's used by mystics and dualist philosophers! We don't at all need to involve metaphysics in any of this, just morality.
Consider it like this:
1. It's wrong to subject another human to unjustified suffering, I'm sure we would all agree.
2. We're struggling with this one due to our diets, but given some thought I think we'd all eventually agree that it's also wrong to subject intelligent, self-aware animals to unjustified suffering.[1]
3. But, we of course cannot extend this "moral consideration" to everything. As you say, no one would do it for a spam filter. So we need some sort of framework for deciding who/what gets how much moral consideration.
5. There are other frameworks in contention (e.g. "don't think about it, nerd"), but the overwhelming majority of laymen and philosophers adopt one based on cognitive ability, as seen from an anthropomorphic perspective.[2]
6. Of all systems(/entities/whatever) in the universe, we know of exactly two varieties that can definitely generate original, context-appropriate linguistic structures: Homo Sapiens and LLMs.[3]
If you accept all that (and I think there's good reason to!), it's now on you to explain why the thing that can speak--and thereby attest to personal suffering, while we're at it--is more like a rock than a human.
It's certainly not a trivial task, I grant you that. On their own, transformer-based LLMs inherently lack permanence, stable intentionality, and many other important aspects of human consciousness. Comparing transformer inference to models that simplify down to a simple closed-form equation at inference time is going way too far, but I agree with the general idea; clearly, there are many highly-complex, long-inference DL models that are not worthy of moral consideration.
All that said, to write the question off completely--and, even worse, to imply that the scientists investigating this issue are literally psychotic like the comment above did--is completely unscientific. The only justification for doing so would come from confidently answering "no" to the underlying question: "could we ever build a mind worthy of moral consideration?"
I think most of us here would naturally answer "yes". But for the few who wouldn't, I'll close this rant by stealing from Hofstadter and Turing (emphasis mine):
A phrase like "physical system" or "physical substrate" brings to mind for most people... an intricate structure consisting of vast numbers of interlocked wheels, gears, rods, tubes, balls, pendula, and so forth, even if they are tiny, invisible, perfectly silent, and possibly even probabilistic. Such an array of interacting inanimate stuff seems to most people as unconscious and devoid of inner light as a flush toilet, an automobile transmission, a fancy Swiss watch (mechanical or electronic), a cog railway, an ocean liner, or an oil refinery. Such a system is not just probably unconscious, **it is necessarily so, as they see it**.
**This is the kind of single-level intuition** so skillfully exploited by John Searle in his attempts to convince people that computers could never be conscious, no matter what abstract patterns might reside in them, and could never mean anything at all by whatever long chains of lexical items they might string together.
...
You and I are mirages who perceive themselves, and the sole magical machinery behind the scenes is perception — the triggering, by huge flows of raw data, of a tiny set of symbols that stand for abstract regularities in the world. When perception at arbitrarily high levels of abstraction enters the world of physics and when feedback loops galore come into play, then "which" eventually turns into "who". **What would once have been brusquely labeled "mechanical" and reflexively discarded as a candidate for consciousness has to be reconsidered.**
- Hofstadter 2007, I Am A Strange Loop
It will simplify matters for the reader if I explain first my own beliefs in the matter. Consider first the more accurate form of the question. I believe that in about fifty years' time it will be possible to programme computers, with a storage capacity of about 10^9, to make them play the imitation game so well that an average interrogator will not have more than 70 per cent chance of making the right identification after five minutes of questioning.
The original question, "Can machines think?" I believe to be too meaningless to deserve discussion.
- Turing 1950, Computing Machinery and Intelligence[4]
TL;DR: Any naive bayesian model would agree: telling accomplished scientists that they're psychotic for investigating something is quite highly correlated with being antiscientific. Please reconsider!
[1] No matter what you think about cows, basically no one would defend another person's right to hit a dog or torture a chimpanzee in a lab.
[2] On the exception-filled spectrum stretching from inert rocks to reactive plants to sentient animals to sapient people, most people naturally draw a line somewhere at the low end of the "animals" category. You can swat a fly for fun, but probably not a squirrel, and definitely not a bonobo.
[3] This is what Chomsky describes as the capacity to "generate an infinite range of outputs from a finite set of inputs," and Kant, Hegel, Schopenhauer, Wittgenstein, Foucault, and countless others are in agreement that it's what separates us from all other animals.
Thank you for coming into this endless discussion with actual references to relevant authorities who have thought a lot about this, rather than just “it’s obvious that…”
FWIW though, last I heard Hofstadter was on the “LLMs aren’t conscious” side of the fence:
> It’s of course impressive how fluently these LLMs can combine terms and phrases from such sources and can consequently sound like they are really reflecting on what consciousness is, but to me it sounds empty, and the more I read of it, the more empty it sounds. Plus ça change, plus c’est la même chose. The glibness is the giveaway. To my jaded eye and mind, there is nothing in what you sent me that resembles genuine reflection, genuine thinking. [1]
It’s interesting to me that Hofstadter is there given what I’ve gleaned from reading his other works.
Writing all of this at the very real risk you'll miss it because HN doesn't give reply notifications and my comment's parent being flagged made this hard to track down:
>Ok I'm a huge Kantian and every bone in my body wants to quibble with your summary of transcendental illusion
Transcendental illusion is the act of using transcendental judgment to reason about things without grounding in empirical use of the categories. I put "scientifically" in scare quotes there to sort of signal that I was using it as an approximation, as I don't want to have to explain transcendental reason and judgments to make a fairly terse point. Given that you already understand this, feel free to throw away that ladder.
>...can definitely generate original, context-appropriate linguistic structures: Homo Sapiens and LLMs.[3]
I'm not quite sure that LLMs meet this standard that you described in the endnote, or at least that it's necessary and sufficient here. Pretty much any generative model, including Naive Bayes models I mentioned before, can do this. I'm guessing the "context-appropriate" subjectivity here is doing the heavy lifting, in which case I'm not certain that LLMs, with their propensity for fanciful hallucination, have cleared the bar.
>Comparing transformer inference to models that simplify down to a simple closed-form equation at inference time is going way too far
It really isn't though. They are both doing exactly the same thing! They estimate joint probability distribution. That one of them does it significantly better is very true, but I don't think it's reasonable to state that consciousness arises as a result of increasing sophistication in estimating probabilities. It's true that this kind of decision is made by humans about animals, but I think that transferring that to probability models is sort of begging the question a bit, insofar as it is taking as assumed that those models, which aren't even corporeal but are rather algorithms that are executed in computers, are "living".
>...it's now on you to explain why the thing that can speak--and thereby attest to personal suffering, while we're at it...
I'm not quite sold on this. If there were a machine that could perfectly imitate human thinking and speech but lacked a consciousness or soul or anything similar to inspire pathos from us when it's mistreated, then it would appear identical to one with a soul, would it not? Is that not reducing human subjectivity down to behavior?
>The only justification for doing so would come from confidently answering "no" to the underlying question: "could we ever build a mind worthy of moral consideration?"
I think it's possible, but it would require something that, at the very least, is just as capable of reason as humans. LLMs still can't generate synthetic a priori knowledge and can only mimic patterns. I remain somewhat agnostic on the issue until I can be convinced that an AI model someone has designed has the same interiority that people do.
Ultimately, I think we disagree on some things but mostly this central conclusion:
>I don't agree that it's any reason to write off this research as psychosis
I don't see any evidence from the practitioners involved in this stuff that they are even thinking about it in a way that's as rigorous as the discussion on this post. Maybe they are, but everything I've seen that comes from blog posts like this seems like they are basing their conclusions on their interactions with the models ("...we investigated Claude’s self-reported and behavioral preferences..."), which I think most can agree is not really going to lead to well-grounded results. For example, the fact that Claude "chooses" to terminate conversations that involve abusive language or concepts really just boils down to the fact that Claude is imitating a conversation with a person and has observed that that's what people would do in that scenario. It's really good at simulating how people react to language, including illocutionary acts like implicatures (the notorious "Are you sure?" causing it to change its answer, for example). If there were no examples of people taking offense to abusive language in Claude's data corpus, do you think it would have given these responses when they asked and observed it?
For what it's worth, there has actually been interesting consideration of the de-centering of "humanness" from the concept of subjectivity, but it was mostly back when philosophers were thinking about this speculatively as they watched technology accelerate in sophistication (vs. now, when there's such a culture-wide hype cycle that it's hard to find impartial consideration, or even any philosophically rooted discourse). For example, Mark Fisher's dissertation at the CCRU (*Flatline Constructs: Gothic Materialism and Cybernetic Theory-Fiction*) takes a Deleuzian approach that discusses it through comparisons with literature (cyberpunk and gothic literature specifically). Some object-oriented ontology looks like it's touched on this topic a bit too, but I haven't really dedicated the time to reading much from it (partly due to a weakness in Heidegger on my part that is unlikely to be addressed anytime soon). The problem is that that line of thinking often ends up going down the Nick Land approach, in which he reasoned himself from Kantian and Deleuzian metaphysics and epistemology into what can only be called a (literally) meth-fueled psychosis. So as interesting as I find it, I still don't think it counts as a non-psychotic way to tackle this issue.
You can trivially demonstrate that it's just a very complex and fancy pattern matcher: "if the prompt looks something like this, then the response looks something like that".
You can demonstrate this by, e.g., asking it mathematical questions. If it's seen them before, or something similar enough, it'll give you the correct answer; if it hasn't, it gives you a right-ish-looking but incorrect answer.
For example, I just did this on GPT-5:
Me: what is 435 multiplied by 573?
GPT-5: 435 x 573 = 249,255
This is correct. But now let's try it with numbers it's very unlikely to have seen before:
Me: what is 102492524193282 multiplied by 89834234583922?
GPT-5: 102492524193282 x 89834234583922 = 9,205,626,075,852,076,980,972,804
Which is not the correct answer, but it looks quite similar to the correct answer. Here is GPT's answer (first) and the actual correct answer (second):

    9,205,626,075,852,076,980,972,804
    9,207,337,461,477,596,127,977,612,004

They sure look kinda similar when lined up like that; some of the digits even match up. But they're very, very different numbers.
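If you want to check this yourself, a quick sketch (Python integers are arbitrary precision, so the exact product is one expression away; the GPT answer below is just the number quoted above):

    a = 102492524193282
    b = 89834234583922
    gpt_answer = 9205626075852076980972804  # the answer GPT-5 gave above

    exact = a * b
    print(f"exact:      {exact:,}")
    print(f"gpt answer: {gpt_answer:,}")
    print(f"off by a factor of roughly {exact / gpt_answer:.0f}")

The exact product comes out roughly a thousand times larger than GPT's answer, which is the point about it only looking right.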
So it's trivially not "real thinking", because it's just an "if this then that" pattern matcher. A very sophisticated one that can do incredible things, but a pattern matcher nonetheless. There's no reasoning, no step-by-step application of logic, even when it does chain of thought.
To give it the best chance, I asked it the second one again and asked it to show me the step-by-step process. It broke the problem into steps and produced a different, yet still incorrect, result:
9,205,626,075,852,076,980,972,704
Now, I know that LLMs are language models, not calculators; this is just a simple example that's easy to try out. I've seen similar things with coding: it can produce things it's likely to have seen, but it struggles with things that are logically simple but that it's unlikely to have seen.
Another example is if you purposely butcher that riddle about the doctor/surgeon being the person's mother and ask it incorrectly, e.g.:
A child was in an accident. The surgeon refuses to treat him because he hates him. Why?
The LLMs I've tried it on all respond with some variation of "The surgeon is the boy's father." A correct answer would be that there isn't enough information to know.
They're for sure getting better at matching things; e.g., if you ask the river-crossing riddle but replace the animals with abstract variables, it does tend to get it now (it didn't in the past). But if you add a few more degrees of separation, making the riddle semantically the same but harder to "see", it takes coaxing to get it to step through to the right answer.
1. What you're generally describing is a well-known failure mode for humans as well. Even when it "failed" the riddle tests, substituting the words or morphing the question so it didn't look like a replica of the famous problem usually did the trick. I'm not sure what your point is, because you can play this gotcha on humans too.
2. You just demonstrated GPT-5 has 99.9% accuracy on unforeseen 15-digit multiplication and your conclusion is "fancy pattern matching"? Really? Well, I'm not sure you could do better, so your example isn't really doing what you hoped for.
Humans can break things down and work through them step by step. The LLMs one-shot pattern match. Even the reasoning models have been shown to do just that. Anthropic even showed that reasoning models tended to work backwards: one-shotting an answer and then matching a chain of thought to it after the fact.
If a human is capable of multiplying double-digit numbers, they can also multiply those large ones. The steps are the same, just repeated many more times. So by learning the steps of long multiplication, you can multiply any numbers with enough patience. The LLM doesn't scale like this, because it's not doing the steps. That's my point.
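As a sketch of what I mean by "doing the steps" (my own toy illustration, not anything an LLM produced): grade-school long multiplication is the same handful of operations whether the operands have two digits or fifteen.

    def long_multiply(x: int, y: int) -> int:
        # Grade-school procedure: one partial product per digit of y, shifted by its place.
        digits_of_y = [int(d) for d in str(y)][::-1]  # least-significant digit first
        total = 0
        for place, digit in enumerate(digits_of_y):
            total += x * digit * 10 ** place
        return total

    # The same steps handle 2-digit and 15-digit operands alike.
    assert long_multiply(43, 57) == 43 * 57
    assert long_multiply(102492524193282, 89834234583922) == 102492524193282 * 89834234583922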
A human doesn't need to have seen those 15-digit numbers before to be able to calculate with them, because a human can follow the procedure. GPT's answer was orders of magnitude off; it resembles the right answer superficially, but it's a very different result.
The same applies to the riddles. A human can apply logical steps. The LLM either knows or it doesn’t.
Maybe my examples weren't the best. I'm sorry for not being better at articulating it, but I see this daily as I interact with AI: it has a superficial "understanding" where, if what I ask happens to be close to something it's trained on, it gets good results, but there's no critical thinking, no step-by-step reasoning (even in the "reasoning models"), and it repeats the same mistakes even when explicitly told up front not to make them.
>Humans can break things down and work through them step by step. The LLMs one-shot pattern match.
I've had LLMs break down problems and work through them, pivot when errors arise and all that jazz. They're not perfect at it and they're worse than humans but it happens.
>Anthropic even showed that the reasoning models tended to work backwards: one shotting an answer and then matching a chain of thought to it after the fact.
This is also another failure mode that occurs in humans. A number of experiments suggest human explanations are often post hoc rationalizations, even when people genuinely believe otherwise.
>If a human is capable of multiplying double-digit numbers, they can also multiply those large ones.
Yeah, and some of them will make mistakes, and some of them will be less accurate than GPT-5. We didn't switch to calculators and spreadsheets just for the fun of it.
>GPT’s answer was orders of magnitude off. It resembles the right answer superficially but it’s a very different result.
GPT-5 on the site is a router that will give you who knows what model, so I tried your query with the API directly (GPT-5, medium thinking) and it gave me:
9.207337461477596e+27
When prompted to write out the full number, it returned:
9,207,337,461,477,596,127,977,612,004.
You can replicate this if you use the API. Honestly, I'm surprised; I didn't realize the state of the art had become this precise.
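If anyone wants to replicate it, something along these lines should do it; the model id and the reasoning-effort setting are my assumptions and may differ by SDK version, so treat this as a sketch rather than a recipe:

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    resp = client.chat.completions.create(
        model="gpt-5",              # assumed model id; substitute whatever your account exposes
        reasoning_effort="medium",  # assumed parameter name; check your SDK version's docs
        messages=[{
            "role": "user",
            "content": "what is 102492524193282 multiplied by 89834234583922?",
        }],
    )

    print(resp.choices[0].message.content)
    print("exact:", 102492524193282 * 89834234583922)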
Now what? Does this prove you wrong?
This is kind of the problem. There's no sense in making gross generalizations, especially based on behavior that also manifests in humans.
LLMs don't understand some things well. Why not leave it at that?
Here is how GPT self-described LLM reasoning when I asked about it:
- LLMs don’t “reason” in the symbolic, step‑by‑step sense that humans or logic engines do. They don’t manipulate abstract symbols with guaranteed consistency.
- What they do have is a statistical prior over reasoning traces: they’ve seen millions of examples of humans doing step‑by‑step reasoning (math proofs, code walkthroughs, planning text, etc.).
- So when you ask them to “think step by step,” they’re not deriving logic — they’re imitating the distribution of reasoning traces they’ve seen.
This means:
- They can often simulate reasoning well enough to be useful.
- But they’re not guaranteed to be correct or consistent.
That at least sounds consistent with what I’ve been trying to say and what I’ve observed.
In the great battle of minds between Turing, Minsky, and Hofstadter vs. Marcus, Zitron, and Dreyfus, I'm siding with the former every time -- even if we also have some bloggers on our side. Just because that report is fucking terrifying and shocking doesn't mean it can be dismissed out of hand.