The emotional argument is pretty good, I think, but it raises the question of what it will look like when we build a limbic system for robots. Emotion is adaptive because it’s necessary for optimizing utility, so I expect certain behavioral aspects of mammalian limbic systems will be needed for machines to integrate well with humans. In language models, those behavioral mechanisms are already somewhat encoded in the learned weight matrices.
We just don’t have a factual basis for claims of consciousness that go beyond “I think, therefore I am.”
As for the simplistic mechanism, I agree that token prediction doesn’t constitute consciousness, in the same way that a Turing machine is not, by itself, a web browser.
Both require software to become something.
For LLMs, that software is the set of weight matrices created in the training process. It is a very complex algorithm that encodes a substantial subset of human culture.
Data and algorithms are interchangeable: any algorithm over a finite domain can be performed by a pure lookup table, and any lookup table can be reproduced by a pure algorithm. Data == computation. (Strictly, the table direction requires a finite domain, which an LLM has: a finite vocabulary and a finite context window.) For LLMs, the algorithm is contained in an n-dimensional lookup table of vectors.
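To make that interchangeability concrete, here’s a minimal sketch in Python (the function and table names are illustrative, not from any library): the same behavior expressed once as computation and once as precomputed data.

```python
# Minimal sketch of algorithm <-> lookup-table equivalence over a finite domain.
# parity() and PARITY_TABLE are illustrative names, not from any library.

def parity(n: int) -> str:
    """Pure algorithm: compute the answer on demand."""
    return "even" if n % 2 == 0 else "odd"

# Pure data: the same behavior, precomputed into a lookup table.
DOMAIN = range(256)
PARITY_TABLE = {n: parity(n) for n in DOMAIN}

# Over the finite domain, table lookup and on-demand computation are
# indistinguishable from the outside -- the sense in which data == computation.
assert all(PARITY_TABLE[n] == parity(n) for n in DOMAIN)
```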
Having a fundamentally distinct mode of computational representation does not rule out equivalence.
Uncomfortable thoughts, but it’s where the logic leads.