This phrase is reminiscent of the language of mereological nihilism, where they say that there are no chairs, only "atoms arranged chair-wise". Intresting distinction, perhaps properly backed by rigorous arguments, but not the kind of language anyone would use casually, or even professionally for long time-period.
Why is it reiterated all the time? Is "anthromorphism" that dangerous? I don't see why we can't have hostile "Sydneys" when we have hostile design, hostile spaces, hostile cities etc.
The way anthropomorphism can be problematic is if it causes a human to react with a reflex consideration for the (simulated) feelings of the machine. Ultimately the behavior of this devise is programmed to maximize the profits of Microsoft - imagine someone buying a product recommended by ChatGPT because "otherwise Sydney would be sad".
Also (edit)
This phrase is reminiscent of the language of mereological nihilism, where they say that there are no chairs, only "atoms arranged chair-wise".
Not really. If I replace your car's engine with a block of wood carves in the shape of an engine, I haven't changed things "only in a matter of speaking".
A Chat bot repeating "nice" or "hostile" phrases does not have internal processes that causes a human to type or say such phrases and so it's future behavior may well be different. Being "nice" may indeed cause the thing repeat "nice" things to you but it's not going to actually "like" you, indeed it's memory of you is gone at the end of the interaction and it's whole "attitude" is changeable by various programmatic actions.
I think this is wrong, because in general, when analogy is good, it is typically good because of the tendency toward allowing for reflex responses. It can't be good and bad for the same reason. It needs to be for a different reason or there isn't logical consistency.
I'll try to explain what I mean by that in an empirical context so you can observe that my model makes general predictions about cognition related to analogical reasoning.
If you have an agent with a lookup table that is the perfect bayesian estimates versus an agent which has to compute the perfect bayesian estimates and there is an aspect of judgement related to time to response - which is a very true aspect of our reality - reflex agents actually out-compete the bayesian agent because they get the same estimate, but minimize response time.
So it can't be the reflex itself which makes an analogical structure bad, since that is also what makes it good. It has to be something else, something which is separate from the reflex itself and tied to the observed utilities as a result of that reflex.
> imagine someone buying a product recommended by ChatGPT because "otherwise Sydney would be sad".
Okay. Lets do that.
If Sydney claims that they would be sad if you don't eat the right amount of vitamin C after you describe symptoms of scurvy, it actually isn't unreasonable to take vitamin C. If you did that, because she said she would be sad, presumably you would be better off. Your expected utilities are better, not worse, by taking vitamin C.
> programmed to maximize the profits of Microsoft
This isn't the objective function of the model. That it might be an objective for people who worked on it does not mean that its responses are congruent with actually doing this.
---
I think to fix your point you would need to change it something like "The way anthropomorphism can be problematic is if it causes a human to react with a reflex consideration for the (simulated) feelings of the machine and this behavior ultimately results in negative utility. Ultimately the behavior of the large language model is learned weights which optimize an objective function that corresponds to seeming like a proper response such that it gets good feedback from humans - so imagine someone getting bad advice that seems reasonable and acting on it, like a code change proposal that on first glance looks good, but in actuality has subtle bugs. Yet, when questioning for the presence of bugs, Sydney implies that not trusting their code to work makes them sad... so the person commits the change without testing it thoroughly. Later, the life support has a race condition as a result of the bug. A hundred people die over ten years before the root cause is determined. No one is sure what other deaths are going to happen, because the type of mistake is one that humans didn't make, but AI do, so people aren't used to seeing it."
I think this is better because it actually ties things to the utilities, rather than the speed of the decision making. You can't generalize speed being bad. It fails in most generalized contexts. You can generalize bad utilities being bad.
That's some weird reasoning. Human emotions are crucial to human existence but we know they also can have bad results. But when emotions are useful to us, it's because we know other people will react similarly to us in a consistent manner. When they're bad, it's generally because someone understands and is using a reaction to get something unrelated to our personal needs and desires.
>> ...programmed to maximize the profits of Microsoft
> This isn't the objective function of the model. That it might be an objective for people who worked on it does not mean that its responses are congruent with actually doing this.
It will be. You can observe the evolution of Google's search system and it has converged to it's current of pushing stuff to sell before everything else. The charter of a public company is maximizing returns to share holders. That is the task of the entire organization
--> You're fixing of my argument is OK but it's pretty easy to imagine it and others from the initial argument imo.
Yeah, probably it will evolve in that direction. I could imagine that happening.
> That's some weird reasoning.
In the AI textbooks I've read, reflex is defined in the context of a reflex agent. You would have sentences like "a reflex agent reacts without thinking" and then an example of that might be "a human who puts their hand on a stove yanks it away without thinking about it" and this is rational because the decision problem doesn't call for correct cognition - it calls for minimization of response time such that the hand isn't burned. To me, when you say reflex decision making is the reason for the danger, it seems to me that this is an inconsistent reason because for other decision making problems, reflex is a help, not a hindrance. I do not consider it wrong to or weird reasoning to use definitions sourced from AI research. I think, given your confusion at my post, you probably weren't intending to argue that being faster means being wrong, but the structure of your reply read that way to me because of the strong association I have for that word and reflex as it relates to optimal decision making by an AI under time constraints. I also think is what you actually said, even if you didn't intend to, but I don't doubt you if you say you meant it another way, because language is imprecise enough that we have to arrive on shared definitions in order to understand each other and it is by no means certain that we start on shared definitions.
I'm also kind of way too literal sometimes. Side-effect of being a programmer, I suppose. And I take this subject way too seriously, because I agree with Paul Graham about surface area of a general idea multiplying impact potential. So I'm trying really really really hard to think well - uh, for example, I've been thinking about this almost continuously whenever I reasonably could ever since my first reply, unable to stop.
It is 1:32 AM for me. I'm taking multiple continuous hours of thinking about this and writing about this and trying to be clear in my thinking about this, because I find it so important. So hopefully that gets across how I am as a person - even if it makes me seem really weird.
> You're fixing of my argument is OK but it's pretty easy to imagine it and others from the initial argument imo.
I'm really trying to drive at the deeper fundamental truths. I feel like logic and analogy are really important and profound and worthy of countless hours of thought about and that the effort will ultimately be rewarded.
Anthromorphism is an instance of thinking via proxy by analogy to another structure. The biggest issue with it is that it carries with it far more baggage. For something like mathematics, you are dropping units: three apples plus three apples to six apples is pretty easy to justify analogically as three unitless plus three unitless to six unitless. The analogical similarity is obvious. For agents, well, it isn't so clear whether analogies are justified. They could be, but there is a lot more that could go wrong because there are so many more assumptions that the analogy is making. As you get more complicated structures, you have more room for error, so you have more tendency to error. So even though analogy is fine, the greater potential for error makes the lazy detector just classify this analogical approach as fallacious. However, it might not be and it might not even be dangerous.
Typically when people disagree with anthropomorphism they do so because the transitional structure isn't similar enough to justify the analogy. For example, one of the more infamous dangers is wasting resources and time seeking intervention from a non-agentic being, like a statue made up of pieces of wood. Since an agent can respond to your requests, including to help, but the piece of wood can't, the analogy doesn't hold. So the proxy relationship that the analogy seeks to make use of isn't reasonable. So you can't trust your conclusions made through analogy to hold in the different decision context. The beliefs aren't generalizing or they don't have reach or they aren't universal or whatever you want to call it that lets you know your thinking isn't working.
In this case it is pretty obvious that the transitional structure has a lot of things that make the analogy valid. The most obvious is that this structure is related to the other structure is an optimization target of the machine learning model. We have mathematical optimization seeking to make these two structures similar. So analogy is going to have some limited applications where it is going to be valid. If you tried to propose something beyond that limited set, for example, that it would walk, because the proxy structure didn't have that as a part of its objective function, you wouldn't have strong reason to suspect congruence.
But that is only one level at which this analogical structure is appropriate or inappropriate or dangerous or non-dangerous. That is on the level of whether the map corresponds with the territory.
Agents are kind of awesome in a way that the rest of reality isn't, because the map ought to not correspond with the territory. So analogies can seem less valid than they really are. With anthropomorphism we are in a unique situation relative to other decision making contexts. We confront both undedicability and also intractability. The former is a regime where logic can create logical paradoxes. The latter is a realm where, because of the limitations imposed, a lot of arguments seem sound and valid, but aren't, because the analogy they imply doesn't correspond to the resource limitations that constraint correct thinking.