You could just ask it? Or you don’t trust the AI to answer you honestly?
LLMs can't lie nor can they tell the truth. These concepts just don't apply to them.
They also cannot tell you what they were "thinking" when they wrote a piece of code. If you "ask" them what they were thinking, you just get a plausible response, not the "intention" that may or may not have existed in some abstract form in some layer when the system selected tokens*. That information is gone at that point, and the LLM has no means of turning it into something a human could understand anyway. They simply do not have what in a human might be called metacognition. For now. There's lots of ongoing experimental research in this direction though.
Chances are that when you ask an LLM about their output, you'll get the likeness of either someone who has just recognized an issue with their work, or someone who believes they did great work and is now defending it. Obviously this is based on the work itself being fed back through the context window, which informs the response, so it may not be entirely useless, but... this is all very far removed from what a conscious being might explain about their thoughts.
The closest you can currently get to this is reading the "reasoning" tokens, though even those are just some selected system output that is then fed back to inform later output. There's nothing stopping the system from "reasoning" that it should say A, but then outputting B. Example: https://i.imgur.com/e8PX84Z.png
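To make that concrete, here's a toy sketch of the mechanism (all names and the canned text are invented for illustration, not a real model or API): the "reasoning" is just more generated output that gets appended to the context before the visible answer is produced, and nothing structurally binds the answer to it.

```python
# Hypothetical sketch: "reasoning" tokens are ordinary model output
# that is fed back in as extra context. `model` is a stand-in that
# returns canned text; no real LLM or API is involved.

def model(context):
    # Pretend next-token generator: returns a canned continuation
    # depending on whether we're inside the "thinking" phase.
    if "<think>" in context and "</think>" not in context:
        return "I should say A. </think>"
    return "B"  # nothing forces the answer to match the reasoning

def respond(prompt):
    context = prompt + " <think>"
    reasoning = model(context)    # step 1: generate reasoning tokens
    context += " " + reasoning    # step 2: feed them back as context
    answer = model(context)       # step 3: generate the visible answer
    return reasoning, answer

reasoning, answer = respond("Question?")
# The reasoning says "say A", yet the output is "B": the chain of
# thought informs, but does not constrain, the final tokens.
```

The feedback loop is why reasoning tokens often help quality, and also why they're evidence rather than ground truth about "what the model thought".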
* One might say that the LLM itself always considers every possible token and assigns weights to them, so there wouldn't even be a single chain of thought in the first place. More like... every possible "thought" at the same time at varying intensities.
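A toy sketch of that footnote (invented three-word vocabulary and fake scoring function; a real LLM scores on the order of 100k tokens with a neural network): at each step, every token in the vocabulary gets a non-zero weight, and the selection step then throws all but one of those "intensities" away.

```python
import math

# Toy illustration only: the vocabulary and the logits function are
# made up. The point is the shape of the computation, not the values.
VOCAB = ["yes", "no", "<end>"]

def fake_logits(context):
    # Deterministic stand-in for the network's raw output scores.
    h = sum(ord(c) for tok in context for c in tok)
    return {tok: math.sin(h + i) for i, tok in enumerate(VOCAB)}

def next_token_distribution(context):
    logits = fake_logits(context)
    z = sum(math.exp(v) for v in logits.values())
    # Softmax: *every* token receives a non-zero probability...
    return {tok: math.exp(v) / z for tok, v in logits.items()}

dist = next_token_distribution(["Should", "I"])
chosen = max(dist, key=dist.get)  # greedy pick
# ...but only one token survives selection; the weights on all the
# alternatives are discarded and never recorded anywhere.
```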
It sounds like you either have access to bad models or you are just imagining what it’s like to use an LLM in this way and haven’t actually tried asking it why it wrote something. The only judgement you need to make is whether the explanation makes sense or not, not some technical or theoretical argument about where the tokens in the explanation come from. You just ask questions until you can easily verify things for yourself.
Also, pretending that the LLM is still just token predicting and isn’t bringing in a lot of extra context via RAG and using extra tokens for thinking to answer a query is just way out there.
> where the AI wrote some code some way and I had to ask why, it told me why
I just explained that it cannot tell you why. It's simply not how they work. You might as well tell me that it cooked you dinner and did your laundry.
> the code improves.
We can agree on this. The iterative process works. The understanding of it is incorrect. If someone's superficial understanding of a hammer is "tool that drives pointy things into wood", they'll inevitably try to hammer a screw at some point - which might even work, badly.
> It sounds like you either have access to bad models or you are just imagining what it’s like to use an LLM in this way
Quoting this is really enough. You may imagine me sighing.
> Also, pretending that the LLM is still just token predicting
Strawman.
Overall your comment is dancing around engaging with what is being said, so I will not waste my time here.
That is fine. You should, and you'll get the best results doing so.
>LLMs can't lie nor can they tell the truth. These concepts just don't apply to them
Nobody really knows exactly what concepts do and don't apply to them. We simply don't have a great enough understanding of the internal procedures of a trained model.
Ultimately this is all irrelevant. There are multiple indications that the same can be said of humanity: that we perform actions and then rationalize them away without even realizing it. That explanations are often, if not always, post-hoc rationalizations - lies we tell even to ourselves. There's evidence for it. And yet, those explanations can still be useful. And I'm sure OP was trying to point out that this is also the case for LLMs.
There are however limitations imposed by the architecture. An LLM cannot form secret chains of thought (though in theory a closed system outside the end-user's control could hide tokens from at least the user), nor can it manage decent metacognition. They also have an at-best weak concept of fact vs fiction in general, which is why we get hallucinations. None of that makes for good prerequisites for telling lies.
Also, your car isn't a coward because it refuses to run into an obstacle its onboard systems detect. The car's designers may have been cowards. Your car also isn't a hero for protecting you during a crash. Neither are LLMs virtuous or liars. If some AI company went out of their way to intentionally construct an LLM such that it outputs untruths, it's not the LLM that is lying to you, it's OpenAI/Anthropic/whoever you're interacting with. You're using their system. They are responsible for what it does. If it tells untruths, they may have automated the act of telling lies, but it's still them doing it.
> There are multiple indications that the same can be said for humanity, that we perform actions and then rationalize them away even without realizing it
I was hoping to get a response like yours, because I'm genuinely curious about where it leads.
I believe what you said is true in the general sense, where we solve easy problems subconsciously in parts of our brains dedicated to supporting the conscious mind, without then being able to explain how we did it.
However this is a lot less true for engineering tasks, which involve a lot more active planning. Sometimes software development just means being a fancy constraint solver: finding a solution that works while applying some best practices. When pressed why one chose that particular solution, one might be tempted to post-hoc rationalize it as the best solution, even though it was just one that fit. But that's merely making it out to be more than it was, not taking away from the accomplishment of finding one that worked, which likely required some active thinking.
At the other end of the spectrum is making architectural decisions and thinking ahead as one creates something novel. I would be able to tell you why everything exists, especially if I merely added it in anticipation of something that will use it later. There's a ton of conscious planning that goes into these things.
Most coders are still turning over the problems they're dealing with at work in their heads as they're falling asleep at night. This is very much the opposite of solving problems subconsciously.