undefined | Better HN

0 pointsRomario775mo ago0 comments

LLMs are not good at "cycles" - when you have to go over a list and do the same action on each item.

It's like it has ADHD and forgets or gets distracted in the middle.

And the reason for that is that LLMs don't have memory and process the tokens, so as they keep going over the list the context becomes bigger with more irrelevant information and they can lose the reason they are doing what they are doing.

0 comments

fwip5mo ago

It would be nice if the tools we usually use for LLMs had a bit more programmability. In this example, It we could imagine being able to chunk up work by processing a few items, then reverting to a previous saved LLM checkpoint of state, and repeating until the list is complete.

I imagine that the cost of saving & loading the current state must be prohibitively high for this to be a normal pattern, though.

radarsat15mo ago

Agreed. You basically want an LLM to have a tool that writes its own agent to accomplish a repetitive task. I think this is doable.

steveklabnik5mo ago

You can already sort of do this by asking it to write a script to do the refactor. Claude sometimes suggests this on its own to me even.

But obviously sometimes larger refactors aren't easy to implement in bash.

1 more reply

HarHarVeryFunny5mo ago

Right.

In a recent YouTube interview Karpathy claimed that LLMs have a lot more "working memory" than a human:

https://www.youtube.com/watch?v=hM_h0UA7upI&t=1306s

What I assume he's talking about is internal activations such as stored in KV cache that have same lifetime as tokens in the input, but this really isn't the same as "working memory" since these are tied to the input and don't change.

What it seems an LLM would need to do better at these sort of iterative/sequencing tasks would be a real working memory that had more arbitrary task-duration lifetime and could be updated (vs fixed KV cache), and would allow it to track progress or more generally maintain context (english usage - not LLM) over the course of a task.

I'm a bit surprised that this type of working memory hasn't been added to the transformer architecture. It seems it could be as simple as a fixed (non shifting) region of the context that the LLM could learn to read/write during training to assist on these types of task.

An alternative to having embeddings as working memory is to use an external file of text (cf a TODO list, or working notes) for this purpose which is apparently what Claude Code uses to maintain focus over long periods of time, and I recently saw mentioned that the Claude model itself has been trained to use read/write to this sort of text memory file.

dmoy5mo ago

Which is annoying because that is precisely the kind of boring rote programming tasks I want an LLM to do for me, to free up my time for more interesting problems

polynomial5mo ago

So much for Difference and Repetition.

steveklabnik5mo ago

Surprised and a bit delighted to see a Deleuze reference on HN...

j / k navigate · click thread line to collapse

0 comments

fwip5mo ago

I imagine that the cost of saving & loading the current state must be prohibitively high for this to be a normal pattern, though.

radarsat15mo ago

Agreed. You basically want an LLM to have a tool that writes its own agent to accomplish a repetitive task. I think this is doable.

steveklabnik5mo ago

You can already sort of do this by asking it to write a script to do the refactor. Claude sometimes suggests this on its own to me even.

But obviously sometimes larger refactors aren't easy to implement in bash.

1 more reply

HarHarVeryFunny5mo ago

Right.

In a recent YouTube interview Karpathy claimed that LLMs have a lot more "working memory" than a human:

https://www.youtube.com/watch?v=hM_h0UA7upI&t=1306s

dmoy5mo ago

Which is annoying because that is precisely the kind of boring rote programming tasks I want an LLM to do for me, to free up my time for more interesting problems

polynomial5mo ago

So much for Difference and Repetition.

steveklabnik5mo ago

Surprised and a bit delighted to see a Deleuze reference on HN...

j / k navigate · click thread line to collapse