...and then it still doesn't actually fix it
I recently had a nice conversation with an LLM looking for reading suggestions. The first round of suggestions was superb: some I'd already read, some were entirely new and turned out great. Maybe a dozen or so great suggestions. Then it was like squeezing blood from a stone, but I did get a few more. After that it was like talking to a babbling idiot: repeating the same suggestions over and over, failing to listen to instructions, and generally just being useless.
LLMs are great on the first pass, but the further you get from that, the more they degrade into uselessness.
Sometimes it works well the first time. Sometimes it spits out a summary where you can see what it's confused about, and you can guide it to create a better one. Sometimes just having that summary in its context gets it over the hump, and you can say "actually, I'm going to continue with you; please reference this summary going forward." And sometimes you actually do have to restart the LLM with the new context. Of course, sometimes nothing works at all.
I had it write a TON of LVGL code. The placement wasn't perfect, but when I iterated a couple of times, it fixed almost all of the issues. The result is a little hacked together, but a bit better than my typical first pass at writing UI code. I think this saved me a factor of 10 in time. Next I am going to see how much of the cleanup and refactoring of that pile of code it can do.
Next I had it write a bunch of low-level code to init hardware. It saved me a little time compared to reading the reference manual, and was more pleasant, but it wasn't perfectly correct. If I did not have domain expertise, I would not have been able to complete the task with the LLM.
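To give a flavor of what that init code looks like, here's a minimal sketch, assuming an STM32F4-style memory map; the register addresses, bit positions, and pin choice are illustrative and would need to be checked against the part's actual reference manual:

    #include <stdint.h>

    /* Illustrative register definitions (STM32F4-style layout, assumed). */
    #define RCC_AHB1ENR  (*(volatile uint32_t *)0x40023830u)
    #define GPIOA_MODER  (*(volatile uint32_t *)0x40020000u)
    #define GPIOA_ODR    (*(volatile uint32_t *)0x40020014u)

    void led_gpio_init(void)
    {
        RCC_AHB1ENR |= (1u << 0);         /* enable the GPIOA clock */
        GPIOA_MODER &= ~(3u << (5 * 2));  /* clear PA5 mode bits */
        GPIOA_MODER |=  (1u << (5 * 2));  /* set PA5 as general-purpose output */
        GPIOA_ODR   |=  (1u << 5);        /* drive PA5 high */
    }

Every constant here is exactly the kind of detail an LLM can get subtly wrong, which is why the review step matters.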
After several months of deep work with LLMs, I think they are amazing pattern matchers, but not problem solvers. They suggest a solution pattern based on their trained weights. This can even produce real solutions, e.g., when programming Tetris or the like, but not when working on somewhat unique problems...
Writing front-end display code and instantiating components to look right is very much playing to the model’s strength, though. A carefully written sentence plus context would become 40 lines of detail-dense but formulaic code.
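As a concrete sketch of that kind of formulaic output, here is roughly what a sentence like "a centered panel with a title and an OK button" expands into, assuming LVGL v8 APIs; the names, sizes, and callback are invented for illustration:

    #include "lvgl.h"

    /* Hypothetical click handler for the example. */
    static void ok_btn_event_cb(lv_event_t *e)
    {
        LV_UNUSED(e);
        /* handle the click */
    }

    void create_status_panel(void)
    {
        /* Container centered on the active screen */
        lv_obj_t *panel = lv_obj_create(lv_scr_act());
        lv_obj_set_size(panel, 200, 120);
        lv_obj_align(panel, LV_ALIGN_CENTER, 0, 0);

        /* Title label, top-centered inside the panel */
        lv_obj_t *title = lv_label_create(panel);
        lv_label_set_text(title, "Status");
        lv_obj_align(title, LV_ALIGN_TOP_MID, 0, 4);

        /* OK button with a click handler, bottom-centered */
        lv_obj_t *btn = lv_btn_create(panel);
        lv_obj_set_size(btn, 80, 32);
        lv_obj_align(btn, LV_ALIGN_BOTTOM_MID, 0, -4);
        lv_obj_add_event_cb(btn, ok_btn_event_cb, LV_EVENT_CLICKED, NULL);

        lv_obj_t *btn_label = lv_label_create(btn);
        lv_label_set_text(btn_label, "OK");
        lv_obj_center(btn_label);
    }

None of it is hard, but every widget needs its parent, size, and alignment spelled out, which is exactly the kind of boilerplate a model churns out reliably.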
(I have also had a lot of luck asking it to make a first pass at typesetting things in TeX, for similar reasons.)
This pretty much sums up my experience with LLMs too. They save me a lot of time reading documentation, but I need to review a lot of what they write, or the result becomes too brittle and verbose.
I asked it to remove the comment, which it enthusiastically agreed to, and then... didn't. I couldn't tell if it was the LLM being dense or just a bug in Copilot's implementation.
"Find the root cause of this problem and explain it"
"Explain why the previous fix didn't work."
Often, it's best to undo the action and provide more context/tips.
Often, switching to Gemini 2.5 Pro when Claude is stumped helps a lot.