I interact with tools and with people. With people, there's a shared understanding of the goal and the context (or alignment, as some like to call it). With tools, no such shared context is needed: I need reproducible results and clear output, and if it's something I can automate, I need it to follow my instructions closely.
LLMs are obviously tools, but their parameter space is so huge that it's difficult to provide enough context to ensure reliable results. With plain prompting we get unreliable answers; with agents, actions are taken based on those unreliable answers. We had that before, with people copying and pasting from LLM output, but now the same action is automated. And then there's the feedback loop, where the agent takes input from the very thing it has just altered (often wrongly).
So it goes like this: ambiguous query -> unreliable information -> agent acting -> unreliable result -> unreliable validation -> final review (which is often skipped). And then the loop repeats.
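Here's a minimal sketch of that loop in Python. The `llm()` function is a hypothetical stand-in for whatever model API you use; the names and structure are illustrative, not a real agent framework. The point is that each step reads state the previous step already mutated, and the validation is itself an LLM call, so errors compound:

```python
from pathlib import Path

def llm(prompt: str) -> str:
    """Hypothetical stand-in for any LLM API call; returns text, not truth."""
    raise NotImplementedError("plug in your model client here")

def agent_loop(query: str, workspace: Path, max_steps: int = 10) -> None:
    for _ in range(max_steps):
        # The agent reads state it may have corrupted on a previous step.
        state = workspace.read_text()
        answer = llm(f"Goal: {query}\nCurrent state:\n{state}")  # unreliable information
        workspace.write_text(answer)                             # agent acting on it
        verdict = llm(f"Does this satisfy the goal?\n{answer}")  # unreliable validation
        if "yes" in verdict.lower():
            break
    # Final human review would go here -- and is often skipped.
```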
While with normal tools: ambiguous requirement -> detailed specs -> formal code -> validation -> report of divergence -> review (which can be skipped). There are issues in the process (which give us bugs), but we can pinpoint where we went wrong and fix the issue.
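For contrast, a toy sketch of that pipeline (the spec format and names are made up for illustration): validation is deterministic, and when the code diverges from the spec, the report names the exact input that failed, so you know where to look.

```python
def validate(spec: dict[int, int], implementation) -> list[str]:
    """Check an implementation against a formal spec (input -> expected output).

    Deterministic: the same spec and code always produce the same report,
    and each divergence pinpoints the exact input that failed.
    """
    report = []
    for given, expected in spec.items():
        actual = implementation(given)
        if actual != expected:
            report.append(f"input {given}: expected {expected}, got {actual}")
    return report

# Toy usage: the spec pins down squaring; the code is buggy for negatives.
spec = {2: 4, -3: 9}
buggy = lambda x: x * abs(x)
print(validate(spec, buggy))  # ['input -3: expected 9, got -9'] -- pinpointed
```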