The competence profile of any LLM-based AI is extremely spiky - whether it does a particular task well or not is pretty independent of the (subjective) difficulty of the task. This is very different from our experience with humans.
Slow was the safety net for sure, but then there were errors too. There's a sweet spot where AI + human oversight gets you something efficient and almost perfect, with the right methods ofc.