Have you tried it with something like OpenSpec? Strangely, taking the time to lay out the steps in a large task helps immensely. It's the difference between the behavior you describe and just letting it run productively for segments of ten or fifteen minutes.