In my experience, this sort of thing nearly works, but never quite well enough: errors and misunderstandings build up at every stage, and the output ends up as garbage.
I had hoped that this recursive breakdown approach could remove the need for bigger and bigger monolithic LLMs for ever-bigger tasks, by allowing every task to be at the same granularity, but... I guess I should just try building one myself.
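
To make "recursive breakdown" concrete, here is a rough sketch of the shape I have in mind, not a working agent. The names `solve`, `split`, `solve_leaf`, and `combine` are all made up for illustration; in a real system each would presumably be an LLM call, and the toy functions below only stand in for them so the control flow can be run.

```python
from typing import Callable, List

# Hypothetical stand-ins for LLM calls; none of these names come from a real library.
Splitter = Callable[[str], List[str]]        # returns [] when a task is small enough
LeafSolver = Callable[[str], str]            # handles one small task directly
Combiner = Callable[[str, List[str]], str]   # merges subtask results for the parent


def solve(task: str, split: Splitter, solve_leaf: LeafSolver,
          combine: Combiner, max_depth: int = 5) -> str:
    """Recursively break a task down until every piece is leaf-sized."""
    # Stop splitting once the depth budget runs out, so a bad splitter cannot loop forever.
    subtasks = split(task) if max_depth > 0 else []
    if not subtasks:
        return solve_leaf(task)
    results = [solve(s, split, solve_leaf, combine, max_depth - 1)
               for s in subtasks]
    return combine(task, results)


# Toy stand-ins: "tasks" are strings, and a task is small enough at 4 characters.
def toy_split(task: str) -> List[str]:
    if len(task) <= 4:
        return []
    mid = len(task) // 2
    return [task[:mid], task[mid:]]


def toy_leaf(task: str) -> str:
    return task.upper()


def toy_combine(task: str, results: List[str]) -> str:
    return "".join(results)


if __name__ == "__main__":
    print(solve("abcdefgh", toy_split, toy_leaf, toy_combine))
```

Every call to `split`, `solve_leaf`, and `combine` is a separate place where a misunderstanding can creep in, which is exactly where I would expect the errors to pile up.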