Anything that improves chain-of-thought reasoning improves planning. Right now the long-term ones Art mentioned, like logging in, have been at around an 80% success rate, while simpler ones have been higher. Our main issue at the moment is figuring out how to keep the server up :/ we're getting a little more traffic than expected. However, to bump those success rates up (which we need to), we really need to fine-tune additional models, which we're planning out right now.
I have a few ideas around that, mostly going down the RL route (with a twist) mixed with some knowledge graph work. We'll give an update when we push that!