Author here.
I don't expect you'd see a difference. For the simple case of iterating on a diagram you describe to ChatGPT, 4o is perfectly fine as-is. I can't see how you'd meaningfully improve on it.
However, for the advanced case of generating a diagram from a repo, AI isn't even close. You'd need a practically unthinkable improvement for it to be usable.
The consequence, unfortunately, is that incremental gains won't help in either case.