As in have the model consider its generated SVG, and gradually refine it, using its knowledge of the relative positions and proportions of the shapes generated, and have it spin for a while, and hopefully the end result will be better than just oneshotting it.
Or maybe going even one step further - most modern models have tool use and image recognition capabilities - what if you have it generate an SVG (or parts/layers of it, as per the model's discretion) and feed it back to itself via image recognition, and then improve on the result.
I think it'd be interesting to see, as for a lot of models, their oneshot capability in coding is not necessarily corellated with their in-harness ability, the latter which really matters.