Both changes were trivial: it had one incorrect (but unnecessary) import, and it used the syntax from Lean 3 instead of Lean 4 in one lambda definition. A system that was trained harder on Lean would not make those mistakes.
The one actual error it made was in not proposing that the other direction of the "if and only if" is required. Again, I am quite confident that this formalisation failure mode is not hard to solve in a system that is, like, actually trained to do this.
Obviously formalising problems that a working mathematicican solves is dramatically harder than formalising IMO problems, and is presumably way ahead of the state of the art.
Why?
IDK, even for translation between languages outside of mathematics, missing small qualifiers that change the whole meaning of the sentence is a worrying failure mode I've seen, and with mathematical problems there are a lot more cases like that.