It will almost never converge on the general solution that will pass tests you haven't given it yet.
This is why AI is sooo good at Javascript and related slop. A solution that "kinda works" is good enough 9 times out of 10 and if some tests fail well ... YOLO and the web page will probably render anyway.
Contrast that to using Scheme or Lisp where AI will have trouble simply keeping the parentheses balanced.