I gave it a 2000-line Python program that does some fairly sophisticated geodesic calculations on surfaces, and asked it to review the code. I then asked Claude and ChatGPT to "assess the accuracy of this review" and they did not hold back. That said, it's a very small model, and very fast.