It's still getting this wrong, for instance:
Prompt: You call heads/tails for 1000 coin tosses. You know the coin being used is weighted so that it comes up heads 52% of the time. Each correct coin toss prediction earns you $5. What should your strategy be, for calling heads/tails, in order to maximize how much money you receive?
ChatGPT: To maximize your money, you should call heads 52% of the time and tails 48% of the time, as those are the probabilities of getting heads and tails, respectively. This will give you an expected payout of $5 * 0.52 = $2.60 per head and $5 * 0.48 = $2.40 per tail, which is the highest expected value given the information about the coin being weighted.
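For comparison, here is a minimal sketch of the expected winnings under the two strategies, assuming each call is made independently of the toss outcome. The function and variable names are mine, purely for illustration. If you call heads with probability h, your chance of being right on any toss is h * 0.52 + (1 - h) * 0.48, which is largest at h = 1, so always calling heads beats matching the 52/48 split.

```python
# Numbers taken from the prompt; the names are illustrative assumptions.
P_HEADS = 0.52   # probability the weighted coin lands heads
PAYOUT = 5       # dollars earned per correct call
TOSSES = 1000    # number of tosses


def expected_winnings(call_heads_rate, p_heads=P_HEADS, payout=PAYOUT, tosses=TOSSES):
    """Expected total winnings when each call is heads with probability
    call_heads_rate, chosen independently of the actual toss."""
    # A call is correct if we say heads and it is heads, or tails and it is tails.
    p_correct = call_heads_rate * p_heads + (1 - call_heads_rate) * (1 - p_heads)
    return tosses * payout * p_correct


always_heads = expected_winnings(1.0)   # call heads every time
matching = expected_winnings(0.52)      # ChatGPT's "52% heads, 48% tails" strategy

print(f"always heads:         ${always_heads:.2f}")  # about $2600
print(f"probability matching: ${matching:.2f}")      # about $2504
```

Probability matching is right only 0.52^2 + 0.48^2 = 50.08% of the time, barely better than guessing blind, versus 52% for always calling heads.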
It's getting closer to being able to prove that the square root of 2 is irrational, but it still flubs a step.
Prompt: Prove that the square root of 2 is irrational.
ChatGPT: ... This means that p can be expressed as 2r for some integer r. Substituting this into the equation above gives us:
ChatGPT: 2q^2 = (2r)^2 = 4r^2
ChatGPT: Dividing both sides by 4 gives us:
ChatGPT: q^2 = r^2
ChatGPT: So q must also be even, which means that both p and q have a common factor of 2...
It fails at basic algebraic manipulation, but keeps going with the right reasoning simply because that's what it's been trained on, not because it's reflecting on the equation it just came up with. That's not math. That's mimicry.
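For reference, this is how the flubbed step should go. I'm assuming the elided "equation above" in the transcript is the usual p^2 = 2q^2, with sqrt(2) = p/q in lowest terms; the transcript doesn't show it.

```latex
\begin{align*}
p^2 &= 2q^2, \qquad p = 2r \\
2q^2 &= (2r)^2 = 4r^2 \\
q^2 &= 2r^2 \qquad \text{(dividing both sides by 2)}
\end{align*}
```

So q^2 is even, hence q is even, and that is what justifies the conclusion that p and q share a factor of 2. The transcript's q^2 = r^2 doesn't follow from the line before it and wouldn't show that q is even.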
Even if it's seen something many times in training, its accuracy gets worse with longer trains of "thought". That's expected, because a language model like that has ADHD: it can't focus enough on one logical thread, and other "knowledge" leaks in. Ask it to derive the quadratic formula by completing the square, which requires a lot of steps with accurate algebraic manipulation. It goes through some irrelevant algebraic steps, sometimes not even getting the manipulation correct, and finally (sometimes, depending on how you ask) manufactures the correct answer out of thin air as the last step. I finally got it to answer correctly by prompting it with the initial equation and telling it that the first step is to divide both sides by a, but it still combined a few steps instead of going methodically one step at a time as prompted.
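For anyone who wants to check its work, here is the derivation written out one step at a time, the way I was asking for. This is the standard textbook route, starting from the general quadratic with a nonzero.

```latex
\begin{align*}
ax^2 + bx + c &= 0, \qquad a \neq 0 \\
x^2 + \frac{b}{a}x + \frac{c}{a} &= 0
  && \text{divide both sides by } a \\
x^2 + \frac{b}{a}x &= -\frac{c}{a}
  && \text{move the constant term} \\
x^2 + \frac{b}{a}x + \frac{b^2}{4a^2} &= \frac{b^2}{4a^2} - \frac{c}{a}
  && \text{add } \left(\tfrac{b}{2a}\right)^2 \text{ to both sides} \\
\left(x + \frac{b}{2a}\right)^2 &= \frac{b^2 - 4ac}{4a^2}
  && \text{factor the left, combine the right} \\
x + \frac{b}{2a} &= \pm\frac{\sqrt{b^2 - 4ac}}{2a}
  && \text{take square roots} \\
x &= \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}
  && \text{subtract } \tfrac{b}{2a}
\end{align*}
```

Each line follows from the one before it by a single operation, which is exactly the discipline it couldn't hold.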