Interestingly, many (most?) humans don't self-reflect or correct themselves unless challenged by an external agent either; and that agent doesn't necessarily have to be another human.
Also of note, GPT-4 so far seems to show huge improvements over GPT-3 when it comes to "thinking out loud" on the way to a (better) answer to more complex problems. It's a kind of front-loaded correctness check against the overall goal before diving into the implementation weeds, something that definitely helps me (as a human) avoid unnecessary mistakes in the first place.
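The "thinking out loud" idea can be sketched as a prompting pattern. This is a minimal illustration, not any official API: the prompt wording and the `build_prompt` helper are my own assumptions, and no model is actually called.

```python
# Hypothetical sketch of "think out loud" prompting: instead of asking for
# an answer directly, we ask the model to front-load its reasoning and
# check it before committing. Prompt wording is illustrative only.

def build_prompt(question: str, think_out_loud: bool) -> str:
    """Return a direct prompt, or one that asks for step-by-step
    reasoning and a self-check before the final answer."""
    if think_out_loud:
        return (
            f"{question}\n"
            "Before answering, outline your reasoning step by step, "
            "check it for mistakes, and only then give the final answer."
        )
    return question

direct = build_prompt("What is 17 * 24?", think_out_loud=False)
reflective = build_prompt("What is 17 * 24?", think_out_loud=True)
print(reflective)
```

The point of the second variant is exactly the front-loaded reflection described above: the model (like a human) is nudged to sanity-check its plan before diving in.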