it could be interesting to gauge how entwined the “how did you arrive at that answer” process is with the answering itself. i.e. which paths do they share? even at this early a stage: is there some structure which is used to determine the operand(s) that’s leveraged in both of these prompts? is the “how did you X” answer leveraging most of the “X” circuitry and just bailing out early? or does it deviate as early as post-tokenization?
philosophers would like to know.