Maybe chess.com players play in a specific way, and there are a lot of transcriptions of such games that these LLMs picked up when they ingested the internet?
I don't know why it worked in this specific case, but based on earlier examples it is more likely that these kinds of games were simply more prevalent in the dataset it was trained on than that it can play chess in general. It still wasn't perfect, so even these games weren't rigid enough for it to reliably produce valid moves.