gpt-3.5-turbo-1106 from November 2023 was at 1170; 1206 is for the March variant.
Change that and you get ~84%, with the order flipped (i.e. the win rate of GPT-3.5 is ~16%). The point is that a two-year-old model still wins far too often to justify getting excited about each new top model of the last two years, not that the two-year-old model is better.
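For anyone who wants to check the arithmetic, here is a rough sketch. It assumes the ratings are on a standard Elo-style scale (base 10, 400-point divisor), which may not match exactly how the leaderboard computes its scores. The top-model rating of 1458 is a hypothetical placeholder picked to give a gap of about 288 points; only the 1170 figure comes from the numbers above.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score (win probability) of A against B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


GPT35_NOV_2023 = 1170  # gpt-3.5-turbo-1106, as quoted above
TOP_MODEL = 1458       # hypothetical rating, not a quoted leaderboard value

top_wins = expected_score(TOP_MODEL, GPT35_NOV_2023)
gpt35_wins = expected_score(GPT35_NOV_2023, TOP_MODEL)

print(f"top model wins: {top_wins:.0%}")   # top model wins: 84%
print(f"GPT-3.5 wins:   {gpt35_wins:.0%}")  # GPT-3.5 wins:   16%
```

The two probabilities always sum to 1, which is why mixing up the order turns ~84% into ~16%.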