Progress in smaller models has been far more impressive than progress at the frontier.
Compare Gemini 3.5 Flash to a ~16B parameter model from 24 months ago.
Then compare GPT-5.5 to a frontier model from 24 months ago.
Yes, GPT-5.5 got better. But at parameter counts orders of magnitude smaller (counting ACTIVE parameters), the improvement is far more pronounced.