And so they decide not to disclose their own training information, just after telling everyone how useful it was to get DeepSeek's? Honestly, I can't say I care about "nearly as good as o1" when it's a closed API with no additional info.
You can safely assume Qwen2.5-Max will score worse than all of the recent reasoning models (o1, DeepSeek-R1, Gemini 2.0 Flash Thinking).
It'll probably become a very strong model if/when they apply RL training for reasoning. However, all the successful recipes for this are closed source, so it may take some time. They could do SFT on another model's reasoning chains in the meantime, though the DeepSeek-R1 technical report noted that this is not as good as RL training.
I don't remember the last time 20% of the HN front page was about the same thing. Then again, nobody remembers the last time a company's market cap fell by hundreds of billions of dollars the way NVIDIA's did yesterday.
Source: https://x.com/Alibaba_Qwen/status/1884263157574820053
https://apnews.com/article/deepseek-ai-artificial-intelligen...
> [...] we are unable to access the proprietary models such as GPT-4o and Claude-3.5-Sonnet. Therefore, we evaluate Qwen2.5-Max against DeepSeek V3
"We'll compare our proprietary model to other proprietary models. Except when we don't. Then we'll compare to non-proprietary models."