Muon Is Scalable for LLM Training
(github.com)
5 points
renonce
1y ago
1 comment
yorwba
1y ago
For people who want to know more about the Muon optimizer:
https://kellerjordan.github.io/posts/muon/