Skip to content
Better HN
Top
New
Best
Ask
Show
Jobs
Search
⌘K
undefined | Better HN
0 points
7777777phil
1mo ago
0 comments
Share
Cool hack but 0.5 tok/s on 70B when a 7B does 30+ on the same card. NVIDIA's own research says 40-70% of agentic tasks could run on sub-10B models and the quality gap has closed fast.
0 comments
No comments yet.