
0 points · jsheard · 1y ago · 0 comments

That's the elephant in the room with the reasoning/CoT approach: it shifts what was previously a scaling of training costs into a scaling of both training and inference costs. The promise of doing expensive training once and then running the model cheaply forever falls apart once you're burning tens, hundreds or thousands of dollars' worth of compute every time you run a query.


Workaccount2 · 1y ago

They're gonna figure it out. Something is being missed somewhere, as human brains can do all this computation on 20 watts. Maybe it will be a hardware shift or maybe just a software one, but I strongly suspect that modern transformers are grossly inefficient.

Legend2440 · 1y ago

Yeah, but next year they'll come out with a faster GPU, and the year after that another still faster one, and so on. Compute costs are a temporary problem.

