0 points · __natty__ · 4d ago · 0 comments

I wonder when we'll reach speeds of 1000 tps with high-quality models. 5 years? 10 years?

Don't set your goals so low. We've already reached 17k on small models.

Since the whole goal of software architecture schemes is to allow the rest of us non-geniuses to still understand and modify the software, perhaps the same could be true of LLMs.

Perhaps a hypothetical (small) model running at a million tokens per second could be more useful than a big state-of-the-art one.

goldenarm · 4d ago

We technically can (check Cerebras grok and Gemini diffusion), but it's not economically viable and not a priority for product managers.

Maybe when intelligence plateaus, speed could become a main differentiating factor, like battery life for smartphones.