Don't set your goals so low. We already reached 17k on a small model.
Since the whole goal of software architecture schemes is to let the rest of us non-geniuses still understand and modify the code, perhaps the same could be true of LLMs.
Perhaps a hypothetical (small) million-per-second model could be more useful than a state-of-the-art big one.