> while the silicon closest to its capabilities needs more like tens of kW.
I think looking at power consumption at the very edge of what technology is just barely capable of may be misleading, since that's inherently at one extreme of the current cost-capability trade-off curve[0] and stands to drop the most drastically from efficiency improvements.
You can now run models equivalent in capability to the initial version of ChatGPT on sub-20 W chips, for instance. Or, looking over a longer timeframe, we can now do far more on a 1-milliwatt chip[1] than on the 150 kW ENIAC[2].
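For scale, here's a back-of-the-envelope sketch of that last comparison, using only the two power figures above. The variable names are mine, and this compares raw power draw only; it says nothing about how much more work the modern chip does per joule.

```python
import math

# Power figures taken from the comment above, expressed in milliwatts
# so the arithmetic stays in exact integers.
ENIAC_POWER_MW = 150_000 * 1_000   # 150 kW
CHIP_POWER_MW = 1                  # 1 mW

# How many times less power the modern chip draws.
power_ratio = ENIAC_POWER_MW // CHIP_POWER_MW

# The same gap expressed in orders of magnitude.
orders_of_magnitude = math.log10(power_ratio)

print(f"Power ratio: {power_ratio:,}x")
print(f"Orders of magnitude: {orders_of_magnitude:.1f}")
```

So the gap in power draw alone is about eight orders of magnitude, before counting the difference in what each machine can compute.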
[0]: https://i.imgur.com/GydBGRG.png
[1]: https://spectrum.ieee.org/syntiant-chip-plays-doom
[2]: https://cse.engin.umich.edu/about/history/eniac-display/