Nvidia is actively building an entire deep learning stack internally, going as far as releasing a self-driving simulation platform that it is using to build its own self-driving software [1].
I think they are actually farther along and more aggressive about exploring deep learning use cases in production than Google is today; augmenting real data with extensive simulation is a far-reaching idea that comes directly from their gaming experience.
> So money is not an issue. It is tiny in the scheme of things.
Money is of course always an issue in the long term; otherwise, why doesn't Google Fiber just spend tens of billions of dollars to build out its nationwide network? Because even a successful build-out would see negative ROI.
The TPU eventually has to make a real return for Google, and it won't if Nvidia can spend the same amount of money, build a faster product, and sell it to all the other cloud players, which I believe it definitely can.
Put another way, the TPU has to be cheaper for Google than buying Nvidia GPUs after factoring in its development costs, whereas Nvidia gets to amortize those dev costs over all other cloud providers and all other GPU customers. Google isn't about to sell the TPU to other cloud providers; the whole point is to use it to drive Google Cloud adoption.
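The amortization argument is easy to see with back-of-envelope numbers. Here's a minimal sketch; every dollar figure and unit count below is hypothetical, chosen only to show how volume dilutes a one-time development cost:

```python
def per_unit_cost(dev_cost, unit_cost, units):
    """Effective cost per chip once a one-time development cost
    is spread over the number of units that cost serves."""
    return unit_cost + dev_cost / units

# Hypothetical figures for illustration only: suppose both spend
# $300M on development and $2,000 per chip to manufacture, but
# Nvidia ships to the whole market while Google builds for itself.
google_tpu = per_unit_cost(300e6, 2000, 100_000)    # -> 5000.0 per chip
nvidia_gpu = per_unit_cost(300e6, 2000, 2_000_000)  # -> 2150.0 per chip
```

Under these made-up numbers the in-house chip carries a 2.5x effective cost premium despite identical engineering spend, which is the gap the TPU's performance or cloud-adoption benefits would have to cover.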
The TPU is a fine chip, but looking at the big picture, every sign suggests Nvidia could build the same or a better product for less money because it has far more synergies across the hardware and chip-design stack; e.g., the TPU connects only over PCIe, while Nvidia has already worked with IBM to get NVLink into supercomputers [2]. For some workloads the TPU will likely be bandwidth-starved communicating with the CPU and main memory.
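To make the interconnect point concrete, here's a rough transfer-time comparison. The bandwidth figures are nominal per-direction rates (PCIe 3.0 x16 at roughly 16 GB/s; NVLink 2.0 with the six links on a V100 at roughly 150 GB/s aggregate), and the 4 GB buffer size is an arbitrary example:

```python
def transfer_ms(gigabytes, gb_per_s):
    """Time in milliseconds to move a buffer at a given bandwidth."""
    return gigabytes / gb_per_s * 1000

# Moving a 4 GB buffer between host memory and the accelerator:
pcie_ms = transfer_ms(4, 16)     # -> 250.0 ms over PCIe 3.0 x16
nvlink_ms = transfer_ms(4, 150)  # roughly 27 ms over NVLink 2.0
```

An order-of-magnitude gap like this is why a PCIe-only accelerator can stall on host-transfer-heavy workloads even when its on-chip compute is ahead.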
[1] https://nvidianews.nvidia.com/news/nvidia-introduces-drive-c...
[2] https://www.ibm.com/us-en/marketplace/power-systems-ac922/de...