I think it was the Anthropic guys on Dwarkesh's podcast, though really it could have been any of the other tech podcasts with a big-name AI guest. Anyway, they were talking about how orgs need to make big decisions about compute allocation.
If you need to do research, pre-training, RLHF, and inference for 5-10 different models across 20 different products, how do you optimally allocate your very finite compute? Weight towards research and training for better future models, or weight towards inference for happier consumers in the moment?
It would make sense if every project at DeepMind were in a constant war for TPU cycles.