The central reason TPUs feel less flexible is Google's awful mistake of encouraging everyone to use TPUEstimator as the One True API For Doing TPU Programming. Getting off that API was the single biggest boost to my TPU skills.
You can see an example of how to do that here: https://github.com/shawwn/ml-notes/blob/master/train_runner.... The repo can train GPT-2 1.5B at 10 examples/sec on a TPUv3-8 (i.e., around 10k tokens/sec).
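For what it's worth, the examples/sec to tokens/sec conversion works out if you assume each training example is a full GPT-2 context window of 1024 tokens (the sequence length isn't stated above, so that's an assumption):

```python
# Throughput conversion: examples/sec -> tokens/sec.
# ASSUMPTION: each example is one full 1024-token GPT-2 context
# window; the comment above doesn't state the sequence length.
SEQ_LEN = 1024          # GPT-2 context window, in tokens
EXAMPLES_PER_SEC = 10   # reported TPUv3-8 throughput

tokens_per_sec = EXAMPLES_PER_SEC * SEQ_LEN
print(tokens_per_sec)   # 10240, i.e. "around 10k tokens/sec"
```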
Happy to answer any specific questions or peek at codebases you're hoping to run on TPUs.