MTIA v1's specs: the accelerator is fabricated in a TSMC 7nm process and runs at 800 MHz, providing 102.4 TOPS at INT8 precision and 51.2 TFLOPS at FP16 precision. It has a thermal design power (TDP) of 25 W and up to 128 GB of LPDDR5 RAM.
Google's Cloud TPU v4: 275 TFLOPS (bf16 or int8), 90/170/192 W, 32 GiB of HBM2 RAM at 1200 GB/s. From here: https://cloud.google.com/tpu/docs/system-architecture-tpu-vm...
So it seems that the Google Cloud TPU v4 has the advantage in compute per chip and in RAM bandwidth, while the Meta chip is much more efficient (2x to 4x, it's hard to tell) and has more RAM, but slower RAM?
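For what it's worth, the figures above pencil out like this (a rough sketch; I'm assuming INT8 throughput for both chips and reading the three TPU v4 wattages as a min/typical/peak range):

```python
# Back-of-the-envelope perf/W from the figures quoted above.
mtia_tops, mtia_watts = 102.4, 25
tpu_tops = 275
tpu_watts = (90, 170, 192)  # the three power figures quoted

mtia_eff = mtia_tops / mtia_watts             # ~4.1 TOPS/W
tpu_eff = [tpu_tops / w for w in tpu_watts]   # ~3.1, ~1.6, ~1.4 TOPS/W
ratios = [mtia_eff / e for e in tpu_eff]      # ~1.3x, ~2.5x, ~2.9x
```

So the efficiency edge looks more like 1.3x to 2.9x, depending on which TPU power figure you compare against.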
>We found that GPUs were not always optimal for running Meta’s specific recommendation workloads at the levels of efficiency required at our scale. Our solution to this challenge was to design a family of recommendation-specific Meta Training and Inference Accelerator (MTIA) ASICs.
You come up with a clever ASIC that beats their current GPU on your workload… and by the time it comes out, they've released next year's chip with, say, 50% more memory bandwidth or something ridiculous like that, and it beats you by pure grunt.
“No replacement for displacement” actually seems to be true in compute.
Some companies definitely played games and mined with the ASICs themselves (and then shipped those used ASICs)... but in general, it was always a lot more profitable to sell the shovels than to mine the gold.
Is it primarily for inference, with training just an afterthought?
Each PE is equipped with two processor cores (one of them equipped with the vector extension) and a number of fixed-function units that are optimized for performing critical operations, such as matrix multiplication, accumulation, data movement, and nonlinear function calculation. The processor cores are based on the RISC-V open instruction set architecture (ISA) and are heavily customized to perform necessary compute and control tasks.
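To make the quoted description a bit more concrete, here's a toy NumPy model of the kinds of operations those fixed-function units offload; the shapes and the ReLU-style nonlinearity are my assumptions for illustration, not anything from the MTIA spec:

```python
import numpy as np

# Illustrative only: matrix multiplication, accumulation, and a
# nonlinear function, i.e. the critical ops the quoted PE
# fixed-function units are said to accelerate.
def pe_matmul_accumulate(acc, a, b):
    # matmul + accumulate into a running accumulator
    return acc + a @ b

def pe_nonlinear(x):
    # a ReLU-style nonlinear function unit (assumed, for illustration)
    return np.maximum(x, 0.0)

a = np.ones((4, 4))
b = np.ones((4, 4))
acc = np.zeros((4, 4))
acc = pe_matmul_accumulate(acc, a, b)  # every entry becomes 4.0
out = pe_nonlinear(acc - 8.0)          # all entries clamp to 0.0
```

On real hardware these steps run in dedicated silicon rather than on the RISC-V cores, which mainly handle control and the less common compute.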
Side note: the chip says Korea on it, so I expected it was made by Samsung... but it's a TSMC-made chip? What's up with that?
So there are two process-node generations of immediate improvement available.
Meta is going to use it in its datacenters; it's much more efficient than Nvidia's general-purpose GPUs. They are serious about putting AI everywhere.
Amazing times! Private companies now have compute resources that previously only showed up in government labs, in many cases built from novel components like MTIA.
This feels like the start of a golden age, and in a few years we will have incredible results and breakthroughs.