2. Mamba Explained: The State Space Model Taking On Transformers (kolaayonrinde.com) | 270 points | koayon | 2y ago | 93 comments
3. The Frontier of Adaptive Computation in Machine Learning (github.com) | 1 point | koayon | 2y ago | 0 comments
4. DeepSpeed's Bag of Tricks for Training Large Models (kolaayonrinde.com) | 1 point | koayon | 2y ago | 0 comments