Definitely agree that a lot of work going into hyperparameter tuning and maturing the ecosystem will be key here!
I see the Mamba paper as Mamba's `Attention Is All You Need` moment - it might take a while before everything is optimised to the point of a GPT-4 (it took 6 years for transformers, but it should be faster now with all the attention on ML).