My perspective on the problem is that Twitter, as a medium, isn't designed to produce a great signal-to-noise ratio (SNR). Some people might revel in the noise, and some amount of noise certainly generates MAUs and ad dollars, but ultimately it's not living up to its potential as a medium and social network. 'A clown car that fell into a gold mine' still applies.
My theory of change is that, in general, features that improve SNR will give a sustained MAU advantage to the products that implement them, because the vast majority of users want better SNR: quality discourse, funny memes, kind interactions, etc. Drama and conflict also produce MAUs, but ultimately they're net negative because of the people they scare off.
If you accept that premise, then the question is more when than if features focused on improving the SNR of interactions should be built. Do they meaningfully differentiate and help with the cold start problem? Or are they to be built after some amount of audience has arrived?
From what I've seen of the clones, as they grow they all run into the same SNR ceiling. I think that ceiling is preventing them from being breakout successes, because whatever differentiator they started with (decentralization, free speech, etc.) will fade in relevance as they get bigger and their core experience reverts to the mean. They're just Twitter, but smaller.
Would a product that focused more on improving SNR be able to break out? I don't know. I'm just disappointed nobody seems to have tried very hard.