The way to achieve interoperability while still being able to innovate is to follow the model of browsers (HTML, CSS and JavaScript). The process works something like this:
1. Everyone has the same basic functionality: you can switch between browsers and (mostly) everything works and renders the same way.
2. Browser XYZ decides they want a cool feature where they expose data from a fingerprint reader as a JS API
3. Browser XYZ implements said feature and sees if websites start using it.
4. Standards bodies might notice that the feature was implemented in Browser XYZ and keep an eye on it.
5. If a second browser implements the same or a similar feature (perhaps with a slightly different API or some other incompatibility), standards bodies start working on a standard for said feature, together with Browser XYZ and the others who participate in the standards organization.
6. Once the standard is done, reviewed and published, the browsers that want the feature go back and adjust/add/remove things until they comply with the standard.
Obviously it's not exactly like this, but that's more or less how the process goes.
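To make steps 3 to 6 a bit more concrete, here is a rough sketch of how a website might cope while a feature exists only in one browser and later becomes a standard. Everything here is made up for illustration: `fingerprint`, `xyzFingerprint`, `FingerprintReader` and `BrowserLike` are hypothetical names, not real browser APIs. The idea is just feature detection: prefer the standardized name, fall back to the vendor-specific one, and do without the feature otherwise.

```typescript
// Hypothetical shape of the fingerprint feature (not a real browser API).
interface FingerprintReader {
  read(): string;
}

// Hypothetical view of what a browser exposes to a page.
interface BrowserLike {
  fingerprint?: FingerprintReader;    // assumed name after standardization (step 6)
  xyzFingerprint?: FingerprintReader; // assumed vendor-specific name from Browser XYZ (step 3)
}

// Prefer the standardized API, fall back to the vendor-specific one,
// and return undefined when the browser offers neither.
function getFingerprintReader(browser: BrowserLike): FingerprintReader | undefined {
  return browser.fingerprint ?? browser.xyzFingerprint;
}
```

A page would call `getFingerprintReader` and simply skip the feature when it gets `undefined` back, which is why the basic functionality in step 1 keeps working everywhere.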
It's not hard to imagine the same for messaging services. The base layer is that everyone can send text messages to everyone. We can all agree on this, so the first standard would cover that.
Then if some messaging service wants to add a new feature, they start working on it and deploy it for their own service. If a second messaging service deploys the same feature, standards bodies should work on an interoperable model of that feature, which all messaging services can then adopt so that it works across all of them.
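As a minimal sketch of what that could look like in practice, assume each service advertises a list of feature names and that plain text is the one thing everybody supports. None of this is an existing protocol: `MessagingService`, `Message`, `sharedFeatures` and `prepareForDelivery` are hypothetical names invented for the example. A message sent between two services keeps its text and only those extras that both sides understand.

```typescript
// The base layer every service is assumed to support.
const BASE_FEATURE = "text";

// Hypothetical description of a messaging service and the features it advertises.
interface MessagingService {
  name: string;
  features: string[];
}

// Hypothetical message: the text body plus optional per-feature payloads.
interface Message {
  text: string;
  extras: { [feature: string]: string };
}

// Features both services understand. The base layer is always included.
function sharedFeatures(a: MessagingService, b: MessagingService): string[] {
  const optional = a.features.filter(
    (f) => f !== BASE_FEATURE && b.features.indexOf(f) !== -1
  );
  return [BASE_FEATURE, ...optional];
}

// Keep the text and drop any extras the receiving service does not support.
function prepareForDelivery(
  msg: Message,
  sender: MessagingService,
  receiver: MessagingService
): Message {
  const shared = sharedFeatures(sender, receiver);
  const extras: { [feature: string]: string } = {};
  for (const feature of Object.keys(msg.extras)) {
    const payload = msg.extras[feature];
    if (payload !== undefined && shared.indexOf(feature) !== -1) {
      extras[feature] = payload;
    }
  }
  return { text: msg.text, extras };
}
```

So if one service supports stickers and the other doesn't, the text still arrives and the sticker is dropped. Once stickers are standardized and both services advertise them, the same code delivers them without any change.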