> That's good thinking for cases where you have a single toolset, in which tools can be kept in sync to collaborate with one another.
This is interesting. I would actually kind of argue the exact opposite, that more rigorously defined formats are more important the more diverse your toolsets get, and less important the less diverse they are.
The whole point of having a rigorously defined data format that blocks certain validation errors at the data level is that it's easier for diverse toolsets to work with that data, because they don't need to all implement their own validators, and they don't need to worry as much about other tools accidentally sending them malformed/broken data.
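To make that concrete, here's a rough sketch in Python. Everything in it (the `Priority` enum, the `Task` record, the `parse_task` function) is made up for illustration and isn't taken from any real format. The idea is that the definition of the format itself rules out bad values, so a tool that goes through the shared parser never has to deal with a malformed record.

```python
from dataclasses import dataclass
from enum import Enum


class Priority(Enum):
    # Hypothetical: the format allows exactly these values and nothing else.
    LOW = "low"
    HIGH = "high"


@dataclass(frozen=True)
class Task:
    title: str
    priority: Priority


def parse_task(raw: dict) -> Task:
    """Turn untrusted input into a Task, or raise ValueError."""
    title = raw.get("title")
    if not isinstance(title, str) or not title:
        raise ValueError("title must be a non-empty string")
    # Priority(...) raises ValueError for anything outside the enum.
    return Task(title=title, priority=Priority(raw.get("priority")))
```

Once something is a `Task`, downstream code doesn't need its own checks for a misspelled or missing priority, because a record like that could never have been constructed in the first place.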
> making the data format simple
I think where we might be disagreeing is that I'd argue a more specific data format that inherently blocks validation errors is simpler than a vague format that still has restrictions you can violate, where those restrictions aren't clearly documented and aren't obvious until after you try to import the data.
I would point to something like the Matrix specification -- they have put comparatively more work into making sure that the spec (while flexible) is consistent, because they don't want clients randomly making a bunch of changes or assumptions about the data format. That's partially inspired by looking back at standards like Jabber and seeing that a lack of consensus about data formats caused tools to become extremely fragmented and hard to coordinate with each other. See https://news.ycombinator.com/item?id=17064616 for more information on that.
My feeling is that when you introduce validation layers, you have not actually gotten rid of restrictions between user applications, and you have not actually made coordination simpler, because different tools are going to break when they see pieces of data that they consider invalid or that they didn't realize they needed to be able to handle. All that's really happened is that complexity has been moved into the individual applications and that logic has been duplicated across a bunch of different apps.
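Here's a small sketch of what I mean, with two invented importers (`tool_a_accepts` and `tool_b_accepts` are hypothetical, not real tools) that each guess at the unwritten rules of a loose format. Both guesses are reasonable, and they still disagree about the same record.

```python
def tool_a_accepts(record: dict) -> bool:
    # Tool A's guess at the rules: "tags" is a list of strings.
    tags = record.get("tags", [])
    return isinstance(tags, list) and all(isinstance(t, str) for t in tags)


def tool_b_accepts(record: dict) -> bool:
    # Tool B's guess at the rules: "tags" is one comma-separated string.
    return isinstance(record.get("tags", ""), str)


record = {"title": "hello", "tags": ["a", "b"]}  # as exported by tool A
results = (tool_a_accepts(record), tool_b_accepts(record))
```

Tool A accepts its own export and tool B rejects it, and neither one is wrong according to anything written down. The restriction still exists, it just lives in two slightly different copies.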
In contrast, when every single tool is speaking the same language and agrees on what is and isn't valid data, then it's very fast to build tools that you know will be compatible with everything else in the ecosystem.