I get frustrated with JSON because of things I could do in XML that I can't do in JSON without breaking the spec. And that doesn't even cover how annoyingly verbose anything done in JSON is.
Anyways.
XML means you have to think about how you architect information exchange much more thoughtfully, because XML is much stricter and follows very explicit rules. It also means you have to develop a schema for your interchange.
With JSON, you can, more or less, just serialize as is, without much thought, and it can be understood by the client. You're not obligated by the protocol to develop a schema or anything of that nature.
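To make the "just serialize as is" point concrete, here is a minimal sketch in Python. The `Order` class and its fields are made up for illustration; the only point is that nothing in the round trip consults a schema.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Order:  # hypothetical application object, not from any real API
    id: int
    customer: str
    items: list


order = Order(id=42, customer="Ada", items=["sink", "faucet"])

# Dump the in-memory object directly; no schema is defined or checked.
payload = json.dumps(asdict(order))

# The consumer parses it back and picks out whatever it needs.
received = json.loads(payload)
customer = received["customer"]
```

Nothing stops the producer from renaming `customer` tomorrow, which is the flip side of the convenience.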
Yes, parts of the XML world have now bled over (defined schema and path algorithms being the big ones I think) but they're optional for better or worse.
Everyone takes the path of least resistance when given the opportunity at the end of the day. This is especially true of software engineers I've found.
In practice, I think most of us used XML the same way that people use JSON today: here's some data from my app, figure out how to pull out the bits you need. No high ceremony required.
You're probably right.
> Basically, it's easy to "hold it wrong" in a way that harms consumers.
I feel a bit like this could equally apply to things like GraphQL. I spend more time reading the docs on how someone's internal data model is built than I do writing GraphQL queries. And if I get that data model slightly wrong, my query's garbage.
But that's probably a separate, vendor specific (coughnewrelic), rant.
One that keeps biting us in GQL, for instance, is that it won't let you define a union type for mutation inputs, so you end up needing N methods like `GetByX(X)`, `GetByY(Y)`, `...` instead of a single `Get(X|Y|...)`. Not to mention support for versioning message types, etc...
FWIW: I think proto3 is probably the best I've used, at length and in production. Granted it has its "warts", but the idioms to circumvent them are fairly well-documented and agreed upon, even if they're a bit "ugly" in the syntactical sense.
FWIW pt 2: Like yourself, I think, I would not consider JSON to be appropriate as a schema-defining "contract" language in 2022, for a company that plans to be around in 5 years. There are too many better options available.
Enterprise solutions are often so complex and generic that they can theoretically do and interoperate with anything, but are also hard to get started with and use well.
People like to start with simple things and expand from there instead of buying a whole house when they just want a sink (am I using the phrase right?). In my opinion this makes a lot of sense because it avoids unnecessary complexity and sensitizes people to why some complexity is in fact necessary.
I know that was my last straw before I started assuming that anything that required XML would be painful to work with. It wasn't an entirely fair assumption, but it was correct often enough that I used it as a helpful heuristic.
XML was built for a very limited set of purposes - to create a common base markup for various document formats. It was almost purpose-built to create something like XHTML - a mixed content system where you can do semantic decoration of textual media content, and extend it with things like SVG and MathML.
The problem is that it was _not_ built as a cross-platform data structure interchange format, but it wound up being used for that way more than for its intended purpose. This was partially because of the extensibility story - companies could agree on a common base format, and define their own extensions to add additional data. However, this was a pain - the XML tooling was often generic to support both kinds of usage, and the language itself was ambiguous because of the expectation that the document format being described would have its own format-specific clarifications on use and tooling.
JSON is an object notation - it is a way to transmit hierarchical data. It has limited extensibility in the sense that you can define rules for data processing, such as 'ignore things you don't understand' or 'name things which are not agreed upon with URIs rather than short names'.
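A rough sketch of what those two processing rules can look like on the consumer side. The field names, the `read_user` helper, and the `example.com` extension key are all hypothetical; this is one way to apply the conventions, not a standard mechanism.

```python
import json

# Hypothetical set of fields this particular consumer understands.
KNOWN_FIELDS = {"id", "name"}


def read_user(raw: str) -> dict:
    """Parse a JSON object and keep only the fields this reader knows."""
    data = json.loads(raw)
    return {key: value for key, value in data.items() if key in KNOWN_FIELDS}


# A producer adds an extension field, named with a URI to avoid collisions.
message = '{"id": 7, "name": "Ada", "https://example.com/ext/loyalty": "gold"}'

# The unknown extension key is silently ignored rather than rejected.
user = read_user(message)
```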
Trying to use JSON to represent the content model of HTML will just cause pain, because that's not what it was built for. It isn't even re-inventing things from XML, it is just cramming a square peg into a round hole.
Neither format was built to be a configuration file format for users to hand-edit config. As a result, they suffer limitations in their syntax and features (closing elements in XML, quoted property names and lack of comments in JSON being the most commonly cited). TOML is one popular choice for this sort of use case.
But programs that store their data as XML rather than some proprietary binary or text format are awesome.
However, I'm sure a lot of that code still exists.
XML 1.0 was decent but I soured on most of the “standards” based on it after too many iterations of chasing through a chain of specs pulled in by reference where you had to read a bunch of ponderous W3C documents and non-working examples to learn that the spec authors hadn’t correctly modeled the problem, nobody had time to work on any of this, and the only extant implementations either weren’t compatible or had a lot of tedious workaround code. Bonus points if they were replacing a legacy format and ended up with a result which still required deep familiarity with the original format but was also much less efficient.
These problems are cultural. JSON certainly isn’t immune to this but the Java/XML world has more people who felt the need to LARP as Very Serious Architects designing extremely expensive systems. Things like Atom show grownups can use XML, too, but they’re notably the exception.
In Java, the most direct counterparts I see are the places where people felt like they should copy the Sun library developers for code which is far less universally used and took on a huge support cost building abstractions and customization points which were largely unused, often only for security exploits.
This is precisely one of my complaints against XML: it does depend on whitespace. In JSON, I know that any excess formatting whitespace can safely be removed. But excess formatting whitespace is part of the document in XML, and I can't know in general whether it can be safely removed or not.
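A small illustration of the difference, using Python's standard `json` and `xml.etree.ElementTree` modules. The sample documents are made up; the point is only that pretty-printing leaves the parsed JSON unchanged while the XML parser hands the indentation back as text content.

```python
import json
import xml.etree.ElementTree as ET

# JSON: whitespace between tokens is insignificant, so both parse identically.
compact_json = '{"a":{"b":1}}'
pretty_json = '{\n  "a": {\n    "b": 1\n  }\n}'
json_same = json.loads(compact_json) == json.loads(pretty_json)

# XML: the indentation is character data and is reported by the parser.
compact_root = ET.fromstring("<a><b>1</b></a>")
pretty_root = ET.fromstring("<a>\n  <b>1</b>\n</a>")

# Text before the first child and after it ("tail") differs between the two.
compact_ws = (compact_root.text, compact_root[0].tail)
pretty_ws = (pretty_root.text, pretty_root[0].tail)
```

Whether that extra text matters depends on the vocabulary: it is noise in a data payload but could be meaningful inside mixed content, and the parser alone cannot tell which.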
But using complex, poorly specified, possibly Turing-complete "config" files written in a markup language that isn't the primary language your app is written in is a serious code smell.
It means you would either be better off using an embedded scripting language (like Groovy) or a better core language.
It can be argued that, despite not being a primary use case for markup, XML has found a useful niche in b2b service payloads, and in government, banking, and health services in particular. The use case for those might have been "web services" in the original sense, where simple CSS-like transforms and styles are applied to payloads for display in browsers, but the JSON community hasn't brought forward a serious replacement for XML Schema, so XML payloads kind of keep sticking around in long-term projects.
A deeply nested JSON document is difficult to navigate. Even with pretty-printing, you end up counting indentation to find out what kind of info a given level holds.
Take the HTML page for news.ycombinator.com and convert the HTML document into JSON format. It becomes unreadable.
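For a feel of what that conversion does, here is a sketch on a single made-up paragraph rather than the real page. The `to_json_tree` mapping (tag, attrs, children) is just one hypothetical encoding; other encodings exist, but any of them has to interleave text runs with child elements somehow.

```python
import json
import xml.etree.ElementTree as ET


def to_json_tree(elem):
    """Map an element to a dict, keeping text and child elements in order."""
    children = []
    if elem.text:
        children.append(elem.text)
    for child in elem:
        children.append(to_json_tree(child))
        if child.tail:
            # Text that follows a child element belongs to the parent's content.
            children.append(child.tail)
    return {"tag": elem.tag, "attrs": dict(elem.attrib), "children": children}


# A made-up, well-formed fragment with mixed content (text plus inline tags).
fragment = '<p>Read the <a href="/item?id=1">comments</a> on <b>HN</b>.</p>'

tree = to_json_tree(ET.fromstring(fragment))
as_json = json.dumps(tree, indent=2)
print(as_json)
```

One short sentence turns into five `children` entries spread over a couple of dozen lines, and a whole page multiplies that by every inline link and bold tag.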