
0 points · falcolas · 3y ago

Personal rant: I've never understood the hate towards XML. It's a practical and flexible markup language that does not depend on whitespace or quoting every bloody thing. Plus, every markup language invented since has had to re-invent XML things that, surprise surprise, were actually needed: paths, schemas, comments, etc.

I get frustrated with JSON because of things I could do in XML that I can't do in JSON without breaking the spec. And that doesn't even cover how annoyingly verbose anything done in JSON is.

Anyways.


25 comments · 12 top-level

no_wizard · 3y ago · 1 in thread

The bottom line, from my experience, is this:

XML means you have to think much more carefully about how you architect information exchange, because XML is much stricter and follows very explicit rules. It also means you have to develop a schema for your interchange.

With JSON, you can, more or less, just serialize as is, not much thought in the world, and it can be understood by the client. You're not obligated by the protocol to do anything about developing a schema or anything of that nature.

Yes, parts of the XML world have now bled over (defined schema and path algorithms being the big ones I think) but they're optional for better or worse.

At the end of the day, everyone takes the path of least resistance when given the opportunity. This is especially true of software engineers, I've found.

gmfawcett · 3y ago

Eh, not really. You have the option of defining schemas, of course, but it's not required.

In practice, I think most of us used XML the same way that people use JSON today: here's some data from my app, figure out how to pull out the bits you need. No high ceremony required.

gen220 · 3y ago · 7 in thread

I think people dislike XML precisely because it's so flexible and unopinionated. It means that when you're parsing a new XML source, you have to look for data that might be encoded in two or three little niches. God help you if it's encoded in each of them, and if the data therein is contradictory. Basically, it's easy to "hold it wrong" in a way that harms consumers.
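To make the "two or three little niches" point concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The `<item>` documents and the `read_price` helper are hypothetical, invented for illustration; they show the same value arriving as an attribute, as a child element, or as text content.

```python
import xml.etree.ElementTree as ET

# Three hypothetical documents that carry the same price in different places.
AS_ATTRIBUTE = '<item price="9.99"/>'
AS_CHILD = '<item><price>9.99</price></item>'
AS_TEXT = '<item>9.99</item>'


def read_price(xml_text):
    """Return the price from whichever niche holds it, or None if absent."""
    item = ET.fromstring(xml_text)
    # Niche 1: an attribute on the element itself.
    if "price" in item.attrib:
        return float(item.attrib["price"])
    # Niche 2: a dedicated child element.
    child = item.find("price")
    if child is not None and child.text:
        return float(child.text)
    # Niche 3: the element's own text content.
    if item.text and item.text.strip():
        return float(item.text)
    return None


prices = [read_price(doc) for doc in (AS_ATTRIBUTE, AS_CHILD, AS_TEXT)]
```

A consumer without a schema has to check every niche, and also has to pick a precedence order for the case where more than one is filled in and they disagree.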

falcolas (OP) · 3y ago

> I think people dislike XML precisely because it's so flexible and unopinionated.

You're probably right.

> Basically, it's easy to "hold it wrong" in a way that harms consumers.

I feel a bit like this could equally apply to things like GraphQL. I spend more time reading the docs on how someone's internal data model is built than I do writing GraphQL queries. And if I get that data model slightly wrong, my query's garbage.

But that's probably a separate, vendor-specific (cough, New Relic) rant.

gen220 · 3y ago

Every "contract" language (e.g. .graphql) I've used so far has its ups and downs.

One that keeps biting us in GQL, for instance, is that it won't let you define a union type for mutation inputs, so you end up needing N methods like `GetByX(X)`, `GetByY(Y)`, `...` instead of a single `Get(X|Y|...)`. Not to mention support for versioning message types, etc...

FWIW: I think proto3 is probably the best I've used, at length and in production. Granted it has its "warts", but the idioms to circumvent them are fairly well-documented and agreed upon, even if they're a bit "ugly" in the syntactical sense.

FWIW pt 2: Like yourself, I think, I would not consider JSON to be appropriate as a schema-defining "contract" language in 2022, for a company that plans to be around in 5 years. There are too many better options available.

solarkraft · 3y ago

Over-engineering (relative to the complexity of the task you want to solve).

Enterprise solutions are often so complex and generic that they can theoretically do and interoperate with anything, but are also hard to get started with and use well.

People like to start with simple things and expand from there instead of buying a whole house when they just want a sink (am I using the phrase right?). In my opinion this makes a lot of sense because it avoids unnecessary complexity and sensitizes people to why some complexity is in fact necessary.

mind-blight · 3y ago

I think it also got horribly abused by some use cases. People tried to allow programming in it (e.g. if statements and loops) when an actual DSL would have been a better solution.

I know that was my last straw before I started assuming that anything that required XML would be painful to work with. It wasn't an entirely fair assumption, but it was correct often enough that I used it as a helpful heuristic.

lenkite · 3y ago

A schema can be written in 15-30 mins to make XML as opinionated as you want it to be.

Spivak · 3y ago

I think the other smell is not using serializers with XML. If, when you consume a document, you’re not getting an actual useful object back, of course you’re going to hate it.

adgjlsfhk1 · 3y ago

and then you'll read an xml file that doesn't meet your schema and your code will crash


dwaite · 3y ago

Formats are designed for a purpose.

XML was built for a very limited set of purposes - to create a common base markup for various document formats. It was almost purpose-built to create something like XHTML - a mixed content system where you can do semantic decoration of textual media content, and extend it with things like SVG and MathML.

The problem is that it was _not_ built as a cross-platform data structure interchange format, but it wound up being used for that way more than for its intended purpose. This was partially because of the extensibility story - companies could agree on a common base format, and define their own extensions to add additional data. However this was a pain - the XML tooling was often generic to support both kinds of usage, and the language itself was ambiguous because of the expectation that the underlying document being described would have document-specific clarifications on use and tooling.

JSON is an object notation - it is a way to transmit hierarchical data. It has limited extensibility in the sense that you can define rules for data processing, such as 'ignore things you don't understand' or 'name things which are not agreed upon with URI rather than short names'.

Trying to use JSON to represent the content model of HTML will just cause pain, because that's not what it was built for. It isn't even re-inventing things from XML, it is just cramming a square peg in a round hole.

Neither format was built to be a configuration file format for users to hand-edit config. As a result, they suffer limitations in their syntax and features (closing elements in XML, quoted property names and lack of comments in JSON being the most commonly cited). TOML is one popular choice for this sort of use case.

wvenable · 3y ago · 4 in thread

In Java-land, XML became a replacement for actual code. And for that, it's kind of terrible.

But programs that store their data as XML rather than some proprietary binary or text format are awesome.

vips7L · 3y ago

I’ve never written XML in Java in the past 7 years of being a developer.

wvenable · 3y ago

It's not the way Java is done now because, as noted, it was awful. Java is almost 30 years old now. Developers who worked in Java 20 years ago had a pretty different experience from what it is like now.

However, I'm sure a lot of that code still exists.


pwinnski · 3y ago

In Spring-land, I think you mean.

I enjoy Java. I do not enjoy Spring.

brazzy · 3y ago

Using XML became optional in Spring as soon as Java got annotations.

Which was in Java 5.

18 years ago.

acdha · 3y ago

It’s very similar to Java: there’s a not bad language buried under huge layers of bad practice and design by committee which most people are stuck using, with a side serving of market failure around the developer experience (e.g. Maven, Spring on the Java side, tools being built on the orphaned libxml meaning that you’re stuck in the early 2000s level of XPath, etc. in many cases).

XML 1.0 was decent but I soured on most of the “standards” based on it after too many iterations of chasing through a chain of specs pulled in by reference where you had to read a bunch of ponderous W3C documents and non-working examples to learn that the spec authors hadn’t correctly modeled the problem, nobody had time to work on any of this, and the only extant implementations either weren’t compatible or had a lot of tedious workaround code. Bonus points if they were replacing a legacy format and ended up with a result which still required deep familiarity with the original format but was also much less efficient.

These problems are cultural. JSON certainly isn’t immune to this but the Java/XML world has more people who felt the need to LARP as Very Serious Architects designing extremely expensive systems. Things like Atom show grownups can use XML, too, but they’re notably the exception.

In Java, the most direct counterparts I see are the places where people felt like they should copy the Sun library developers for code which is far less universally used and took on a huge support cost building abstractions and customization points which were largely unused, often only for security exploits.

winstonewert · 3y ago · 1 in thread

> that does not depend on whitespace

This is precisely one of my complaints against XML: it does depend on whitespace. In JSON, I know that any excess formatting whitespace can safely be removed. But excess formatting whitespace is part of the document in XML, and I can't know in general whether it can be safely removed or not.
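A small sketch of that difference, using only Python's standard `json` and `xml.etree.ElementTree` modules. The `person` documents are made up for illustration. Indentation between JSON tokens disappears on parse, while the same indentation in XML comes back as text on the parent element.

```python
import json
import xml.etree.ElementTree as ET

PRETTY_JSON = '{\n  "name": "Ada"\n}'
COMPACT_JSON = '{"name":"Ada"}'

PRETTY_XML = '<person>\n  <name>Ada</name>\n</person>'
COMPACT_XML = '<person><name>Ada</name></person>'

# JSON: whitespace between tokens never reaches the parsed value.
json_equal = json.loads(PRETTY_JSON) == json.loads(COMPACT_JSON)

# XML: the indentation before <name> is kept as text on <person>.
pretty_text = ET.fromstring(PRETTY_XML).text
compact_text = ET.fromstring(COMPACT_XML).text
```

Here `pretty_text` holds the newline and two spaces, while `compact_text` is `None`. Whether that text is meaningful or just formatting is exactly what the parser cannot decide on its own.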

sixbrx · 3y ago

Right, it requires a schema to even know whether whitespace is to be considered significant. And schemas are a whole nother can of worms...

cptskippy · 3y ago

Java/.NET have first class support for XML and SOAP, most other languages don't. It's too easy to use XML in those languages. Other languages don't have that tooling or support and it's just a slog.

lliamander · 3y ago

Once upon a time I was deeply in the world of XML Schema, XSLT, and XQuery. It was actually pretty cool.

But using complex, poorly specified, possibly Turing-complete "config" files written in a markup language that isn't the primary language your app is written in is a serious code smell.

It means you would either be better off using an embedded scripting language (like Groovy) or a better core language.

tannhaeuser · 3y ago

I think Spring, Maven (and ant before), JSP, and J2EE descriptors gave XML a bad name, as fields of application where a markup language wasn't adequate. XML was meant as a simplification of SGML to formalize and extend HTML on the web, but was overhyped as serialization format for everything and anything for all the wrong reasons, among them that XML sold well to management.

It can be argued that, despite not being a primary use case for markup, XML has found a useful niche in b2b service payloads and government, banking, and health services in particular. The use case for those might have been "web services" in the original sense where simple CSS-like transforms and styles are applied to payloads for display in browsers, but the JSON community hasn't brought forward a serious replacement for XML Schema, so XML payloads kind of keep sticking around in long-term projects.

silvestrov · 3y ago

xml also scales much better with deeply nested documents (such as html documents).

A deeply nested JSON document is difficult to navigate. Even with pretty-printing, you end up counting indentation to work out what kind of info a given level holds.

Take the html page for news.ycombinator.com and convert the html document into JSON format. It becomes unreadable.

AnimalMuppet · 3y ago

You're comparing XML and JSON, and you're complaining about how verbose JSON is? Um, JSON is, like, half as verbose as XML is.
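One rough way to look at the verbosity question is to serialize the same flat record both ways and count characters. This is only a sketch with a made-up `record`, using Python's standard `json` and `xml.etree.ElementTree`; the ratio depends heavily on the shape of the data and on whether the XML uses child elements or attributes, so it says nothing general about either format.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record; all values are strings so both encodings are comparable.
record = {"id": "42", "name": "Ada", "email": "ada@example.com"}

# Compact JSON with no spaces after separators.
as_json = json.dumps(record, separators=(",", ":"))

# XML with one child element per field (every name appears twice).
root = ET.Element("user")
for key, value in record.items():
    ET.SubElement(root, key).text = value
as_element_xml = ET.tostring(root, encoding="unicode")

# XML with one attribute per field (every name appears once).
as_attribute_xml = ET.tostring(ET.Element("user", record), encoding="unicode")

sizes = {
    "json": len(as_json),
    "xml_elements": len(as_element_xml),
    "xml_attributes": len(as_attribute_xml),
}
```

For this record the element-per-field XML comes out longest because each field name is repeated in the closing tag, while the attribute form lands much closer to the JSON.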

aeze · 3y ago

You consider JSON to be more verbose than XML? I'm curious to hear why.
