I would not be surprised if someone soon announces a "JSON Transformation" tool that can convert one JSON schema to another. Followed shortly by a standard for JSON namespaces so you can mix different schemas, a standard for binary JSON, a standard for JSON-encryption, and so on.
"Those who cannot remember the past are condemned to repeat it."
CORBA is laughing its head off: "Who's bloated and overly complex now, eh?"
XML: "Well at least I developed an appreciation of the problem domain... Unlike those arrogant JSON kids"
CORBA: "You were the same way at their age."
XML: "Sure was, gramps!"
(Both chuckle)
Back in college, we tried our hand at using Ice http://www.zeroc.com/iceVsCorba.html and it seemed to do the trick.
Still, it does assume that both parties have access to the template generated by Ice, which brings us back to the issues around needing some sort of schema.
That being said, there do seem to be options for interacting using JSON: http://ice2rest.make-it-app.com/ and as of 3.3.1 it apparently supports Google's Protocol Buffers for flexibility.
JSON schema:
Validating schemas is not nearly as important for JSON as it is for XML. Thanks to JSON's relative simplicity, the problem of "almost-but-not-quite valid" encodings mostly goes away.
As a result, not many people use JSON schema.
Even when I work with XML, I very rarely come across code that performs actual XML validation. Most people just wing it and hope nothing breaks. That's the first dirty secret of XML validation.
The second dirty secret is that if you are consuming an API that provides invalid XML (a common occurrence), you just deal with it and try to make it work. XML validity be damned.
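For what it's worth, JSON's strictness helps here: there's much less room for "almost valid" documents in the first place, because strict parsers reject the common near-misses outright. A quick stdlib sketch (sample strings invented):

```python
import json

# Three common "almost JSON" mistakes; a strict parser rejects them all.
rejected = []
for almost in ['{"a": 1,}', "{'a': 1}", '{"a": undefined}']:
    try:
        json.loads(almost)
    except json.JSONDecodeError:
        rejected.append(almost)

print(len(rejected))  # 3 -- every near-miss was refused
```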
Collection+JSON:
Literally never heard of this. Don't need it either. Once the JSON is deserialized you can use the language's own tools (and a combination of lists/arrays + maps). So what's the point? I don't miss XPath or XQuery.
Siren:
A solution attempting to solve a non-problem, copied from a solution in the XML world that didn't solve a real problem either.
Not really, JSON is a simpler format with better parsing built-in for most languages. It is easier to use for programmers and performs better across the network.
This is as true today as it was seven years ago when I wrote this article: https://mkaz.com/2007/03/02/goodbye-xml-hello-json/
Until that whole leaky abstraction problem kicks in.
Haven't touched it in a few years, but I think the core idea is sound: as much schema as can be packed into 15 additional productions beyond the original 15 in the JSON spec.
I wrote a typeloader for Gosu based on it:
A core difference between it and your JSchema is that it, itself, is not JSON. Just as with XML, I don't think JSON makes for a good format to write down schema definitions. In fact, I don't think JSON is very human friendly at all[0], which is OK for a data interchange format, occasionally read by humans but hardly ever written by hand.
I did not further develop RELAX JSON, however, when I realized that TypeScript interface definitions[1] are a great JSON schema language:
enum Color { Yellow, Brown }

interface Banana {
  color: Color;
  length: number;
}

interface FruitBasket {
  bananas: Array<Banana>;
  apples?: Array<{color: Color; radius: number}>;
}
It's best to use interfaces and not classes, because interfaces allow for optional fields (with the `?` suffix), which is pretty common in APIs and wherever JSON is used. I will write a validate-JSON-with-TypeScript-type-definitions checker as soon as I find a need for it. Open to ideas here, guys! (or alternatives)
[0] Compare gruntfiles to webpack configs (tl;dr: they're JSON and JS-that-returns-an-object, respectively. the latter allows code, comments, unquoted keys, etc etc).
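To make the checker idea above concrete, here's a minimal structural-validation sketch in Python. The schema encoding (dicts for interfaces, a trailing `?` on optional field names, one-element lists for `Array<T>`) is invented for illustration, not any real tool:

```python
def check(value, schema):
    """Hypothetical structural check of parsed JSON against an interface-like schema."""
    if isinstance(schema, dict):          # interface: field name -> field schema
        if not isinstance(value, dict):
            return False
        for field, field_schema in schema.items():
            optional = field.endswith("?")
            name = field.rstrip("?")
            if name not in value:
                if not optional:
                    return False          # required field missing
                continue
            if not check(value[name], field_schema):
                return False
        return True
    if isinstance(schema, list):          # Array<T>: one-element list [T]
        return isinstance(value, list) and all(check(v, schema[0]) for v in value)
    return isinstance(value, schema)      # primitive: a Python type

banana = {"color": str, "length": float}
basket = {"bananas": [banana], "apples?": [{"color": str, "radius": float}]}

print(check({"bananas": [{"color": "yellow", "length": 7.0}]}, basket))  # True
print(check({"bananas": [{"color": "yellow"}]}, basket))                 # False
```

A real validator generated from TypeScript definitions would of course have to handle unions, nested interfaces and so on; this only shows the shape of the idea.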
I'm not exactly guilty of that, but I do have some use cases, where one json document needs to be mapped to another one.
In a weak moment, I actually sketched a simple protocol for using javascript to express these "transformations": https://gist.github.com/miku/620aecc5ad782f261e3b
But I agree there is some pressure on JSON. And if someone can come up with a way to do schema and transformation that isn't complex, it will be adopted like crazy.
Counter-point: (1) all the cool kids use dynamic typing these days, and don't need schema (schema is a form of typing). (2) transformation is easy in fp (which the cool kids are also using), and don't need a separate transformation tool.
Uh, no. In general if you're sensible enough to use JSON as a data interchange format, you're probably sensible enough to use a real programming language to do transformation.
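In that spirit, a JSON-to-JSON "transformation" in a real language is just an ordinary function over parsed data. The input/output shapes below are invented for illustration:

```python
import json

def to_summary(order):
    """Map one JSON shape to another with plain code -- no transform DSL needed."""
    return {
        "id": order["orderId"],
        "total": sum(item["price"] * item["qty"] for item in order["items"]),
    }

src = json.loads('{"orderId": 7, "items": [{"price": 2.5, "qty": 4}, {"price": 1.0, "qty": 2}]}')
print(json.dumps(to_summary(src)))  # {"id": 7, "total": 12.0}
```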
I agree with your counterpoints, but the cool kids are still having issues with transport representation of arbitrary types. Sure, Ruby (for example) can use Marshal dump and load to marshal arbitrary types, but what happens when the other end doesn't have the type loaded or available?
Ouch, maybe we should invite Java Enterprise Beans to the party to comment on the semantics of distributed types?
JSON is currently deceptively simple precisely because its wire representation (with simple types) is equivalent to its type definition which can be directly evaled in js to produce a memory instantiation. But none of that holds in the general case... Try for example marshalling a JSON object with a tree structure.
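To make the tree/graph point concrete, a minimal stdlib demonstration (data invented): a cyclic structure can't be serialised at all, and shared substructure quietly loses its identity in transit.

```python
import json

# A cycle cannot be marshalled at all:
node = {"value": 1, "children": []}
node["children"].append(node)  # node now contains itself
try:
    json.dumps(node)
    survived = True
except ValueError:             # raises "Circular reference detected"
    survived = False

# And shared structure silently becomes duplication on the way through:
shared = {"id": 1}
doc = {"left": shared, "right": shared}       # one object, referenced twice
round_tripped = json.loads(json.dumps(doc))   # now two independent copies
print(survived, round_tripped["left"] is round_tripped["right"])  # False False
```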
Maybe we end up going in Alan Kay's direction, where we stop trying to send representations across the wire and just send code... But that too was tried in the past. It has other limitations.
It's complicated.
Thus any toolset that tries to address relationships, schemas, searching, and grouping in an XML/JSON type data format is going to be exponentially more complicated than RDBMSes and SQL.
So does mustache, with a bit less resistance/overhead.
So does Polymer if you make your xml tags into web components and apply them. It also has all of the benefits of Polymer's isolation.
I think xsl was and is a good idea. I just think that other things have come along that are easier to get into.
What are you even talking about? It seems like you think that XML and JSON are two mutually exclusive technologies and that if one is bad, the other is good. Or that if one is used, the other can't be? I don't understand why you even started talking about JSON; the article was about why XML and XSLT suck, not why JSON is superior to them.
I'm not really sure what point you're making. JSON is meant to make passing data in JavaScript easier. It's a necessary tool for interactive, JavaScript-heavy web frontends. I'm not sure why there's so much condescension in your post for JSON and its tooling. I'm just a bit confused; it comes off like you have a personal stake in XML and you find any mention of JSON, even obliquely, as offensive.
There's no reason that XML and JSON can't both be tools we use when the situation calls for it. This kind of dogmatic defense or condemnation of technologies offhand doesn't really do the HN community or the programmers at large any good.
No, it's not.
On XSLT: find something that fills its role completely, with the same level of tooling, and then have a rant about inferior tools being popular, until then it doesn't really matter if it sucks.
I see this timeline all the time: 1. User posts article about xyz technology 2. User posts article negating post 1 3. Most boring shitstorm ever occurs 4. karma karma karma
As a reddit refugee, I was hoping for a little more on HN.
Quote
> I'm not even talking about the hideously verbose syntax, or the completely obtuse data model. The fact that you can't know what any single line of code does without reviewing every other line in the program makes this language an abomination.
“XML is simply lisp done wrong.” – Alan Cox
It's not that XSLT isn't useful in some situations. It is. It's not that clean, simple and efficient XSLT is impossible. It's possible, but it's hard.
The fact that it isn't Turing complete can be a good thing. It can also cause a lot of headaches.
The main problem is that XSLT as designed and as implemented is an over-engineered god-awful mess. XSLT 2 was a huge improvement, but nobody implemented it, or they maybe only implemented bits of it in nonstandard ways (MSXML), so none of the better parts were reliable.
The idea of XSLT was sound and XPATH was pretty nice, but anyone who thinks XSLT is "good" probably has never worked on a large XSLT-based project (one where XSLT files routinely include other XSLT files and XML documents routinely link to other XML documents via xlink).
People say complexity gets out of control with OOP. Those issues pale into insignificance compared with rampant pattern matching split over many files when you have dozens of different schemas and are dealing with massive document graphs (with the occasional cyclic edge for good measure).
Good luck trying to reliably predict results in advance, or add any sort of control-flow logic to deal with edge cases without resorting to hard-coding and unrolling recursion.
It is a functional language, probably one of the reasons people don't like it.
Please don't insist on calling your bastard child of a language "functional."
It's a declarative language. It's a functional language if you squint at it, sort of. Maybe. It certainly doesn't fit the most broadly accepted (stricter) definitions of what a functional language is.

The biggest problem, with XSLT 1 at least (and that's primarily what I'm talking about, because that's what exists in the wild), is that the output of a template function is a string, whereas the input is a node tree. So to get generalised recursion or perform function composition over the input data, you need to do some evil tricks.
To give the impression of Turing completeness, you need to first parse the input stream by splitting it up into strings and then call functions that perform string processing, which means you can no longer use xpath, or any of the things that really make XSLT. At that point you're not really writing XSLT anymore, you're doing simple recursive string processing using XSLT as a horrendously overweight and improperly-equipped wrapper around your native string libraries.
Incidentally, I don't accept the linked article as a proof of Turing completeness, since it only implements some trivial programs using Turing machines. That's not the same thing as proving equivalence with a universal Turing machine. However, others seem to have done it, presumably using tricks such as string processing, or ignoring the input altogether.
You could for instance, recreate Conway's Game of Life using pure XSLT, but it wouldn't contain any actual XML transformation code and all the state would exist in the running XSLT program, not the XML document. It would also blow the stack as most XSLT engines don't support tail recursion and there's no other way to loop indefinitely.
What you certainly can't do is have one function that outputs a set of XML nodes and another function that has a for-each that consumes that set, or have the ability to run apply-templates over it. That is nothing like functional programming.
It's about 30 lines of XSLT that run in the browser. [Edit: it's not 30 lines but 230, but I was thinking of the number of "rules" (templates) of which there are only 29.]
There are very few other tools of its kind and I don't think any client-side one exists with the same simplicity. This attempt for example
https://github.com/domchristie/to-markdown/blob/master/src/t...
is about 180 lines of JS, is incomplete, doesn't work with many special cases, etc.
There is no better templating language than XSLT; every other templating approach (in PHP or Python on the server, in JavaScript on the client) feels like a horrible kludge once you've experienced XSLT.
Yes, XSLT is practically dead, that's a fact. But we should be very sad about it, instead of dancing on the coffin like the OP with its stupid quotes.
It looks like 234 lines of code to me.
>This attempt for example https://github.com/domchristie/to-markdown/blob/master/src/t....
Is badly written, but still written in a better language. They're using regexps to parse HTML (omfg!), but that kind of nastiness doesn't excuse XSLT as a language.
>There is no better templating language than XSLT
Except mako, jinja2, django templating language, liquid, etc.
"written in a better language" doesn't mean much, however. A better language for what? I'm not picking on JavaScript, which I love and use every day; but templating in JS versus XSLT is crazy.
The templating languages that you mention are, in my opinion, extremely complex and very unpalatable; and they only work server-side.
But that's all a matter of taste, I guess. What I don't understand is why so many people go out of their way to declare their hate of XSLT (and all things XML), especially now that XSLT is all but dead...?
I really like the computational model of XSLT (push vs. pull), it is so elegant. But it takes quite some time to fully understand what is going on.
What I think is bad is that the infrastructure for XSLT is not perfect. There is only one good XSLT 2 processor that I know of; everything else is XSLT 1.
I am currently eliminating some XSLT scripts with custom (Go) programs, because of speed issues.
1. parsed on server startup for setting up persistence, business rules, REST endpoints etc
2. transformed by XSLT to a) produce nice HTML documentation, including DOT class diagrams b) generate Java source code c) validate declaration integrity and cross-referencing
With the right XSDs, IDE support is excellent (auto complete for everything). Take the time to learn it, apply it according to your needs, and reap the benefits- in the long run, maintenance work is down by an order of magnitude.
In fact, the latter point is one of the reasons most people like to avoid XML.
And these are exactly those who will never "get" that XML is much more than a data container for tree-like structures. They should stick to JSON or CSV for that matter.
Mixed content in JSON is definitely not as simple as it is in XML.
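For the record, "mixed content" means text interleaved with markup. XML represents it natively; in JSON you have to invent a convention. A small sketch (the tagged-list JSON encoding below is made up, not any standard):

```python
import json
import xml.etree.ElementTree as ET

# XML: mixed content just works -- leading text, child element, trailing text.
p = ET.fromstring("<p>Hello <b>world</b>!</p>")
parts = (p.text, p[0].tag, p[0].text, p[0].tail)

# JSON: one possible ad-hoc encoding, a nested tagged list.
mixed = ["p", "Hello ", ["b", "world"], "!"]
print(parts, json.dumps(mixed))
```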
I understand writing XSLT and XML Schema can be difficult and I see how typing out XML namespaces can be a pain, but every sentence about XML in that article is a joke. Those quotes are all intended to be funny, not objective. No one actually brought any objective facts against XML. Because they can't. The fact is it is widely used in many places. Can anyone tell me an alternative for serialising an object tree where you also need to preserve ordering and type information, you need to store text longer than one line, or you just need to store any kind of formatting information? (And yes, you can use JSON to do that, but the resulting document will be 5x longer.)
(meta: Funny quotes bashing useful technologies is the cat video equivalent of HN. Last week's article beating OOP was the same pattern.)
* XML is complicated enough that its parsers are commonly full of obscure bugs. JSON/YAML doesn't have this problem.
* XML is complicated enough that its parsers can have security vulnerabilities (e.g. see billion laughs for just one). JSON/YAML doesn't have this problem.
* XML is complicated enough that you can create an almost-but-not-quite valid encoding. The (already complicated enough) parsers have to deal with this and the ones that don't are considered broken. JSON/YAML doesn't have this problem.
* XML's complexity does not give you any additional benefit over YAML or JSON. Serializing/deserializing dates as strings is not a problem. It never was.
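The "almost-but-not-quite valid" point is easy to reproduce: a bare ampersand, which shows up constantly in real-world feeds, is enough to break a conforming parser (document contents invented):

```python
import xml.etree.ElementTree as ET

# "fish & chips" is not well-formed XML; the & must be escaped as &amp;
try:
    ET.fromstring("<a>fish & chips</a>")
    well_formed = True
except ET.ParseError:
    well_formed = False

print(well_formed)  # False
print(ET.fromstring("<a>fish &amp; chips</a>").text)  # fish & chips
```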
XSLT is just the shitty icing on the already crappy cake. A committee created a disastrous turing complete programming language to munge this already overcomplicated data format.
> XML's complexity does not give you any additional benefit over YAML or JSON.
This is so incredibly wrong, on every level, that it beggars belief and reads like something you would come across on a "beginning programmers" forum. As others have said, JSON/YAML have thus far seen limited usage (no, that configuration file in your app is not a complex example). But as usage grows, people are starting to ask questions like "Gosh, wouldn't it be nice if my perimeter, or the source system via a metadata file, could validate the JSON passed to us" and "Wouldn't it be nice to be able to convert from one JSON form to another."
And the exact same complexity is arising...poorly, and with the same hiccups that the XML system went through.
I mean some of the comments are incredible. Like "JSON is simple enough that errors aren't big" -> Hey, sorry that those bank transfers got lost, but it turns out that we mistyped the account number field name and the destination system just ate it. Json.
¯\(°_o)/¯
Sorry that the dates are completely wrong, but all of those years of discovery about time zones and regional settings...just make it some sort of string and they'll figure it out.
¯\(°_o)/¯
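The date jab is fair: "just make it some sort of string" pushes the hard part onto every consumer. Even staying within ISO 8601, the same instant has multiple spellings (example values invented):

```python
from datetime import datetime, timezone, timedelta

utc = datetime(2014, 6, 1, 12, 0, tzinfo=timezone.utc)
cet = utc.astimezone(timezone(timedelta(hours=2)))

same_instant = (utc == cet)                         # True: one moment in time
same_string = (utc.isoformat() == cet.isoformat())  # False: two different spellings
print(utc.isoformat(), cet.isoformat(), same_instant, same_string)
```

So a consumer comparing date strings naively will disagree with one comparing parsed instants, which is exactly the class of bug "just a string" invites.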
Or your customer demands you to solve the wrong problem with XML.
A nice example: simple configuration files, which are best described as simply option=value, or maybe JSON if someone wants to go really wild.
A customer comes and wants configuration files to be XML. Then your sales department agrees and now you have to implement XML files. The end result: Configuration files are no longer easily editable by humans. Yay!
Another example: Someone decided that using makefiles is too hard, so let's make the equivalent but with XML! I'm looking at you, ant! Now they still have the same problems as makefiles but they are much harder to edit.
* Configuration files are best made with YAML (it's the most human readable).
* APIs / other forms of serialization/deserialization over a network are best done with JSON (chop it in half and it will fail fast unlike yaml. still fairly readable tho).
* Programming languages (like ant) should not be written in either one ever (fortunately I've never heard of a YAML or JSON based language).
* XML does a bad to terrible job of all three.
When programming in XSLT it is great to fire up a debugger (let's say oXygen), run your transformation, click on the wrong output and be able to go step-by-step backwards.
How many languages designed before 1999 (yeah, XSLT is 15 years old) can claim to be able to do so?
Having written transformations in Python that needed to carry that information... How do you do that in XSLT? (And do you think it's worth writing new code in XSLT?)
Sample:
Q. I'm designing my first DTD. Should I use elements or attributes to store data?
A. Of course. What else would you use?
[1] http://www.flightlab.com/~joe/sgml/faq-not.txt

I personally like XML and XSLT (2.0), but to be able to work efficiently you need to spend some time learning, which is not obvious at first sight.
What about the alternatives?
JSON has a big advantage, which is its unambiguous automapping to objects. This benefit is not that apparent in languages like Java, where you'd still declare a class to represent either the XML or the JSON document. Moreover, there are projects which essentially try to bring schema and namespaces to JSON. JSON-LD is an example of a namespace mechanism without explicit support in the underlying format. There is even a command-line tool, jq, a big part of which is an engine similar to XPath.
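To illustrate the jq comparison: a big part of what jq does is path-based lookup over parsed JSON. A toy version in Python (the function and its tiny dialect are invented for illustration; real jq is far richer):

```python
from functools import reduce

def jq_path(doc, path):
    """Toy jq-style lookup: jq_path(doc, ".a.b") behaves like jq's '.a.b'."""
    keys = [k for k in path.split(".") if k]  # drop the leading empty segment
    return reduce(lambda d, k: d[k], keys, doc)

doc = {"store": {"book": {"title": "XML in a Nutshell"}}}
print(jq_path(doc, ".store.book.title"))  # XML in a Nutshell
```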
S-expressions, if used widely, would probably go down the same path as JSON: recreating a lot of what is considered bloat in XML.
Another mentioned alternative was a custom text format. I assume the author meant just designing a format from scratch. I wrote that to use XML efficiently you need to put in some work. But making a backwards- (and forwards-?) compatible text format that correctly handles malformed and malicious input requires much more effort.
I don't know anything about ndb.
Our company once had an application that processed the end-of-year high school student results and then published them in various newspapers. The input files were text files generated from the Education Department's database for various regions. The process took around 10-15 minutes (lots of rules had to be run against the data). I replaced it with a Windows JScript script and XSLT. It took 15 seconds to transform the data.
That said, I still use XSLT regularly, but I'd be lying if I said I enjoyed working with it. Using a decent IDE for development and debugging can help.
Someone did give me a nice tip for working with and learning XSLT though - "translate your transformation rules directly from simple English to the template rules".
eg. "I need to insert the node <member> under <group>"
<xsl:template match="group">
<group>
<member/>
<xsl:apply-templates/>
</group>
</xsl:template>
"But if the group node doesn't exist, create it":

<xsl:template match="root[not(group)]">
<root>
<group>
<member/>
</group>
<xsl:apply-templates/>
</root>
</xsl:template>

Imagine LISP, but in the hands of Sauron or Palpatine. That's the XML group of technologies.
Just in case you /do/ need to use XML every day: the primary benefit of XSLT is that it lets you avoid using XML libraries to munge some XML. Because the XML libraries are so horrendous to use from any language.
Whether you should be using it for the problem XSLT tries to solve is another question. Probably not, since there are other templating languages.
Unfortunately, running JavaScript on the back-end for this purpose is not something most companies do.
Now if you want to start asking whether it solves the problem as well as today's competing technologies like JSON, at that point I will step back and hold my peace.
Just recently I had the chance to do a rewrite. What I did is I created "my own HTML". Basically every module of Twitter Bootstrap has its own XSLT template. That means you have a very easy XML "HTML" syntax, but the output is Twitter Bootstrap. And every piece of output HTML is defined just once in the whole application so it's easy to maintain.
With the help of XSLT you can abstract a lot of things. One example: I have an element called <colgroup/>. It can contain up to 12 <col/> elements. If I set the @size attribute on one of the columns, the @size attribute for the others will be calculated automatically and the output matches the Twitter Bootstrap CSS classes.
I have to say, I love it. I can't imagine writing the whole mess of Twitter Bootstrap plain HTML in a template anymore.
XML is just death by overengineering
> My definition of powerful and elegant is lisp
Dude, XML is just s-exprs and XSLT is macros.

XML might be overengineered (which, except for a few things, I don't agree with), but there is currently no alternative to it.
Using it for async transformations - html to pdf, customer message format to your message format. Fine.
Angle bracket overload, verbosity of end tags, library support, poor whitespace handling, namespace pain were all obstacles too but it was FP that made standard problems feel like math proofs and for developers to take days to solve problems they could code in minutes in their usual OO/imperative language.
When I see the pain FP causes in the real world I'm never quite sure whether it's nature or nurture. I currently believe it's a bit of both, but the nature part will always hobble FP adoption: if you find algebraic proofs elegant, you will like FP. If you are "normal" and proving a theorem fills you with terror, then you would prefer your programming language to resemble a cookery recipe.
I also believe all templating, especially for code generation, requires three brains: understanding the input data structure, understanding the processing of the template and understanding the behaviour of the output. Each keystroke in your templating language has to be carried out with full understanding of all three parts. It's too much for those who still struggle with more common two-brain programming problems.
From the post I linked in my other comment
> Oh, and the fact that you can call a language functional when it lacks first class functions makes my eye twitch. I'm tempted to upload a video of my eye twitching just to prove it.
XSLT is referentially transparent (no setf for you) but withholds from you most of the goodies that people take for granted in functional or logic programming.
You could see it was written by well-intentioned FP enthusiasts. IMO the best alternative at the time when XSLT was developed would have been XMLPerl - embedding an imperative language in something that deals with the XML-specific parts appropriately. But Perl was never enterprisey enough, and XmlPerl died a painless death.
I've heard this before, but I don't find it to be true for me anyway. XSLT has never really clicked for me, while I really like Clojure and OCaml. Maybe the FP is part of the problem, but I also think XSLT is just a particularly obtuse functional language. XSLT makes it hard to figure out how to express even moderately complex algorithms (e.g. a map-reduce function is literally just that in Scheme, while I'm not sure I could write one correctly without several tries in XSLT) — and once you do express them, they're buried under an impenetrable mound of XML boilerplate that makes them hard to maintain or understand later.
Full disclosure: My last exposure to XSLT was a long, long time ago and I've been carefully avoiding it ever since.
In any case, we know that algebraists who program also use Maple, Magma, Mathematica, R, and Sage, or even straight Python, C, etc. FP languages are a minority even in the professional mathematics world.
Anyway, one of the PMs on the project insisted that this choice little hunk'O'hell communicate with the outside world -- at about 2K bits / sec on a good day -- using XML. Because it was standard. Because XML added to anything makes it better, no matter what it is. Because, well, nobody ever saw that traffic except other computers, but XML!
I wanted to kill, kill, kill, but instead I wrote an XML parser (kind of) that fit into about 1K of code. "Don't use entities or CDATA or namespaces," I said, and went away to a better place. I think the PM was happy. That group got sold to a company famous for its 60s-era radios and crappy teevees, and I assume everyone is happy there, because I have not heard a word from them, and XML!
One space that XSLT can demonstrate its strength is when transforming some horribly serialized interoperability data structure. If the system from which you are receiving data, produces terse XML, you aren't going to solve anything by rewriting the upstream system to produce equally lousy JSON. If you don't have the ability to fix the upstream service to produce better structured data, XSLT and XPath are wonderful tools to morph it into something more manageable. That transformation process is better written with XSLTs than it is in trying to do the same thing by slurping the data directly into some business object and trying then to work with a bad model. Don't go down the path of "garbage out, garbage in."
If you have access to both sides of the process, it might be worth rewriting the upstream system, but when working with a legacy system XSLT might be the best glue technology in your arsenal.
Assuming it was a simple transformation, the python parsing code could be under 10 lines. Most of what you wrote would be templating. It would be 98% declarative.
If it got complicated though (e.g. you're doing some aggregation or something), the Python bit would grow, but it would probably never end up looking that horrendous, unlike XSLT.
The same pattern could be applied to many other language ecosystems. You just need to make sure you get the best XML parsing library and the best templating language.
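A minimal sketch of that parse-then-template split, stdlib only (the input document and output shape are invented for illustration):

```python
import xml.etree.ElementTree as ET
from string import Template

xml_doc = "<people><person><name>Ada</name></person><person><name>Bob</name></person></people>"
row = Template("<li>$name</li>")  # the "templating language" slot

root = ET.fromstring(xml_doc)     # the "XML parsing library" slot
items = "".join(row.substitute(name=p.findtext("name")) for p in root.iter("person"))
html = "<ul>" + items + "</ul>"
print(html)  # <ul><li>Ada</li><li>Bob</li></ul>
```

In a real project you would swap in lxml and Jinja2 (or whatever your ecosystem's best options are); the shape of the code stays the same.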
var processor = new XSLTProcessor();
processor.importStylesheet(xsl);
var result = processor.transformToFragment(....);
It's a bit of a pointless example because it really depends on the transformations you need. I'm sure in some cases XSLT would be better for the job, and in other cases another language. Most of the time it would generally just depend on your environment, available tools and skillset.

import xml.etree.ElementTree as ET
tree = ET.parse('data.xml')
new_tree = my_transform(tree)
new_tree.write("output.xml")

> “XML combines the efficiency of text files with the readability of binary files”
One dirty thing about XML is that it appears human readable, but it is not human writable. You'll see something in the XML that you think you can change, but now it doesn't validate anymore after you change it using a text editor. You need an XML editor that understands XML validation to make edits. If I cannot use a basic text editor, it's not basic text. If it's not basic text, it's no better than any other binary protocol, albeit a very inefficient one.
I wonder how s/he handles that pesky ';'
If the alternative look like a (t/n)roff file, I'd gladly take XML, though.
"New tool - cool! Let's use it on EVERY problem."
And are thus misused.

XSLT exists for inter-organizational data transfer and transformation. Don't use it for any other situation.
XML is a good (not perfect) persistent data storage mechanism where you need the data to outlast the program that created it.
I go into more explanation here: http://sworddance.com/blog/2014/12/06/xml-is-not-bad-just-mi...
Let's not blame a tool that was misused.
Don't just downvote me. Challenge me!
Edit: Maybe downvote & challenge me :-) ?
Unfortunately there were also a bunch of bad points that never got fixed. Breaking changes for plugins exhausted the small contributor community. I think the project is basically dead at this point and I've moved to another CMS for small sites. (Keystone.js)
All in all, if I were to rewrite symphonycms now, I would drop XSLT in favour of Jade or something less annoying to write in.
EDIT: I've been browsing the symphonycms repo after writing this and it's untrue to say the project is dead, since Brendo is still actively committing.
Turns out it is not. Things can fail, really badly even, and they do fail, really often, even when there are companies with big pockets standing behind it, and then good luck debugging the mess. This is true even for really proven technologies, but might be less obvious with those.
At some point, when you are good at some technology, even if it's really popular and mainstream, be it some big SQL database (yes, all of PG, SQLite and MySQL, even though I love some of those), C, C++, Java, C# or Python/Ruby/Perl/Node.js, you will constantly run up against the implementation of the underlying technology.
I am not saying XSLT or any of the above don't have their use cases, but rather that a lot of them are over-engineered. Using most of these technologies, I know there are issues; I send bug reports and patches and, hey, things get fixed really quickly. That's all good, and you can never fully avoid these things, but the simpler you keep it, the less there is to end up biting you: causing you to start out with, let's say, your SASS-to-CSS compiler having an issue, then digging deeper through every library until you find a GCC bug or whatever. Such things happen, more often than one would think. So based on developer pay and the issues it causes (often being a blocker for a whole team), that's actually a really big problem.
And increasing the probability that you reinvent the wheel, badly. I have been using libxml and libxslt for years and as far as I remember I never encountered a bug. Both projects have been developed for years and are used by a gazillion other projects.
It is many times more likely that you will be bitten by a bug in your own custom configuration file parser than e.g. in libxml2.
I am not arguing for or against XML, but for code reuse. Simplicity also means not reinventing the wheel and keeping your own projects simple by leveraging existing, good libraries.