So then it's JSON, and I'll treat it as any other JSON: a document that is either an object or an array, that can include other objects or arrays, as well as numbers and strings. Property names don't matter, nor does the order of properties or array items, or whatever values are contained therein.
Please don't try to overload media types like this. Atom isn't served as `application/xml` precisely because it isn't XML; it's served as `application/atom+xml`. For a media type that is JSON-like but isn't JSON, you may wish to look at `application/hal+json`; incidentally there's also `application/hal+xml` for the XML variant.
Or as someone else rightly suggested, consider just using JSON-LD.
"I am a valid JSON document. So is the Number below, and in fact every line below this line."
4
null
Now it's also true that JSON doesn't define an entity that can be an object or an array but not a string, a bool, a number, or null. So it's kind of true that JSON never says that objects and arrays are the only valid root elements.
But JSON also says "JSON is built on two structures" - arrays and objects. It defines those two structures in terms of 'JSON values'. Still, it's a reasonable way to read the JSON spec to say that it defines a concept of a 'JSON structure' - an array or object, but not a plain value - and then to assume that a .json file contains a JSON 'structure'.
Basically... JSON's just not as well defined a standard as you might hope.
edit: And now I'm going to well actually myself: Turns out https://tools.ietf.org/html/rfc4627 defines a thing called a 'JSON text' which is an array or an object, and says that a 'JSON text' is what a JSON parser should be expected to parse.
So - pick a standard.
Alas, if only that were true.
RFC 4627:
> A JSON text is a serialized object or array. The MIME media type for JSON text is application/json.
RFC 7159:
> A JSON text is a serialized value. Note that certain previous specifications of JSON constrained a JSON text to be an object or an array. Implementations that generate only objects or arrays where a JSON text is called for will be interoperable in the sense that all implementations will accept these as conforming JSON texts.
IIRC, Ruby's JSON parser was written to be strictly RFC 4627 compliant, and yields a parser error for non-array non-object texts.
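For what it's worth, most modern parsers follow the RFC 7159 reading; Python's stdlib, for example, happily accepts scalar roots:

```python
import json

# Python's stdlib follows the RFC 7159/8259 reading: any JSON value is a
# valid top-level "JSON text", not just an object or array.
print(json.loads('4'))        # 4
print(json.loads('null'))     # None
print(json.loads('"hello"'))  # hello

# A strict RFC 4627 parser (as Ruby's reportedly was) would reject all
# three of these inputs with a parse error.
```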
JSON isn't versioned, so no one has any idea what "JSON" really means, or what "standard" is being followed.
Someone filed an issue and created a pull-request for this after you wrote this comment.
https://github.com/brentsimmons/JSONFeed/issues/22
https://github.com/brentsimmons/JSONFeed/pull/23
I hope they will merge it.
This is the same difference in schools that I express in a different comment [3] in this thread.
[1] https://tools.ietf.org/html/rfc6839 [2] https://tools.ietf.org/html/rfc3023#appendix-A [3] https://news.ycombinator.com/item?id=14361842
But if it were using XHTML, then the proper mime type would be application/xhtml+xml.
So the server could support all of the following:
application/jsonfeed
application/rss+xml
application/atom+xml
Who knows, maybe RSS and ATOM could be represented in JSON and have the following mime types:
application/rss+json
application/atom+json
If it's just an API response, and it is your API for an application called Widget Factory, then you can, if you want, have your own format:
application/vnd.widgetfactory+json
Generally, defining such a MIME type should come with some specification describing it; otherwise no one can reliably implement a compatible client. JSON Feed has proposed that specification.
Not sure why you want to emulate XML namespaces in JSON, but JSON schemas can include other JSON schemas and extend upon other JSON schemas. That accounts for 99.9% of the use cases for namespaces.
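For reference, that inclusion/extension is usually spelled with `$ref` and `allOf`; a hypothetical sketch (both URLs are made up):

```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "allOf": [
    { "$ref": "https://example.org/schemas/base-feed.json" }
  ],
  "properties": {
    "my_extension": { "type": "string" }
  }
}
```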
If this industry has a problem, it's FDD - Fad Driven Development and IIICIS (If It Isn't Cool, It Sucks) thinking.
I think with something like feeds there's the possible benefit of becoming a 'hello world' for frameworks. Many frameworks have you write a simple blogging engine or twitter copycat. I don't think I've ever seen that for a feed reader/publisher. People have said that Twitter clients were an interesting playground for new UI concepts and paradigms because the basics were so simple (back when their API keys were less restrictive). Maybe this could be that?
Maybe it's just that I work mostly with JVM languages (Java, Groovy, etc.) but I haven't had any problems with handling XML - including Atom - in years. But I admit that other platforms might not have the same degree of support.
Otherwise, I agree with the "if it ain't broke" principle. There's also cases where so much ad hoc complexity is built on top of JSON that you end up with the same problems XML has, except with less battle-tested implementations.
xmllint --xpath '//element/@attribute'
There's a good chance it's already installed on your Mac.

I agree that jq is really nice though. In particular, I still find JSON nicer than XML in the small-scale (e.g. scripts for transforming ATOM feeds) because:
- No DTDs means no unexpected network access or I/O failures during parsing
- No namespaces means names are WYSIWYG (no implicit prefixes which may/may not be needed, depending on the document)
- All text is in strings, rather than 'in between' elements
- No redundant element/attribute distinction
Even with tooling, these annoyances with XML leak through. As an example, xmlstarlet can find the authors in an ATOM file using an XPath query like '//author'; except if the document contains a default namespace, in which case it'll return no results since that XPath isn't namespaced.
This sort of silently-failing, document-dependent behaviour is really frustrating; requiring two branches (one for documents with a default-namespace, one for documents without) and text-based bash hackery to look for and dig out any default namespace prior to calling xmlstarlet :(
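The same silent failure is reproducible with Python's stdlib ElementTree, which also shows the `{*}` wildcard (Python 3.8+) that avoids the two-branch workaround:

```python
import xml.etree.ElementTree as ET

atom = '''<feed xmlns="http://www.w3.org/2005/Atom">
  <author><name>Alice</name></author>
</feed>'''

root = ET.fromstring(atom)

# The naive query silently finds nothing: the default namespace makes the
# element's real name '{http://www.w3.org/2005/Atom}author'.
print(root.findall('.//author'))          # []

# The {*} namespace wildcard (Python 3.8+) matches regardless of the
# document's default namespace, so one query handles both cases.
print(len(root.findall('.//{*}author')))  # 1
```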
Basically, XML is to JSON as SOAP is to REST. It had its day, though it's obviously still useful, but we have better tools now. Frankly, I'm surprised we haven't seen a proposal like this sooner.
That's true. Both XML and SOAP are well defined, and well structured.
JSON and REST are both marginally defined, and thus we see constant incompatible/incomplete implementations, or weird hacks to overcome the shortcomings.
> we have better tools now
I think "the cool kids are cargo-culting something newer now" is probably more accurate.
And the other part of me is not with you - manipulating XML is not as easy as JSON in most of my development work, and sometimes I even need to write something by hand, for which JSON is much more convenient. Tons of other formats are more human-friendly than JSON - TOML, for example - but they don't have the status JSON has. So I guess JSON is the natural choice under the current state of web development.
> JSON is simpler to read and write, and it’s less prone to bugs.
* Badly formed XML? Check. There might be badly formed JSON, but I tend to think it'll be a lot less likely.
* Need to continually poll servers for updates? Miss. Without additions to enable pubsub, or dynamic queries, clients are forced to use HTTP headers to check last updates, then do a delta on the entire feed if there is new or updated content. Also, if you missed 10 updates, and the feed only contains the last 5 items, then you lose information. This is the nature of a document-centric feed meant to be served as a static file. But it's 2017 now, and it's incredibly rare that a feed isn't created dynamically. A new feed spec should incorporate that reality.
* Complete understanding of modern content types besides blog posts? Miss. The last time I went through a huge list of feeds for testing, I found there were over 50 commonly used namespaces and over 300 unique fields used. RSS is used for everything from search results to Twitter posts to Podcasts... It's hard to describe all the different forms of data it can contain. The reason is that the original RSS spec was so minimal (there are like 5 required fields) that everything else has just been bolted on. JSONFeed makes this same mistake.
* An understanding that separate but equal isn't equal. Miss. The thing that http://activitystrea.ms got right was the realization that copying content into a feed just ends up diluting the original content formatting, so instead it just contains metadata and points to the original source URL rather than trying to contain it. If JSONFeed wanted to really create a successor to RSS, it would spec out how to send formatting information along with the data. It's not impossible - look at what Google did with AMP: They specified a subset of formatting options so that each article can still contain a unique design, but limited the options to increase efficiency and limit bugs/chaos.
This stuff is just off the top of my head. If you're going to make a new feed format in 2017, I'm sorry but copying what came before it and throwing it into JSON just isn't enough.
The real challenge these days is to replicate the solutions Facebook and Twitter brought to feeds (bidirectionality and data-retention in particular) in a decentralised manner that could actually become popular. Simply replicating RSS in the data-format du jour is not going to achieve that.
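The "use HTTP headers to check last updates" flow mentioned above is plain HTTP conditional GET; a minimal sketch (the feed URL is hypothetical):

```python
import urllib.request

# A conditional GET reuses validators (ETag / Last-Modified) from the
# previous response, so an unchanged feed is never re-downloaded.
def conditional_request(url, etag=None, last_modified=None):
    req = urllib.request.Request(url)
    if etag:
        req.add_header('If-None-Match', etag)
    if last_modified:
        req.add_header('If-Modified-Since', last_modified)
    # urllib.request.urlopen(req) raises an HTTPError with code 304
    # when the resource is unchanged.
    return req

req = conditional_request('https://example.org/feed.json', etag='"abc123"')
# Note: urllib stores header names capitalized internally.
print(req.get_header('If-none-match'))  # "abc123"
```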
This is backwards, imo. The advantage of polling over pub sub is that all complexity is offloaded to the client. This comes with its own set of problems (inefficiency of reinventing the wheel across all clients, plus every client will implement that complexity differently resulting in countless bugs), but this is what drives adoption, which as someone else here has pointed out is all that matters. If you want adoption, you seemingly need to sacrifice a lot of efficiency in favour of making it stupidly easy to publish.
The "it's 2017 now" argument doesn't really address that even with dynamically generated content, you still need every dynamic serverside platform to adopt and implement your spec independently. Static is always easier. (plus with the recent trend towards static sites, "it's 2017 now" actually has the opposite implication).
It's a shame that ActivityStrea.ms hasn't had more uptake. We've added support in our enterprise social network product and think it enables some cool scenarios. But unfortunately too few other products support it these days.
The point of these syndication formats (RSS, Atom, now this) was always to act as the "I'm a static site and webhooks don't exist, so poll me" equivalent of webhooks. These "pretending to be webooks" were supposed to hook into a whole ecosystem of syndication middleware that turned the feeds into things like emails.
And that—the output-products of the middleware—was what people were supposed to consume, and what sites were meant to offer people to consume. The feed, as originally designed, was not intended for client consumption. That's why the whole model we have today, where millions of "feed-reader" clients poll these little websites that could never stand up to that load, seems so silly: it wasn't supposed to be the model. RSS feeds were supposed to be a way for static-ish content to "talk to" servers that would do the syndicating for them; not a format for clients to receive notifications in.
(And we already had a format for clients to receive notifications in: MIME email. There's no reason you can't add another MIME format beyond text/plain and text/html; and there's no reason you can't create an IMAP "feed-reader" that just filters your inbox to display only the messages containing application/rss+xml representations, and set up your regular inbox to filter out those same messages. And some messages would contain both representations, so you'd see e.g. newsletters as both text in your email client and as links in your feed client, and archiving them in one would do the same in the other, since they're the same message.)
---
The big problem I have with feeds (besides that people are using them wrong, as above) is that they have no "control-channel events" to notify a feed-consumer of something like e.g. the feed relocating to a new URL.
Right now, many feeds I follow just die, never adding a new feed item, and the reason for that is that, unbeknownst to me, the final item in the feed (that I never saw because it rotted away after 30 days, or because I "declared inbox zero" on my feeds, or whatever else) was a text post by the feed's author telling everyone to follow some new feed instead.
And other authors don't even bother with that; they use a blogging framework that generates RSS, but they're maybe not even aware that it does that for them, so instead they tell e.g. their Twitter followers, or Twitch subscribers, that they're moving to a new website, but their old website just sits there untouched forever-after, never receiving an update to point to the new site which would end up in the RSS feed my reader is subscribed to. It's nonsensical.
(And don't get me started on the fact that if you follow a Tumblr blog's RSS feed, and the blog author decides to rename their account, that not only kills the feed, but also causes all the permalinks to become invalid, rather than making them redirect... Tumblr isn't alone in this behavior, but Tumblr authors really like renaming their accounts, so you notice it a lot.)
There was also a typical Dave-Wineresque invention of replacing the old feed with some special, non-namespaced XML with the redirect: http://www.rssboard.org/redirect-rss-feed
But of course the real problem is social. As in people simply stop blogging or stop caring. And of course tool developers don't care if someone doesn't want to use their software anymore, and don't think of developing the right buttons for this edge case.
http://scripting.com/stories/2012/09/10/rssInJsonForReal.htm...
oh of course: https://xkcd.com/927/
(and I realize this doesn't exactly map, as JSON Feed isn't even trying to cover all the usecases of Atom or RSS, just switching the container format)
It's true that JSON is easier to deal with than XML. But that's relative; there are plenty of decent tools around RSS. From readers, to libraries in the most common programming languages, and extensions in the most common content management systems. JSON is slightly easier to read for humans (although that's subjective), but then how often do you need to read an RSS feed manually, unless you are the one writing those libraries, etc.? And that's a tiny share of all the people using RSS.
>>> It reflects the lessons learned from our years of work reading and publishing feeds.
Sounds like the author(s) has extensive experience in this field and knows things better than some random person on the internet (me). But the homepage of the project doesn't convey those learned lessons.
However, SGML and XML were invented as structured markup languages for authoring of rich text documents by humans, for which JSON is unsuited and sucks just as much as XML sucks for APIs.
Edit: though XML has its place in many b2b and business-to-government data exchanges (financial and tax reporting, medical data exchange, and many others) where a robust and capable up-front data format specification for complex data is required
(feed
(version https://jsonfeed.org/version/1)
(title "My Example Feed")
(home-page-url https://example.org)
(feed-url https://example.org/feed.json)
(items
(item (id 2)
(content-text "This is a second item.")
(url https://example.org/second-item))
(item (id 1)
(content-html "<p>Hello, world!</p>")
(url https://example.org/initial-post))))
This looks much nicer IMHO than their first example:

{
"version": "https://jsonfeed.org/version/1",
"title": "My Example Feed",
"home_page_url": "https://example.org/",
"feed_url": "https://example.org/feed.json",
"items": [
{
"id": "2",
"content_text": "This is a second item.",
"url": "https://example.org/second-item"
},
{
"id": "1",
"content_html": "<p>Hello, world!</p>",
"url": "https://example.org/initial-post"
}
]
}

https://github.com/edn-format/edn
Example:
https://github.com/milikicn/activity-stream-example/blob/4db...
Not S-expression-based, though.
version: https://jsonfeed.org/version/1
title: "My Example Feed"
home-page-url: https://example.org
feed-url: https://example.org/feed.json
items: [
[
id: 2
content-text: "This is a second item."
url: https://example.org/second-item
]
[
id: 1
content-html: "<p>Hello, world!</p>"
url: https://example.org/initial-post
]
]

If you really want to do a hash table, you could represent it as an alist:
(things
(key1 val1)
(key2 val2))
This all works because — whether using JSON, S-expressions or XML — ultimately you need something which can make sense of the parsed data structure. Even using JSON, nothing prevents a client submitting a feed with, say, a homepage URL of {"this": "was a mistake"}; just parsing it as JSON is insufficient to determine if it's valid. Likewise, an S-expression parser can render the example, but it still needs to be validated. One nice advantage of the S-expression example is that there's an obvious place to put all the validation, and an obvious way to turn the parsed S-expression into a valid object.

There is one pretty damn solid SSAX parser by Kiselyov that has been ported to just about every Scheme out there. It is interesting since it doesn't do the whole callback thing of most SAX parsers, but is implemented as a tree fold over the XML structure.
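The validation point above can be sketched in code: parsing (JSON here, but the same goes for a parsed S-expression) only proves the syntax, and the shape still has to be checked. Field names follow the JSON Feed example quoted in this thread; the validator itself is hypothetical.

```python
import json

def validate_feed(data):
    # Parsing succeeded; now check the shape before trusting it.
    if not isinstance(data, dict):
        raise ValueError('feed must be an object')
    if not isinstance(data.get('version'), str):
        raise ValueError('version must be a string')
    url = data.get('home_page_url')
    if url is not None and not isinstance(url, str):
        raise ValueError('home_page_url must be a string')
    return data

# Perfectly valid JSON, invalid feed -- the exact failure mode described above.
bad = json.loads('{"version": "1", "home_page_url": {"this": "was a mistake"}}')
try:
    validate_feed(bad)
except ValueError as e:
    print(e)  # home_page_url must be a string
```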
* In all cases (feed and items), the author field should be an array to allow for feeds with more than one author (for instance, a podcast might want to use this field for each of its hosts, or possibly even guests).
* external_url should probably be an array, too, in case you want to refer to multiple external resources about a specific topic, or in the case of a linkblog or podcast that discusses multiple topics, it could link to each subtopic.
* It might be nice if an item's ID could be enforced to a specific format, even if perhaps only within a single feed. Otherwise it's hard to know how to interpret posts with IDs like "potato", 1, null, "http://cheez.burger/arghlebarghle"
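One defensive option for readers (not something the spec mandates) is to normalize whatever appears in the id field down to a string key; a hypothetical sketch:

```python
# Normalize loosely-typed item ids so "potato", 1 and
# "http://cheez.burger/arghlebarghle" can at least be compared uniformly.
def normalize_id(raw):
    if raw is None:
        raise ValueError('item has no usable id')
    if isinstance(raw, bool):  # bool is an int subclass; reject explicitly
        raise ValueError('boolean ids are ambiguous')
    if isinstance(raw, (int, float, str)):
        return str(raw)
    raise ValueError(f'unsupported id type: {type(raw).__name__}')

print(normalize_id('potato'))  # potato
print(normalize_id(1))         # 1
```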
I'm going to pretend this is about music artists in a music library, but the logic is exactly the same for podcast hosts:
You tend to want fields like this to be singular, so that the field can be used in collation (i.e. "sort by artist.")
If you have multiple artists for a track, usually one can be designated the "primary" artist—the one that people best know, and would expect to find the track listed under when looking through their library. Usually, then, the rest get tacked on in the field in a freeform, maybe comma-and-space delimited fashion. The field isn't a strict strongly-typed references(Person) field, after all; it's just freeform text describing the authorship.
But as for hosts vs. guests, that's a whole can of worms. Look at the ID3 standard. Even though music library-management programs usually just surface an "Artist" field, you've actually got all of these separate (optional) fields embedded in each track:
• TCOM: Composer
• TEXT: Lyricist/Text writer
• TPE1: Lead performer(s)/Soloist(s)
• WOAR: Official artist/performer webpage
• TPE2: Band/orchestra/accompaniment
• TPE3: Conductor/performer refinement
• TPE4: Interpreted, remixed, or otherwise modified by
• TENC: Encoded by
• WOAS: Official audio source webpage
• TCOP: Copyright message
• WPUB: Publishers official webpage
• TRSN: Internet radio station name
• TRSO: Internet radio station owner
• WORS: Official internet radio station homepage
That gives you separate credits for pretty much the entire composition, production and distribution flow, which usually means that each field only needs one entry.
Would be great if people used them, wouldn't it? Maybe the semi-standard "A feat. B (C remix)" microformat could be parsed into "[TPE2] feat. [TPE1] ([TPE4])"...
Also, despite the fact this is technically not the responsibility of the spec itself, I would strongly suggest some words on the implications of the fact that the HTML fields are indeed HTML and the wisdom of passing them through some sort of HTML filter before displaying them.
In fact that's also part of why I suggest going ahead and letting titles contain HTML. All HTML is going to need to be filtered anyhow, and it's OK for clients to filter titles to a smaller valid tag list, or even filter out all tags. Suggesting (but not mandating) a very basic list of tags for that field might be a good compromise.
I agree that I see HTML in RSS titles, but I'd rather have the occasional garbled title - which the author can fix by stripping out HTML before it hits the RSS - than have every RSS reader opening up new security holes.
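As a sketch of the kind of filtering meant above, assuming a tiny hand-picked tag whitelist; a real reader should reach for a battle-tested sanitizer rather than this:

```python
from html import escape
from html.parser import HTMLParser

# Sketch only: keep a small whitelist, drop all attributes (which removes
# onclick=... and javascript: href vectors along with legitimate hrefs),
# and drop the contents of script/style entirely.
ALLOWED = {'p', 'em', 'strong', 'ul', 'ol', 'li', 'code', 'pre', 'blockquote'}
DROP = {'script', 'style'}

class TagFilter(HTMLParser):
    def __init__(self):
        super().__init__()
        self.out = []
        self.skip = 0  # >0 while inside a dropped element

    def handle_starttag(self, tag, attrs):
        if tag in DROP:
            self.skip += 1
        elif tag in ALLOWED and not self.skip:
            self.out.append(f'<{tag}>')

    def handle_endtag(self, tag):
        if tag in DROP:
            self.skip = max(0, self.skip - 1)
        elif tag in ALLOWED and not self.skip:
            self.out.append(f'</{tag}>')

    def handle_data(self, data):
        if not self.skip:
            self.out.append(escape(data))  # re-escape text content

def sanitize(markup):
    f = TagFilter()
    f.feed(markup)
    f.close()
    return ''.join(f.out)

print(sanitize('<p onclick="evil()">Hi <script>alert(1)</script>there</p>'))
# <p>Hi there</p>
```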
Wow. Now that's confidence. Have you ever read the first version of a spec and thought, "That's just perfect. Any additional changes would just be a disappointment compared with the original"?
But MIDI doesn't really fit that description, since it builds on two years of work by Roland. It's my best bet, though.
As far as scenarios where it's feasible to get the answer right the first time go, this is a reasonably realistic one.
EDIT: Also, if you scroll to the bottom of the page you can see they have let a whole bunch of people look at the spec before releasing it, so there has been at least some peer review.
Less prone to bugs? How's that?
Parsing JSON is a minefield.
Yellow and light blue boxes highlight the worst situations for applications using the specified parser. Take a look at how a bunch of parsers perform with various payloads: http://seriot.ch/json/pruned_results.png
"JSON is the de facto standard when it comes to (un)serialising and exchanging data in web and mobile programming. But how well do you really know JSON? We'll read the specifications and write test cases together. We'll test common JSON libraries against our test cases. I'll show that JSON is not the easy, idealised format as many do believe. Indeed, I did not find two libraries that exhibit the very same behaviour. Moreover, I found that edge cases and maliciously crafted payloads can cause bugs, crashes and denial of services, mainly because JSON libraries rely on specifications that have evolved over time and that left many details loosely specified or not specified at all."
More details available at: http://seriot.ch/parsing_json.php
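Two of the divergences that work covers can be reproduced with Python's stdlib parser alone:

```python
import json

# 1. NaN and Infinity are not valid JSON per any of the RFCs, yet
#    json.loads accepts them by default (pass parse_constant to reject).
print(json.loads('[NaN, Infinity]'))   # [nan, inf]

# 2. Duplicate keys are left unspecified by the RFCs; Python silently
#    keeps the last occurrence, while other parsers keep the first,
#    keep both, or reject the document outright.
print(json.loads('{"a": 1, "a": 2}'))  # {'a': 2}
```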
No it doesn't. XML is either well-formed or not, and any parser encountering non-well-formed XML will reject it outright.
Therefore all XML in use on the internet is spec-compliant.
Now try to say the same about JSON.
One example of a bug that often festered in XML parsers: https://en.wikipedia.org/wiki/Billion_laughs (there is no JSON equivalent of this)
The generalized theory, for those interested : https://en.wikipedia.org/wiki/Rule_of_least_power
> simpler to read and write
I've written a reasonably-popular podcast feed validator, and I don't understand either of these criticisms. Mind elaborating?
So I'm hoping for JSON-LD Feed 1.1 and a new war of format battles. Maybe we can even get Mark Pilgrim out of hiding!
More seriously, it's sad so to see that almost 20 years later, the dream of a decentralised and bidirectional web is in even worse shape than it was back then.
EDIT: Because I get downvoted despite stating my opinion on the topic, I adjusted the statement.
Example below filters out all URLs for a specific section of the paper.
test $# = 1 || exec echo usage: $0 section
curl -o 1.json https://static01.nyt.com/services/json/sectionfronts/$1/index.jsonp
exec sed '/\"guid\" :/!d;s/\",//;s/.*\"//' 1.json
I guess SpiderBytes could be used for older articles?

Personally, I think a protocol like netstrings/bencode is better than JSON because it better respects the memory resources of the user's computer.
Every proposed protocol will have tradeoffs.
To me, RAM is sacred. I can "parse" netstrings in one pass but I have been unable to do this with a state machine for JSON. I have to arbitrarily limit the number of states or risk a crash. As easy as it is to exhaust a user's available RAM with Javascript so too can this be done with JSON. Indeed they go well together.
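The one-pass property described above is easy to see in code: a netstring's length prefix arrives before its payload, so memory use can be capped before a single payload byte is read, whereas JSON's nesting depth and string sizes are unknown until the parser hits them. A minimal sketch (the 1 MiB cap is an arbitrary choice for illustration):

```python
MAX_LEN = 1 << 20  # refuse any payload over 1 MiB, decided up front

def parse_netstring(buf, pos=0):
    """Parse one netstring (e.g. b'5:hello,') starting at pos.
    Returns (payload, position after the trailing comma)."""
    colon = buf.index(b':', pos)
    length = int(buf[pos:colon])
    if length > MAX_LEN:
        raise ValueError('netstring payload too large')
    start, end = colon + 1, colon + 1 + length
    if buf[end:end + 1] != b',':
        raise ValueError('missing trailing comma')
    return buf[start:end], end + 1

print(parse_netstring(b'5:hello,'))  # (b'hello', 8)
```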
I'm currently creating an API where I'm asking devs to post JSON rather than a bunch of separate parameters, but I haven't seen this done in other APIs (if you have, can you point me to a few examples?). I'm curious what others thoughts are on this. It seems that with GraphQl, we're maybe starting to move in this direction.
I think that images and URLs would do well as ordered lists rather than as individual values. At the top level you have 3 URLs and an array for hubs. With type and url you could have an array for the hubs and the URLs; the same could be done for images, both at the top level and at the item level.
But even more frustrating is when a format comes out that's close to being a faithful translation of an established format, but makes small, incompatible changes that push the burden of faithful translation onto content authors, or the makers of third-party libraries.
I honestly don't intend to offer harsh targeted critique against the authors -- I assume good faith; more just voicing exasperation. There have been similar attempts over the years -- one from Dave Winer, the creator of RSS 0.92 and RSS 2.0, called RSS.js [1], which stoked some interest at first [2]; others by devs working in isolation without seeming access to a search engine and completely unaware of prior art; some who are just trying something unrelated and accidentally produce something usable [3]; finally, this question pops up from time to time on forums where people with an interest in this subject tend to congregate [4]. Meanwhile, real standards-bodies are off doing stuff that reframes the problem entirely [5] -- which seems out-of-touch at first, but I'd argue provides a better approach than similar-but-not-entirely-compatible riff on something really old.
And as a meta, "people who use JSON-based formats", as a loose aggregate, have a serious and latent disagreement about whether data should have a schema or even a formal spec. In the beginning when people first started using JSON instead of XML, it was done in a schemaless way, and making sense of it was strictly best-effort on part of the receiving party. Then a movement appeared to bring schemas to JSON, which went against the original reason for using JSON in the first place, and now we're stuck with the two camps playing in the same sandbox whose views, use-cases, and goals are contradictory. This appears to be a "classic" loose JSON format, not a strictly-schemad JSON format, not even bothering to declare its own mediatype. This invites criticism from the other camp, yet the authors are clearly not playing in that arena. What's the long-term solution here?
[1] http://scripting.com/stories/2012/09/10/rssInJsonForReal.htm... [2] https://core.trac.wordpress.org/ticket/25639 [3] http://www.giantflyingsaucer.com/blog/?p=3521 [4] https://groups.google.com/forum/#!topic/restful-json/gkaZl3A... [5] https://www.w3.org/TR/activitystreams-core/
It should just be size and duration, or size_bytes and duration_seconds (though adding units only makes sense if you could use other units). Adding _in to the mix is strange.
Implemented: http://sprout.rupy.se/feed?json
Or asked another way, what problem does this solve for you?
While not hard evidence, I think it's indicative of the kind of experience a developer has when they choose to engage with syndication.
I don't understand why suddenly people treat this like something that uniquely solves a problem. Maybe I'm missing something?
String encoded blog posts are going to be painful once people start using the `content_html` part of the spec.
That said, they're being responsive to questions in Issues, so I remain optimistic.
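The escaping pain mentioned above is easy to demonstrate with Python's stdlib encoder: every quote and newline inside the HTML gets backslash-escaped in the JSON string.

```python
import json

content_html = '<p class="intro">Hello,\n"world"</p>'
encoded = json.dumps({'content_html': content_html})
print(encoded)
# {"content_html": "<p class=\"intro\">Hello,\n\"world\"</p>"}

# Round-tripping restores the original markup exactly:
assert json.loads(encoded)['content_html'] == content_html
```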