- the content-type of the page is "text/html", so the browser is trying to render html
- there is no special meaning of the #render key to the browser (again the browser doesn't know this is json)
- browsers are very fault tolerant so they'll just skip over your document till they find pieces of html
- the JS made by the author is the part that parses the json as json, and it uses the #render key as its metadata section
You can try opening a file like this to test for yourself to get a sense of just the browser-parsing part, test.html:
{
"test": "hello",
"myHtml": "<html><meta charset=utf-8><h1>Hello World</h1></html>"
}
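A file like this can also be generated instead of hand-written. Here is a minimal sketch of that idea; the helper name `buildPolyglot` is made up for illustration and is not part of the original project. It puts the HTML string in as the last key and checks that the result still parses as JSON.

```javascript
// Hypothetical helper: serialize data with an HTML "tail" as the last key.
function buildPolyglot(data, html, key = "myHtml") {
  // Spread first so the HTML key is inserted last (assumes `data` has no
  // integer-like keys, which JavaScript would reorder to the front).
  return JSON.stringify({ ...data, [key]: html }, null, 2);
}

const text = buildPolyglot(
  { test: "hello" },
  "<html><meta charset=utf-8><h1>Hello World</h1></html>"
);

// The same text is valid JSON for any JSON consumer...
const parsed = JSON.parse(text);
console.log(parsed.test);
// ...and, saved as test.html, the browser will find the markup inside the string.
console.log(text.includes("<h1>Hello World</h1>"));
```

Save the value of `text` as test.html and open it to see only the browser-parsing part, with no script involved.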
I do however get this warning in Firefox so perhaps this is pretty fragile:
> The character encoding declaration of the HTML document was not found when prescanning the first 1024 bytes of the file. When viewed in a differently-configured browser, this page will reload automatically. The encoding declaration needs to be moved to be within the first 1024 bytes of the file.
I'm guessing differently-configured means not in quirks mode? Does quirks mode just make the browser extra fault tolerant?
> I'm creating a blog platform using this concept.
Yeah, please don’t; this is an abomination that’s fun for demonstrating and teaching how these things work, but should absolutely never be used in reality.
You could use this as your data model foundation upon which to build a generator, but you should never under any circumstances actually serve this stuff.
# Markdown header
## Subheader
### Section header
1. Numbered
1. List
- Unordered
- List
[//]: # (<html><body></body><script src="https://cdn.jsdelivr.net/npm/marked/marked.min.js"></script><script>var doc = document.children[0].textContent.split('\n'); md = doc.slice(0, doc.length - 1).join("\n"); document.body.innerHTML = marked(md);</script></html><!--)
That last line has varying degrees of invisibility in different markdown viewers I looked at.
Obviously there's optimization that could be had here but this is equally hacky IMO and simpler since you can just write MD instead of JSON. You lose a couple of key features though -- templated components for example. However, I imagine you could shoehorn those in without much effort.
This has the advantage that without JS enabled you'll get poorly formatted markdown in the browser.
data:text/html;charset=utf-8,%7B%20%22foo%22%3A%20%22bar%22%2C%20%22baz%22%3A%20%5B%20%22qux%22%20%5D%2C%20%22%23render%22%3A%20%22%3Chtml%3E%3Cbody%3E%3Cdiv%20id%3Dcontent%3E%3Ch1%3EMy%20fancy%20document%3C%2Fh1%3E%3Cp%3EThis%20is%20a%20completely%20%26quot%3Bnormal%26quot%3B%20HTML%20document.%3C%2Fp%3E%3C%2Fdiv%3E%3Cstyle%3Ebody%20%7B%20visibility%3A%20hidden%3B%20%7D%20%23content%20%7B%20visibility%3A%20initial%3B%20position%3A%20absolute%3B%20top%3A%200%3B%20%7D%3C%2Fstyle%3E%3C%2Fbody%3E%3C%2Fhtml%3E%22%20%7D
You can paste this directly into your URL bar, and view source to see the valid JSON document.
An oft-unheard voice of wisdom and sane engineering practices speaks. Hark, ye mortals.
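A data: URL like the one above does not have to be percent-encoded by hand. This is a rough sketch of how one could be produced; `toDataUrl` is a hypothetical helper and the object is a shortened stand-in for the real document.

```javascript
// Hypothetical helper: turn a JSON-serializable object into a data: URL
// that the browser will treat as text/html.
function toDataUrl(obj) {
  const json = JSON.stringify(obj);
  // encodeURIComponent percent-encodes braces, quotes, spaces and angle brackets.
  return "data:text/html;charset=utf-8," + encodeURIComponent(json);
}

const url = toDataUrl({
  foo: "bar",
  "#render": "<html><body><h1>My fancy document</h1></body></html>",
});
console.log(url);

// Decoding the part after the first comma gives back the JSON text.
const payload = decodeURIComponent(url.slice(url.indexOf(",") + 1));
console.log(JSON.parse(payload).foo);
```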
I have JS disabled by default. Until I read your comment I was just confused what was going on.
data:text/html,{"data":"Hello.","whatever":"<script>onload=function(){document.body.innerHTML='<h1>'+JSON.parse(document.body.innerHTML).data+'</h1>'}</script>"}
I put the meta tag to render some special characters without relying on the server, but an HTTP header would be a better option (Content-Type: text/html; charset=utf-8).
And you nailed the process.
For the third one, there's no "skipping over" involved. The <html> and <body> opening tags are optional in HTML. The browser sees some non-whitespace text (the opening '{') not in the context of any other tag, auto-opens those tags, and the text starts being parsed as body text.
If the page loaded slowly enough, over multiple packets that arrived separated in time, you would see the JSON text coming in and rendering as text, with all the curlies and all, until the browser gets to the <html hidden> part of the JSON. At that point, some common-error fixup dating back to the Netscape/IE 3 days kicks in: when you see an <html> or <body> tag and one has already been opened (due to some stray content at the beginning of the file, likely), you don't open a new one, but copy the attributes to the existing one. This copies the "hidden" attribute to the <html>, which hides it. After that either the script executes and does its work, per your fourth bullet point, or script is disabled and the style rule inside <noscript> unhides the <html> element, but at that point you will of course just see the JSON text, parsed as HTML.
> I do however get this warning in Firefox so perhaps this is pretty fragile:
I'm not sure whether the site changed since then, but now it's sending `charset=utf-8` in the content-type header, so there's no meta prescan at all. Which is what allows the emoji after "Need Help?" to be decoded correctly.
Were you seeing the meta prescan warning on your local test file, or on the site itself?
> I'm guessing differently-configured means not in quirks mode?
No, it means things like preferences about how to handle pages without encoding declarations via various heuristics (e.g. scanning byte value frequencies and guessing what character encoding is in use based on that).
Quirks mode is determined solely by the doctype of the page. This page has no doctype (since it starts with a '{'), so it's always in quirks mode. And what quirks mode does is enable a set of behaviors designed to not break content written more or less in the "before HTML 4.0" era. https://quirks.spec.whatwg.org/ should have a more or less exhaustive list of quirks mode behaviors browsers are expected to implement. Some of these are extra-fault-tolerance (e.g. the hashless hex color and unitless length quirks), while some are just replicating pre-CSS browser rendering behavior (like the line height calculation quirk, which a bunch of sliced-image stuff relies/relied on).
> The JSON must contain only pure information without any concern about design or markup.
Wellll.. I mean the json has markdown syntax in it. That's a lot nicer than html tags for markup, but it's still markup.
In case anyone reading hasn't seen it before, browsers have a thing called XSLT built into them that does something similar for XML documents. You serve up your data as XML, add a tag that points to an XSLT document, and the browser will use the transform on your document automatically. If the resulting output is renderable XHTML, it'll display it like a regular webpage.
That being said, I wouldn't recommend doing that, since XSLT is a programming language whose syntax is XML itself. Apparently web standards folks in the early 2000s thought the future was XML all the way down.
It's very cool, but also, very strange, if you come from a procedural language background.
Also, XSLT is really given power by being mixed with XPath (which is procedural).
I suspect that if I had known more about FP back when I learned it, I would have had an easier time. I wrote up a really long, painful post about XSLT, way back in the early 'oughts.
XPath is not procedural. It is entirely composed of expressions that compose and return results. The closest common language might be SQL.
For example, if you build a social network profile page using XSLT, alice.xml and bob.xml can reference the same profile.xsl stylesheet which converts the XML profile data to the rendered HTML page. Since the profile.xsl stylesheet can be cached, whenever a new profile is visited, only the profile data in XML needs to be transferred.
Note that the templating can also be used to save data when rendering a single page. Imagine a Twitter clone that displays a list of tweets. If, hypothetically, 80 characters of Tweet data plus 500 characters of XSL template, produce 250 characters of HTML markup, you only need 3 tweets to start saving data (250x3=750 > 500+3x80=740).
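The break-even point in that hypothetical can be worked out for any sizes. A small sketch, using the made-up numbers from the comment above (the function name is only illustrative):

```javascript
// Smallest number of items for which sending raw data plus one template
// is smaller than sending fully rendered HTML for every item.
function breakEvenItems(dataSize, templateSize, htmlSize) {
  const savedPerItem = htmlSize - dataSize;
  if (savedPerItem <= 0) return Infinity; // the template never pays for itself
  // Need n * htmlSize > templateSize + n * dataSize,
  // i.e. n > templateSize / savedPerItem.
  return Math.floor(templateSize / savedPerItem) + 1;
}

console.log(breakEvenItems(80, 500, 250)); // 3, since 3*250 = 750 > 500 + 3*80 = 740
```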
It should have been, but just like the masses rejected LISP for its parens, the masses rejected XML for the closing tag. We could have avoided PHP and the zoo of MVC frameworks.
External entities, namespaces within namespaces, CDATA, namespaces that confusingly look exactly like URLs, but aren't, parser vulnerabilities.
I doubt that. Javascript frameworks have their value inside a corporation - they enable replaceable cog worker programming. The result sucks, but hiring and replacing React programmers is easier.
I don't think they actually do anymore, not all of them for sure.
> web standards folks in the early 2000s thought the future was XML all the way down
Yes. What fools, to think that machine-produced content transmitted from machine to machine and displayed through other machines would be better handled and more accessible if formatted in machine-friendly ways that were still readable to humans...
But it wasn't to be, not just because XSLT was somewhat botched (way too hard, way too many enterprise-looking warts...) and probably insecure, but because people are too lazy to produce readable markup (who wouldn't get tired of reading <prst><artdd><!CDATA[[ all day long...) and to make sure tags appear only where they should and get closed properly. Add the inevitable touch-of-death of enterprise vendors ("let's use xml in such a way you'll only be able to deal with it with our tools!"), and the game was up.
Isn't markdown literally just HTML shortcuts for a limited set of common markup cases? IIRC, it's not uncommon to have to insert HTML tagging into markdown to get certain things to work.
Assuming it's a superset though, the part I was talking about above was the (markdown - html) part, where markdown adds less noisy syntax for bolding, headings, lists etc.
I think it was called "Where," or "There," or something.
The only advantage I see is that you can actually support markdown like you do in your page, since it's less verbose than HTML-tags and doesn't make the text unreadable if you read it as plain text.
But with HTML5 you can introduce arbitrary tag names to pretty much get the same with an XML-like structure instead of json. Just use <subtitle> <what> <description> and supply CSS for it. If you want to consume it with something else, swap out the JSON parser for an XML one, navigate to the body tag and from there on it's the same thing.
I mean it's a cool trick you came up with, but it doesn't seem worth the effort, and relying on quirks mode seems brittle.
<form method="POST" enctype="text/plain" action="http://example.com">
<input name='{"key1":"val1","params":{"input":"value","list":[]},"dummy":"' value='"}' hidden>
<button>Submit</button>
</form>
The important bit is to include a "dummy" key at the end of the JSON object, and an input value that closes the quotes and any open curly braces. That way the "=" character sent in the encoding of the form elements doesn't interfere with the meaningful JSON content.
There might be a clever way to get it to submit dynamic JSON that changes based on user input without JavaScript, but I haven't thought enough about it.
This technique is sometimes useful for CSRF attacks.
{
"key1": "val1",
"params": {
"input": "value",
"list": []
},
"dummy": "="
}
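To see why the "dummy" key soaks up the "=", it helps to simulate the request body. This sketch approximates the text/plain form encoding (each field becomes name=value followed by CRLF, with no escaping); `textPlainBody` is a hypothetical stand-in for what the browser does, not a real API.

```javascript
// Approximates how a browser serializes a form with enctype="text/plain":
// each field becomes name=value followed by CRLF, with no escaping.
function textPlainBody(fields) {
  return fields.map(([name, value]) => `${name}=${value}\r\n`).join("");
}

const body = textPlainBody([
  ['{"key1":"val1","params":{"input":"value","list":[]},"dummy":"', '"}'],
]);

// The "=" separator ends up inside the string value of "dummy";
// the trailing CRLF is plain whitespace to a JSON parser.
console.log(JSON.parse(body));
```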
If the JSON is just a hidden form value as you suggest, the request as a whole will not be treated as JSON data. Then invalid characters will (usually) be added to the request body by the browser, and the server will (probably) be unable to parse it, causing the request to fail. This is due to how forms are encoded for POST requests.
On the other hand, if you're wondering why anyone would ever do this, then I do not have a good answer for you :)
{
"content": [
["html", {},
[["head", {}, []],
["body", {},
[["h1", { "id": "main-header" }, [
"Welcome to my",
["span", { "class": "red-text" },
["PAGE"]]]],
["h2", { "id": "sub-header" }, [
"It's so cool.",
["br", {}, null],
"Don't you think?"]]]]]]]}
Example of a schema: https://github.com/ProseMirror/prosemirror-schema-basic/blob...
Example of a document converted to JSON: https://tiptap.dev/export
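For the nested-array representation shown above (tag, attributes, children), turning the structure back into HTML takes only a few lines. This is a sketch under my own assumptions, not code from any of the linked projects: a node is either a string or a three-element array, and children set to null marks a void element such as br.

```javascript
// Escape text and attribute values so data cannot inject markup.
function escapeHtml(s) {
  return String(s)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// A node is either a string or [tag, attributes, children];
// children === null marks a void element such as <br>.
function renderNode(node) {
  if (typeof node === "string") return escapeHtml(node);
  const [tag, attrs, children] = node;
  const attrText = Object.entries(attrs || {})
    .map(([k, v]) => ` ${k}="${escapeHtml(v)}"`)
    .join("");
  if (children === null) return `<${tag}${attrText}>`;
  return `<${tag}${attrText}>${(children || []).map(renderNode).join("")}</${tag}>`;
}

const doc = ["h2", { id: "sub-header" }, ["It's so cool.", ["br", {}, null], "Don't you think?"]];
console.log(renderNode(doc));
```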
[0] escherize.com/w/hiccup.space
[1] escherize.com/w/cljsfiddle
I didn't create the project mind you, it was some NPM package back when I was a fresh 'un who didn't have thoughts on using a billion subdependencies
Anyway, I was nearly broke and when I went to apply for the benefit, I was asked for my resume as a word doc
Clearly, you can imagine how this conversation went down but I had brought a PDF on a USB which I offered to print out instead.
The clerk refused to let me plug in my USB drive for fear I was going to "hack" her, and it took some convincing to get her to navigate to the HTML page so that it could be saved as a PDF with Chrome and printed.
I guess the lesson here is that if you expect to run out of funds, make sure you have your most essential documents stored via Microsoft Word?
I have my resume defined using this schema, which is handy because it's supported enough that every few years when I actually need to update it, I can find a website (such as https://resumake.io) where I can just copy in the JSON and get a nicely formatted PDF out of it.
Microsoft Word is fine too but I would still use something like PDF for export and sharing (still not sure how Microsoft Word solves the usb problem for you).
Generally though I like something text-like that I can revision control more easily/edit in vim. Maybe I’ll switch to something like pandoc going forward (then you can generate Word if they really want it).
If you had this experience again, there would likely be some other barrier.
Most of the big MVC frameworks offered this out of the box at some point, which made life easier before dedicated RESTful APIs became a thing.
This is a nice project, but it isn't valid HTML.
https://validator.w3.org/nu/?doc=https%3A%2F%2Fwebdatarender...
Any feedback is appreciated!
An example of the output is at http://trout.me.uk/perl/plx.html (that one's actually served over http, it's a demo, please don't blindly run it).
If you were writing a desktop application, you could be saving user data as this JSON-HTML quine (JHQ?), so it's automatically accessible on any device, but also still accessible to e.g. the jq utility.
Since any modification is liable to move the "_" field, you might address that by using an array envelope instead; [{"real": "data"}, "<html><footer>"]
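A minimal sketch of that array envelope, assuming element 0 holds the real data and element 1 the HTML tail (`wrap` and `unwrap` are hypothetical names):

```javascript
// Hypothetical envelope: element 0 is the real data, element 1 is the HTML tail.
function wrap(data, html) {
  return JSON.stringify([data, html]);
}

// Consumers only need the first element and can ignore the tail.
function unwrap(text) {
  const [data] = JSON.parse(text);
  return data;
}

const file = wrap({ real: "data" }, "<html><footer>");
console.log(file);         // [{"real":"data"},"<html><footer>"]
console.log(unwrap(file)); // { real: 'data' }
```

Because array positions are fixed, editing the data object cannot move the HTML away from the end of the file.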
For a production site, I think you ought to, server-side, check the Accept header and pre-render the conversion.
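The Accept check could look roughly like this. It is only a sketch of one possible policy (prefer pre-rendered HTML when the client ranks it above JSON, otherwise send JSON); the function names are invented and a real server would plug the result into its own response handling.

```javascript
// Parse an Accept header into [{ type, q }], highest preference first.
function parseAccept(header) {
  return header
    .split(",")
    .map((part) => {
      const [type, ...params] = part.trim().split(";").map((s) => s.trim());
      const qParam = params.find((p) => p.startsWith("q="));
      const q = qParam ? parseFloat(qParam.slice(2)) : 1;
      return { type: type.toLowerCase(), q: Number.isNaN(q) ? 0 : q };
    })
    .filter((entry) => entry.type && entry.q > 0)
    .sort((a, b) => b.q - a.q); // sort is stable, so ties keep header order
}

// Hypothetical policy: serve pre-rendered HTML only when the client prefers it over JSON.
function chooseRepresentation(acceptHeader) {
  for (const { type } of parseAccept(acceptHeader || "")) {
    if (type === "text/html" || type === "application/xhtml+xml") return "html";
    if (type === "application/json") return "json";
  }
  return "json"; // default for clients with no usable preference (e.g. only */*)
}

console.log(chooseRepresentation("text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"));
console.log(chooseRepresentation("application/json"));
```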
Another note: if I do Save Page As in Firefox, it saves the page as standard HTML, losing the JSON.
The page is rendered in quirks mode because the source omits the <!DOCTYPE html> at the beginning of the document so that it remains valid JSON.
Well, at least that's clever!
Also, send me a link, I would love to check it out!
> The JSON must contain only pure information without any concern about design or markup. All the design and markup required to render the page must be inside the rendering script.
The source code has markup in the form of Markdown, e.g. ## or * text
For example, the basic render turns "# Title" into an h1 header, but if you created the render yourself, you know the internal schema of the JSON, so you can use the correct HTML element for that without resorting to this hack.
But the principle is important: try not to use markup on the JSON, because this would make it difficult for others to consume your website with JSON.
Since there is no standard for representing rich text in JSON, if you want rich text, this is simply an unavoidable problem. At least Markdown is semi-standard and people can get libraries for it. Embedded HTML would also work. Defining an ad-hoc embedded rich text format would be worse.
The index page at this point is mostly just a DOM skeleton on which to hang references to CSS, media, scripts and metadata. We might as well cut out the last step already.
If there was an official HTML-to-JSON format, we could even use that as well. It would probably already exist, but the question is always what to do with node attributes, text nodes and child-nodes. There's a dozen ways to organize them in JSON.
tldr: the html is stuffed in the exif data!
[1] https://gist.github.com/gasman/2560551
[2] https://github.com/codegolf/zpng
[3] https://xem.github.io/terser-online/ (If you pick the packing method "Zopfli (DEFLATE)", open the zopfli options and change the format to "zpng", you can write directly to the "Minified (Terser)" input to download the optimized PNG. Yes I wrote that part of code and that required way too much algorithm: https://github.com/xem/terser-online/blob/5cc33125/compress....)
Now that I have built it and have played around a bit, I don't see how I can build a website in any other way.
For me, the website's information is important, not the design. WDR exposes that.
Did you play around with just using markdown directly and embedding html tags in that in a similar way to format the page? I can't imagine myself intentionally writing web content using JSON. I'd probably have some other format up front that would be converted to JSON, but at that point I may as well just write some static page generator to create the html.
I think a hard problem ahead would be how to convince other people to also adopt this framework, and how to make sure that people are using the same JSON keys to mean the same thing. I'm a little skeptical that this can happen on its own, because we've already tried a standard that was designed to convey just the content of the website without the presentation -- HTML.
I'm curious how you plan to prevent your JSON format from being (ab)used for presentation purposes, where people add extra content to make the page display a certain way. And if you have a plan, is this plan feasible using HTML as well?
* We have an XSL stylesheet that ignores input and renders out the HTML contained in "#render"
* Along with the JSON file, we send a header `Link: </stylesheet.xsl>; rel="stylesheet"; type="text/xsl"`
* We send the JSON with `Content-Type: application/xml`
* Browser renders the HTML via the stylesheet and then the JavaScript takes over and renders the page without quirks mode
Sadly, this doesn't work because when the browser can't parse the JSON as XML, it stops processing and doesn't call the stylesheet :(
HTML1527: DOCTYPE expected. Consider adding a valid HTML5 doctype: "<!DOCTYPE html>". webdatarender.com (2,0)
HTML1513: Extra "<html>" tag found. Only one "<html>" tag should exist per document. webdatarender.com (86,15)
SCRIPT1002: Syntax error render-basic-1.0.3.js (1,4)
HTML1506: Unexpected token. webdatarender.com (86,104)
I tried to solve the quirks mode issue, but it is kind of hilarious that the browser needs the odd <!doctype html> to flip to standards mode. There are ways to signal the doctype via the headers or XHTML, but this would complicate a simple solution. But I was quite surprised by how consistent the look is across browsers, at least the modern ones.
I'll accept that answer. As controversial as this comment may be, XML/XSLT is a very good fit for this purpose. It might not be particularly modern, however it's almost universally supported.
Cool, but why?
> An object is an unordered set of name/value pairs.
But this site seems to be assuming that the key/value pairs are parsed in their original ordering for everything to display properly. Is it safe to assume JavaScript will always parse the keys in order?
And then JSON.parse is defined as doing the obvious and only sane thing, setting each property as it goes.
Consequently, so long as your keys are not numeric, yes, JavaScript guarantees that it will all be in order.
But that’s for JavaScript. It is incorrect to treat JSON objects as ordered, because various libraries in various languages will discard the order for various reasons (e.g. efficiency, DoS resistance), since it is defined as being not significant. If you care about the order of things, use an array instead.
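A quick way to see both halves of that answer in Node or a browser console (the keys here are invented for the demonstration):

```javascript
// String keys keep insertion order, but integer-like keys are moved
// to the front in ascending order.
const parsed = JSON.parse('{"title": "t", "2": "b", "#render": "<html>", "1": "a"}');
console.log(Object.keys(parsed)); // [ '1', '2', 'title', '#render' ]

// An array keeps its order for every consumer, in any language.
const ordered = JSON.parse('[["title", "t"], ["2", "b"], ["#render", "<html>"], ["1", "a"]]');
console.log(ordered.map(([key]) => key)); // [ 'title', '2', '#render', '1' ]
```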
Technically, no. Functionally, probably yes.
Any browser vendor that decides to muck with the current status quo would likely break enough code to be a non-starter.
You could use this to syndicate blog posts, sort of like RSS, except each entry is JSON and could be viewed as its own page rendered by its own renderer (carried by a CDN and cached by the browser), or by the renderer of the user's choice.
I know it's a complex idea, and it's hard to grasp ideas when they're first proposed, but the payoff is often worth the time invested. Even though it's only been <checks notes> 24 years since the RFC defining the Accept header was published, maybe it'd be worth spending say, 10 minutes reading about what it is, rather than however long it took you to write this abomination?
My immediate thought for a use case is to debug APIs. Pass in a param that adds this to the response and get it in a more human readable form. Will test it out.
Wasn't XHTML (and thus html5) supposed to be parsable, since it's basically XML (specifically, the html spec redefined as XML) ?
https://wgx.github.io/anypage/
(It's the world's worst CMS)
As a blogging platform, please don't do this. It breaks all SEO, microformats, RSS feed discovery, rel=me ties, and much, much more.
HTML is great. Please use it for your websites.
source: runs an open-web friendly microblogging platform
Nice project though!
Only that my goal was to create a browser for that, as an alternative to the web.
It would work on the same principle of separating data from information
Meaning: If the javascript (and other content) loaded via #render is highly cacheable, can this lead to pages that display as soon as the JSON is loaded?
That's what HTML and CSS is, no?
```html
{
"#info": {
"title": "WDR"
},
"subtitle": "Web Data Render",
"title": "# WDR",
"what": {
"title" : "## What is WDR?",
"description": [
"This website is a valid **[JSON](//www.json.org/)**!",
"Check the source code. Instead of the habitual HTML and CSS, you will see just a plain JSON with the website's information.",
"WDR is a format to separate the website's **information** and **design**.",
"The website is readily available to be consumed outside the browser via JSON, but also still presentable to users accessing through the web browser."
]
},
"subscribe": "I'm creating a **blog platform** using this concept. Follow me on [Twitter](//twitter.com/gpiresnt) to be notified when is ready! ",
"how": {
"title": "## How it works",
"description": [
"It works by embedding a small initiator at the end of the JSON file.",
"For example, this is a valid JSON and web page:",
"```{\n \"title\": \"Example Page\",\n \"description\": \"This is an example.\",\n \"#render\": \"<html hidden><script src=/render.js></script></html>\"\n}```",
"The script `render.js` receives the JSON as input and is responsible to render the page."
]
},
"usage": {
"title": "## Usage",
"description": [
"First create an HTML file with the JSON information:",
"```{\n \"title\": \"Example Page\",\n \"description\": \"This is an example.\"\n}```",
"Include the initiator at the bottom:",
"```{\n \"title\": \"Example Page\",\n \"description\": \"This is an example.\",\n \"#render\": \"<html hidden><meta charset=utf-8><script src=/render.js></script></html>\"\n}```",
"The next section explains how to create the `render.js`."
]
},
"only-data": {
"title": "** Pure information **",
"description": [
"The JSON must contain only pure information without any concern about design or markup. All the design and markup required to render the page must be inside the rendering script."
]
},
"create-render": {
"title": "## Creating a render",
"description": [
"Create a new javascript project and install the package `wdr-loader`:",
"```npm install wdr-loader```",
"Call the `loader` function to retrieve the JSON:",
"```import loader from 'wdr-loader';\nloader(data => render(data));```",
"Create a `render` function to handle the data and render the HTML. Below is an example using simple `innerHTML`:",
"```function render(data) {\n document.head.innerHTML = `\n <meta charset=\"utf-8\">\n <meta name=\"viewport\" content=\"width=device-width, initial-scale=1\" />\n <title>${data.title}</title>`;\n\n document.body.innerHTML = `\n <section>\n <h1>${data.title}</h1>\n <p>${data.description}<p>\n </section>`;\n}```",
"The `wdr-loader` code is available on [GitHub](//github.com/webdatarender/wdr-loader) with an example."
]
},
"basic-render": {
"title": "## Basic render",
"description": [
"If you don't want to create a render right now, it is available a basic render (used on this very website) to immediate use:"
],
"download": "[render-basic-1.0.3.js](//webdatarender.com/dist/render-basic-1.0.3.js)",
"instructions": [
"Just download the script and include it on the initiator directly:",
"```{\n \"title\": \"# Example Page\",\n \"description\": \"This is an example.\",\n \"#render\": \"<html hidden><meta charset=utf-8><script src=/render-basic-1.0.3.js></script></html>\"\n}```",
"The code is available on [GitHub](//github.com/webdatarender/wdr-render-basic)."
]
},
"remarks": {
"title": "## Remarks",
"description": [
"• The page is rendered in [quirks mode](//developer.mozilla.org/en-US/docs/Web/HTML/Quirks_Mode_and_Standards_Mode) and can present some layout differences on different browsers.",
"• Although javascript is necessary to render the page, most search engines, like [Google](https://developers.google.com/speed/pagespeed/insights/?url=webdatarender.com) or [Bing](https://www.bing.com/webmaster/tools/mobile-friendliness), will be able to read the page correctly.",
"• If you want to display the JSON for users that have javascript disabled, you can include `noscript` at the initiator: ```<noscript><style>html{display:block !important; white-space:pre}</style></noscript>```"
]
},
"support": {
"title": "## Need Help? ",
"description": "If you need help, have any feedback or just want to say hi, send me an [email](mailto:gpiresnt@gmail.com)."
},
"about": {
"creator": "Created by [@gpiresnt](//twitter.com/gpiresnt)",
"logo": ""
},
"#render": {
"_": "<html hidden><meta charset=utf-8><script src=/dist/render-basic-1.0.3.js></script></html><noscript><style>html{display:block !important; white-space:pre}</style></noscript>",
"css": "css/main.css"
}
}
```
But it would be even better if they could get the information directly on JSON. So much carbon saved ;)