Copying my thoughts from there which haven't changed:
> To which I say, are you really going to avoid using a good tool just because it makes you puke? Because looking at it makes your stomach churn? Because it offends every fiber of your being?
Yes. A thousand times yes. Because the biggest advantage of Markdown is that it's easy to read, and its second-biggest advantage is that it's easy to write. How easy it is to parse doesn't matter. How easy it is to extend is largely irrelevant.
Markdown may or may not be the best tool for writing a book, but Markdown is the best tool for what it does - quickly writing formatted text in a way that is easy to read even for those who are not well versed in its syntax.
I don't want to write a book. If I did I'd use LaTeX before RST. I want something to take notes, make quick documentation and thread comments.
Moreover, simple, human-readable parsing rules help a lot with reducing the cognitive load of the form and keeping the focus on the content. Extending a syntax necessarily brings abstractions and more complex parsing rules which would conflict with that goal. In some contexts minimalism and simplicity are features in themselves.
For me, I often want to spend my time writing down the stuff I need to write rather than playing with extensions, logic, and configs. I like that it stops me from doing something more complex, because I am pretty sure that if I were incentivised to extend it, I would end up spending my time on that instead of writing.
Markdown is not good for stuff where complex logical structure in the content is important to be represented in the form. In the article it is beyond clear to me why the author did not use markdown for their book, I would be more interested in why they chose RST instead of latex or another language that is more towards the complex end than the minimalistic end. I guess what the author needed was some point in-between, and they found it in RST.
> Yes. A thousand times yes.
Your comment comes off as if it makes an opposing point to the article. My apologies if it wasn't meant that way.
But I want to note that the author agrees with you! The next sentence from the author which you didn't include in your quote says:
> Okay yeah that's actually a pretty good reason not to use it. I can't get into lisps for the same reason. I'm not going to begrudge anybody who avoids a tool because it's ugly.
How easy it is to parse does matter, because there’s a definite correlation between how easy it is to parse for the computer and for you. When there are bad corner cases, you either have to learn the rules, or keep on producing erroneous and often-content-destructive formatting.
> How easy it is to extend is largely irrelevant.
If you’re content with stock CommonMark, it is irrelevant to you.
If you want to go beyond that, you’re in for a world of pain and mangled content, content that you often won’t notice is mangled until much later, because there’s generally no meaningful way of sanity-checking stuff.
As soon as you interact with more than one Markdown engine—which is extremely likely to happen, your text editor is probably not using the parser your build tool uses, configured as it is configured—it matters a lot. If you have ever tried migrating from one engine to another on anything beyond the basics, you will have encountered problems because of this.
Edge cases largely don't matter, because again I'm not trying to make a book. I don't care if my table is off by a few pixels. 50% of the time I'm reading markdown it's not even formatted, it's just in raw format in an editor.
I’m not sure that’s true tbh. Exhibit A: natural language. Exhibit B: Polish notation.
This is incorrect. You can sure write LaTeX that is intricately dependent on the output dimensions. But you can just as easily write LaTeX that is independent of output dimensions.
Case in point is compiling LaTeX doc to HTML which you'd admit is easily resizable.
Case in point is also writing LaTeX docs for journals or publication where you can easily resize the document to match the publisher's style guide and dimensions by changing the documentclass.
The TeX philosophy rejects that. When TeX can't format a paragraph beautifully, it emits diagnostics like "overfull \hbox".
That's totally incompatible with being able to dictate a width, and expecting things to fit without having to get involved.
He should compare it to HTML or XML or Haml
My personal resume is a Lisp thing (now well over 20 years old). There is a kind of markup language, and CLOS-driven back ends for producing different output formats.
I'm not sure if <img src="file.jpg" alt="alt text"/> is less readable than
.. image:: file.jpg
   :alt: Alt text
HTML5 allows for leaving certain tags unclosed (such as <li>, or <head> or even <p>) to such an extent that I find many template languages to not be worth the effort of their complex syntax.
Sure, there are three or four lines here that you can omit using RST or markdown:
<!doctype html>
<html lang="en">
<head>
<title>My blog page</title>
<body>
<h1>Welcome to my blog</h1>
<p>This is a bunch of text.
Feel free to stuff newlines here.
<p>This is also a bunch of text
<p>Here's a list just for fun:
<ol>
<li>This is the first item!
<li>This is the second one!
<li>Boom, a third!
</ol>
<p>Have an image: <img src="filename.jpg" alt="alt text goes here">
But is having to wrap a list in <ol> and closing the <title> really that bad?
Automatically generating an index and such is nice, but five lines of JavaScript can do the same. Plus, you don't need to run a second tool to "process" your input.
I generally use Markdown as a standardised way to format text that will probably be read in plaintext by other people, but when it comes to formatting documents, I don't see the point of most complex template languages.
Once you've written a couple of documents, the usual tags become muscle memory and are no more of a bother to write than markdown. I've even created a couple of nano macros to automate some of the process.
"But it's not readable like markdown" you might say. Well. This might be true of 'some' html, especially autogenerated stuff, but the stuff I write is totally readable. Once you settle on some meaningful indentation and tag presentation conventions, readability is not a problem. We're talking about plain html documents, after all, not complex websites. The subset of html tags you'll need is generally very small and largely unintrusive.
I could even go a step further and say, my HTML is as readable as this guy's rST, but this guy's generated HTML code is far worse than how my direct HTML would have looked.
Eric Raymond, in his 2003 book, advocates terse text formats in chapters 5 and 18: https://www.catb.org/esr/writings/taoup/html/graphics/taoup....
Markdown is ubiquitous because it’s easy for humans to read and write.
The second part is more important than the first. There could be far better systems that too few people used for them to become ubiquitous. And as far as we know, markdown could be one of the worse ones, but became ubiquitous because it became ubiquitous.
cf: MS Windows.
What Gruber got right is that the syntax is beautiful to read, easy to write and powerful enough to be useful, with the optional inline HTML as an escape hatch. It may not seem much, but that's hard to get right.
markdown is ubiquitous thanks to github.
[1] https://talk.commonmark.org/t/foo-works-but-foo-fails/2528
How does whether you think of the language as agglutinative affect the usability of reST?
The biggest problem that occurs to me is that there isn't really a conceptual difference between an "agglutinative" language in which you have very long words expressing complex meanings, and an "isolating" language in which the same syllables occur in the same order with the same meaning but are thought of on a Platonic level as being all independent words.
This is because an "agglutinative" language is one in which syntax markers are more or less independent of any other syntax markers that may apply to the same word†, which means it's always possible by definition to consider those markers to be "words" themselves.
Would your problems be solved if you viewed what you had considered "long" Korean words as instead being several short words in a row? What difficulties does agglutination present?
† Compare: https://glossary.sil.org/term/agglutinative-language
> An agglutinative language is a language in which words are made up of a linear sequence of distinct morphemes and each component of meaning is represented by its own morpheme.
https://glossary.sil.org/term/isolating-language
> An isolating language is a language in which almost every word consists of a single morpheme.
I think SIL's definition is, while robust, not the usual definition because English can be regarded as agglutinative in this definition. This is particularly visible from the statement that most European languages are somewhat fusional [1], which is okay under their definitions but not the usual way we think of English.
In my understanding, analyticity is a spectrum, and highly analytic languages with most (but not necessarily all) words containing just one morpheme are said to be isolating. Words in agglutinative languages can be, but do not necessarily have to be, analyzed as a main morpheme ("word") with dependent morphemes attached ("affixes"). Polysynthetic languages go further by allowing multiple main morphemes in one word. As languages tend to become synthetic (as opposed to analytic), the space-separated "word" is less useful [2] and segmentation gets harder and harder. reST's failure to support those languages is all about a bad assumption about segmentation.
[1] https://glossary.sil.org/term/fusional-language
[2] So much that several agglutinative languages---in which space-separated words can still be useful---don't even think about spacing, e.g. Japanese.
this is **bold** text
this is :strong:`bold` text thisis:strong:`bold`text
Whereas the equivalent is perfectly fine in markdown.
Falsehoods programmers believe about written language: whitespace is used to separate atomic sequences of runes.
Both do at least some degree of only matching delimiters at word boundaries. I consider that to be a huge mistake.
reStructuredText falls for it, but has a universally-applicable workaround (backslash-space as a separator—note that it is not an escaped space, as you might reasonably expect: it’s special-cased to expand to nothing but a syntax separator).
Markdown falls for it inconsistently, which as a user of languages that divide words with spaces, is honestly worse. Its rules are more nuanced, which is generally a bad thing, because it makes it harder to build the appropriate mental model. It was also wildly underspecified, though that’s mostly settled now. For many years, Stack Overflow used at least two, I think three but I can’t remember where the third would have been, mutually-incompatible engines, and underscores and mid-word formatting were a total mess. Python in particular suffered—for many years, in comments it was impossible to get plain-text (i.e. not `-wrapped code) __init__.
In CommonMark, _abc_ and *abc* get you abc, but a*b*c gets you abc while a_b_c gets you a_b_c. That’s an admission of failure in syntax. Hmm… I hadn’t thought of this, but I suppose that makes _ basically untenable in languages with no word separator. Interesting argument against Prettier, which has a badly broken Markdown mode¹, and which insists on _ for emphasis, not *.
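To make the asymmetry concrete, here is a rough Python sketch of the intraword rule. It is a simplified model and not the CommonMark algorithm (the real spec works with left- and right-flanking delimiter runs and has extra punctuation rules); the function name `emphasize` and both regexes are my own illustration.

```python
import re


def emphasize(text: str, delim: str) -> str:
    """Wrap delimiter-enclosed spans in <em>, using a simplified intraword rule."""
    d = re.escape(delim)
    if delim == "_":
        # Underscore: may not open right after a word character,
        # and may not close right before one, so it never works mid-word.
        pattern = rf"(?<!\w){d}(\S(?:.*?\S)?){d}(?!\w)"
    else:
        # Asterisk: may open and close anywhere, including mid-word.
        pattern = rf"{d}(\S(?:.*?\S)?){d}"
    return re.sub(pattern, r"<em>\1</em>", text)


print(emphasize("a*b*c", "*"))   # asterisks are honoured mid-word
print(emphasize("a_b_c", "_"))   # underscores are left alone mid-word
```

In a script with no word separator, every underscore is effectively "mid-word", which is why the underscore branch above would almost never fire there.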
In my own lightweight markup language I’ve been steadily making and using for my own stuff for the last five years or so, there’s nothing about word boundaries. a*b*c is abc, and if a dialect² defined _ as emphasis, a_b_c would be abc.
Another example of the cleverness problem in reStructuredText is how hard wrapping is handled. https://docutils.sourceforge.io/docs/ref/rst/restructuredtex... is a good example of how badly wrong this can go. (Markdown has related issues, but a little more constrained. A mid-paragraph line starting with “1. ” or “- ”—both plausible, and the latter certain to occur eventually if you use - as a dash—will start a list.) The solution here is to reject column-based hard-wrapping as a terrible idea. Yes, this is a case where the markup language should tell people “you’re doing it wrong”, because otherwise the markup language will either mangle your content, or become bad; or more likely both.
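A minimal sketch of a checker for that hazard on the Markdown side, assuming the CommonMark behaviour that a bullet marker, or an ordered marker numbered 1, can interrupt a paragraph. The name `risky_wrapped_lines` is hypothetical, and the sketch ignores code blocks, block quotes, and nested indentation.

```python
import re

# Line starts that can interrupt a paragraph: a bullet marker, or an
# ordered-list marker numbered 1, followed by spaces and some content.
LIST_START = re.compile(r"^ {0,3}(?:[-+*]|1[.)]) +\S")


def risky_wrapped_lines(text: str) -> list[int]:
    """Return 1-based numbers of mid-paragraph lines that would start a list."""
    risky = []
    in_paragraph = False
    paragraph_is_list = False
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            in_paragraph = False  # a blank line ends the current block
            continue
        starts_list = bool(LIST_START.match(line))
        if not in_paragraph:
            # First line of a block: remember whether it is a real list.
            in_paragraph = True
            paragraph_is_list = starts_list
        elif starts_list and not paragraph_is_list:
            # A wrapped prose line that happens to look like a list item.
            risky.append(number)
    return risky


wrapped = "The range is 0\n- 5 units, give or take\n\n- a real list\n- second item\n"
print(risky_wrapped_lines(wrapped))
```

Run over the sample, only the wrapped prose line is reported; the genuine list after the blank line is left alone.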
Meanwhile in Markdown, it tries to be clever around specific HTML tags and just becomes hard to predict.
—⁂—
¹ Prettier’s Markdown formatting is known to mangle content, particularly around underscores and asterisks, and they haven’t done anything about it. The first time I accidentally used it it deleted the rest of a file after some messy bad emphasis stuff from a WYSIWYG HTML → Markdown conversion. That was when I discovered .prettierignore is almost completely broken, too. I came away no longer just unimpressed with some of Prettier’s opinions, but severely unimpressed with the rest of it technically. Why they haven’t disabled it until such things are fixed, I don’t know.
² There’s very little fundamental syntax in it: line break, indent and parsing CSS Counter Styles is about it. The rest is all defined in dialects, for easy extension.
still markdown just isn't powerful enough for anything non trivial.
I see this sentiment a lot, and my reaction is always, “Sure it is, with asterisks.” In the past decade I was the primary author of the RethinkDB documentation, a senior technical writer on Bixby’s developer documentation, and am now a contractor working on Minecraft’s developer docs. All of them were large, decidedly non-trivial, and Markdown. Microsoft’s entire learning portal, AFAICT, is in Markdown.
And the thing is, each of those systems used a different Markdown processor. My own blog uses one that’s different from all of those. According to HN, I should be spending virtually all my time fighting with all those weird differences and edge cases, but I’m not. I swear. The thing about edge cases is they’re edge cases. I saw a “Markdown torture” document the other day which contained a structure like this:
[foo[bar(http://bar.com)](http://foo.com)
and proudly proclaimed that different Markdown processors interpret that construct differently. Yes, okay, and? Tell me a use case for that beyond “I want to see how my Markdown processor breaks on that.”
The asterisk is that almost any big docs (or even blogging) system built on Markdown has extensions in it, which are usually a function of the template system. Is that part of Markdown? Obviously not. Is it somehow “cheating”? I mean, maybe? At the end of the day, 99% of what I’m writing is still Markdown. I just know that for certain specific constructs I’m going to use {{brace-enclosed shortcodes}}, or begin an otherwise-typical Markdown block quote with a special tag like “%tip%” to make it into a tip block. Every system that proclaims it’s better than Markdown because it allows for extensions, well, if you take advantage of that capability, look at you adding site-specific customization just like I’m doing with (checks notes) Markdown.
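For what it's worth, that kind of site-specific extension can be tiny. A hedged sketch, assuming the Markdown processor has already rendered the block quote to plain `<blockquote><p>` HTML; the `%tag%` convention follows the example above, while `apply_callouts` and the resulting class names are made up for illustration and are not taken from any of the systems mentioned.

```python
import re

# Hypothetical convention: a block quote whose first paragraph starts with
# a tag such as %tip% or %warning% becomes a classed callout.
TAGGED_QUOTE = re.compile(r"<blockquote>\s*<p>%(\w+)%\s*")


def apply_callouts(html: str) -> str:
    """Turn tagged block quotes in rendered HTML into classed callouts."""
    return TAGGED_QUOTE.sub(r'<blockquote class="\1">\n<p>', html)


rendered = "<blockquote>\n<p>%tip% Save often.</p>\n</blockquote>"
print(apply_callouts(rendered))
```

Untagged block quotes pass through unchanged, so the source stays ordinary Markdown everywhere the tag is not used.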
If reStructuredText works better for you, or AsciiDoc, or Org Mode, great! Hell, do it all in DITA, if you’re a masochist. But this whole “this is obviously technically superior to Markdown, which surely no one would ever do real work in, pish tosh” nonsense? We do. It works fine. Sorry.
I haven’t checked if any of the details have changed any time recently, but Zola does this, and I had a rough time with it because of the interactions with Markdown rules around raw HTML and escaping and such. I have worked to forget the details. I reckon Zola bakes Markdown in too deeply, and it’s a pain. Especially because of indentation causing code blocks, that’s one of the biggest problems with extending Markdown by “just writing HTML”.
Except each one actually parses a slightly different language.
https://git.sr.ht/~xigoi/markdown-monster/blob/master/monste...
Markdown is for the people, almost never full time doc jockeys, who need to WRITE that documentation.
At some point during that phase I tried org mode and it's better than both, it is easier to read/write than RST, and better for large documents than Markdown. Unfortunately it doesn't get accepted in as many places as Markdown.
There was a significant learning curve getting good output when converting some of the old ASCII charts out of .txt files, but once settled it makes for a much better user experience and it auto-compiles to HTML, PDF, and even EPUB with zero additional effort.
I would definitely not want to go to Markdown from RST for technical documentation that's more complex than a Github readme.
It's basically markdown, but made to be easier to parse with explicit support for nice addons such as tables, divs, and attributes.
asciidoc > rst > markdown
It’s just that the available tooling goes the opposite way:
markdown tooling > rst tooling > asciidoc tooling
I end up using HTML for anything serious instead, because it has better tooling support than any of the three, and is also more flexible. It’s just more verbose, which is fine.
It would be nice if emphasis and other inline formatting worked smoothly even in agglutinative languages...
I like the framework, but it ended up being too in the way. I am not an RST maintainer. I want to blog and get my thoughts out in the world.
I split my website to use different subdomains, and most of the posts in that old blog are now in https://tech.stonecharioteer.com which is on Hugo now. I used Claude to fix some CSS annoyances with the PaperMod theme, and to migrate not only the posts from that old blog but also from the Jekyll blog that predates it.
I'm happy with the blog now, it's so out of my way that I can write without trying to figure out how to make Hugo do something like Sphinx-style admonitions. Claude is great for that. What else is there to complain about?
.. image:: example.jpg
  :alt: alttext
That is some horrendous syntax.
I totally get the author’s power user needs, and the article states plainly that this isn’t for everyone, but there’s gotta be something with power AND earthly syntax, right?
Also the author has very bad taste in having used two spaces of indentation. It should have been three, which makes it significantly less ugly:
.. image:: example.jpg
   :alt: alttext
“.. ”: this block is magic.
“image::”: directive, type image.
“example.jpg”: directive argument, file name.
“:alt: alttext”: directive option, named alt, value alttext.
Rewritten with a completely different sort of syntax, for fun:
┌ IMAGE ─ example.jpg ──┐
│ alt = alttext         │
└───────────────────────┘
And “..” is just your “something cool is about to happen” symbol?
I’ve been reading through the documentation more and this thing seems insane.
That .. symbol is also used for comments!? “Oh if it’s invalid it’s a comment!” No way to make multi-line comments without another ridiculous indentation.
The tables are insane, somehow they implemented something worse than markdown.
To make a header you need to make a long line of characters as long as your text for some reason.
For being a language that’s supposed to be more powerful than markdown and not so HTML-adjacent it sure depends on whitespace a lot. Like, why do literal blocks need to be indented? Why do doctest blocks need to end with a blank line?
> The handling of the :: marker is smart:
> If it occurs as a paragraph of its own, that paragraph is completely left out of the document.
> If it is preceded by whitespace, the marker is removed.
> If it is preceded by non-whitespace, the marker is replaced by a single colon.
lol, a directive that does 3 entirely unrelated things depending on white space. Genius.
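Spelled out as code, the three quoted rules look roughly like this. It is a sketch of the rules as quoted and not docutils' actual implementation; `expand_literal_marker` is a hypothetical name, and trimming the whitespace in front of a removed marker is my assumption.

```python
from typing import Optional


def expand_literal_marker(paragraph: str) -> Optional[str]:
    """Apply the three quoted rules to a paragraph that may end in '::'.

    Returns the paragraph text to emit, or None when the paragraph is dropped.
    """
    if not paragraph.endswith("::"):
        return paragraph          # no marker, nothing to do
    body = paragraph[:-2]
    if not body.strip():
        return None               # '::' on its own: paragraph is left out
    if body[-1].isspace():
        return body.rstrip()      # 'text ::' -> marker removed (assumed: space too)
    return body + ":"             # 'text::' -> marker becomes a single colon


for sample in ["::", "Example ::", "Example::"]:
    print(repr(sample), "->", repr(expand_literal_marker(sample)))
```

Three branches on one trailing token, each chosen by the character just before it, is exactly the whitespace sensitivity being complained about.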
Do you think it makes easy things easy and complex things possible?
Why not "hardcode" the most common things to be the easiest to use and then still have the option to extend to other protocols? Why "suffer" every time equally instead?
But it's overkill for light documentation. Just look at their first example of embedding an image:
> ![alttext](example.jpg)
vs
> .. image:: example.jpg
>   :alt: alttext
In the first one, it's just the syntax for a hyperlink with ! in front.
In the second one, there are several bits of syntax to remember. We have .. and then whitespace, and not one but two colons after `image`, and it's not `alt:` but `:alt:`.
I don't have to try to remember Markdown syntax, because it's simpler and it's ubiquitous. I type Markdown directly into Slack and Obsidian every day. Most tech-adjacent people know some Markdown.
Many years back a developer on my team decided that all the readmes that live next to source code should be in RST, because it's Better(TM) and we could have nicely formatted generated docs. The result was that a lot less documentation got written, and nobody looked at the generated docs anyway. Eventually we transitioned back.
Everything can be extended with fenced blocks.
RST is a lot more difficult to write and much more "groffy".
This is `a link`_

.. _a link: https://foo.com
The underscores are required exactly like that. I believe the blank line between is also required. There's also an inline syntax where you use two trailing underscores: This is `an embedded link <http://foo.com>`__
I'd rather write raw HTML.