For blog posts and notes I see the appeal, since the boilerplate can be a hindrance to spontaneous writing.
Also, LaTeX can't produce any output that is accessible to blind people (other than giving them the raw LaTeX). The PDFs LaTeX produces are probably the least accessible format available (much worse than a Word-produced PDF, or some HTML). This matters to me, and should matter more to other people (in my opinion).
I don't find the boilerplate minimal at all. Contrast the following:
\begin{itemize}
\item First
\item Second
\item Third
\end{itemize}
with
- First
- Second
- Third
I won't even get into the hell that is tables.
I loved LaTeX until I discovered Org Mode. Pandoc also scratches the same itch.
If one is going to write LaTeX code anyway, it seems easier and cleaner to use LaTeX all the way, move all the boilerplate along with the personal template to, say, a file named preamble.tex, and \input{preamble.tex} in the documents.
However, there are situations where Pandoc can be convenient. For example, I wanted a document[1] to be written primarily as README.md (CommonMark format), so that GitHub could render it as the project README. At the same time I wanted to render a PDF output from a customized form of the content. Pandoc is convenient for cases like this although it takes a bit of work to fine-tune the formatting and customize the content for each output format.
It's not for everyone, but Emacs + AUCTeX reduces the LaTeX boilerplate (at least the writing of it) so much that I don't really feel it's a hindrance.
Incidentally, I really like the thoughtful syntax additions Pandoc makes over olde Markdown (e.g., tables, definition lists, and span & div syntax as well). Such a great all-around doc tool.
Once the work has moved into a Word file, isn't that where it stays? Editors and publishers often make heavy use of features like track changes and notes. Doesn't pandoc lose that information?
Scribble Code Example:
#lang scribble/base
@title{On the Cookie-Eating Habits of Mice}
If you give a mouse a cookie, he's going to ask for a glass of milk.
@section{The Consequences of Milk}
That ``squeak'' was the mouse asking for milk. Let's suppose that you give him some in a big glass.
He's a small mouse. The glass is too big---way too big. So, he'll probably ask you for a straw. You might as well give it to him.
@section{Not the Last Straw}
For now, to handle the milk moustache, it's enough to give him a napkin. But it doesn't end there... oh, no.
Scribble -
Scribble is a collection of tools for creating prose documents—papers, books, library documentation, etc.—in HTML or PDF (via Latex) form. More generally, Scribble helps you write programs that are rich in textual content, whether the content is prose to be typeset or any other form of text to be generated programmatically. - https://docs.racket-lang.org/scribble/
Some languages based on Scribble
Skribilo -
Skribilo is a free document production tool that takes a structured document representation as its input and renders that document in a variety of output formats: HTML and Info for on-line browsing, and Lout and LaTeX for high-quality hard copies.
The input document can use Skribilo's markup language to provide information about the document's structure, which is similar to HTML or LaTeX and does not require expertise. Alternatively, it can use a simpler, “markup-less” format that borrows from Emacs' outline mode and from other conventions used in emails, Usenet and text. https://www.nongnu.org/skribilo/
Pollen -
Pollen is a publishing system built on top of Scribble and Racket. So far, I’ve optimized Pollen for web-based books, because that’s mainly what I use it for. But it can be used for small projects too, and non-webby things like PDF.
As a publishing system, Pollen includes:
A programming language. The Pollen language is a variant of Scribble, with specific dialects tailored to different kinds of source files. You don’t need to use the programming features to do useful work, but they’re available when you need them.
A set of tools & libraries. Pollen can produce output in any format, but it’s especially useful for markup-style formats like XML and HTML.
A development environment. Pollen works with the DrRacket IDE. It also includes a project web server so you can dynamically preview and revise your publication. http://docs.racket-lang.org/pollen/Backstory.html
They are domain-specific languages that excel at outputting awesome HTML and PDF. They aren't really markup; they are a macro system built on top of a full Lisp (Racket). It is easier and much more powerful than anything I have seen with Pandoc and LaTeX (I still use LaTeX for specific targets, but not for general papers anymore). Racket has the best documentation, period, and it is because the documentation
Just a few links:
- Where everything is documented: http://pandoc.org/MANUAL.html
- If you have questions or suggestions: https://groups.google.com/forum/#!forum/pandoc-discuss
- Contributing to pandoc is also a great way to get your feet wet with Haskell. In my experience, very supportive community. See http://pandoc.org/CONTRIBUTING.html and for good first issues: https://github.com/jgm/pandoc/issues?q=is%3Aopen+is%3Aissue+...
Finally, a great feature, that hasn't been mentioned here, is pandoc filters. Basically, pandoc provides a way for scripts (in any programming language) to hook into the transformation pipeline and modify the document AST (similar to the HTML DOM) in-between the reading and writing steps. See http://pandoc.org/filters.html
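To make the filter idea concrete, here is a minimal sketch of a JSON filter in plain Python (standard library only). It assumes pandoc's JSON AST layout, where the document has a "blocks" list and each element is an object with a "t" (type) key and usually a "c" (contents) key. The function names and the Emph-to-Strong rewrite are just an illustration, not something from the pandoc docs.

```python
#!/usr/bin/env python3
"""Toy pandoc JSON filter: turn every emphasized span into strong emphasis."""
import json
import sys


def walk(node, action):
    """Recursively rebuild the tree, applying `action` to every dict with a "t" key."""
    if isinstance(node, list):
        return [walk(item, action) for item in node]
    if isinstance(node, dict):
        node = {key: walk(value, action) for key, value in node.items()}
        return action(node) if "t" in node else node
    return node


def emph_to_strong(element):
    """Rewrite Emph nodes as Strong; leave every other element untouched."""
    if element["t"] == "Emph":
        return {"t": "Strong", "c": element["c"]}
    return element


def transform(document):
    """Return a copy of the document with the rewrite applied to its blocks."""
    result = dict(document)
    result["blocks"] = walk(document["blocks"], emph_to_strong)
    return result


# pandoc passes the output format as the first argument when it runs a filter,
# so stdin/stdout are only touched when the script is invoked that way.
if __name__ == "__main__" and len(sys.argv) > 1:
    json.dump(transform(json.load(sys.stdin)), sys.stdout)
```

Saved as, say, emph_to_strong.py (a made-up name) and marked executable, it should be usable as `pandoc in.md --filter ./emph_to_strong.py -o out.html`.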
The other day I thought about contributing to Yarn, the JavaScript package manager, but the only way I found to communicate with the developers was through GitHub issues. Since I didn't know whether the feature I wanted would be well received, I just gave up.
Overall great experience. Thanks for the great tool :).
# pandoc test.doc -o test.pdf
pandoc: Unknown reader: doc
Pandoc can convert from DOCX, but not from DOC.
Is this underlining, and not redlining as defined in financial services? (redlining: differential pricing based on demographic makeup of a zip code or neighborhood)
Here's the build command for responsive.style[1]:
pandoc $file -f markdown -t html5 -H templates/header-prod.html -B templates/nav.html -A templates/footer-prod.html -o (echo "../$file" | sed '$s/\.md$/.html/') -s --data-dir=./ --highlight-style breezedark --variable=file:(echo "$file" | sed '$s/\.md$/.html/')
Works beautifully!
[1] https://github.com/tomhodgins/responsive.style/blob/master/s...
I wrote up a tool as well, with navigation and prev/next links: http://www.unexpected-vortices.com/sw/rippledoc/index.html
I built a pipeline to convert a Markdown file to publishing-ready files for ebooks, Kindle and paperback for my novel; the whole thing is described here: http://www.gabrielgambetta.com/tgl_open_source.html
My website itself is static, generated from a bunch of Markdown files, some HTML templates, and a bit of postprocessing. But most of the work is done by Pandoc.
Anyway, pandoc is great.
Luckily, LibreOffice can produce tagged PDFs. And unoconv is a convenient utility for doing this from the command line. So you can use pandoc to convert to a format that LibreOffice can consume, then issue a command like this:
unoconv -f pdf -e UseTaggedPDF=true mydoc.odt
I've tried it, and it works.
ConTeXt is supported as well: `pandoc input.md -t context -o output.pdf`
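For anyone who wants to script the LibreOffice route to tagged PDF mentioned above, here is a rough sketch in Python. It only chains the two steps: pandoc to ODT, then the same unoconv invocation quoted in the thread. The function names are made up, and it assumes both pandoc and unoconv are on the PATH.

```python
import subprocess
from pathlib import Path


def tagged_pdf_commands(source):
    """Return the two commands: pandoc to ODT, then unoconv to a tagged PDF."""
    odt = str(Path(source).with_suffix(".odt"))
    return [
        # pandoc infers the ODT writer from the output file extension.
        ["pandoc", source, "-o", odt],
        # Same unoconv options as the command quoted in the thread.
        ["unoconv", "-f", "pdf", "-e", "UseTaggedPDF=true", odt],
    ]


def build_tagged_pdf(source):
    """Run both steps in order, raising on the first failure."""
    for command in tagged_pdf_commands(source):
        subprocess.run(command, check=True)
```

Calling `build_tagged_pdf("mydoc.md")` would run the two commands one after the other; `tagged_pdf_commands` on its own is handy for checking what would be executed.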
watch: $(ALL)
	while true; do \
		clear; \
		make $(WATCH); \
		inotifywait -qr -e close_write .; \
	done
"make watch WATCH=build" will now compile documents on every save. Works well for single documents, collections of documents or entire websites.
[1] https://gist.github.com/timpulver/0d01285952b97deb70df6104cc...
There are a small number of corner cases that need to be spec'd out before CommonMark can declare a v1.0 release[2]. If you have the skills for this kind of thing, please weigh in!
[2] https://talk.commonmark.org/t/issues-we-must-resolve-before-...
https://github.com/ashton314/marked-man
It's just a one-liner: `pandoc -s -t man "$1" | groff -T utf8 -man | $PAGER`
(That was basically stolen from an answer to one of my questions on Stack Overflow—thanks to those who answered! :)
There are a few things (in latest version, 2.2.3.2) that don't really survive round-trip from markdown back to markdown:
- reference-style links (e.g. `[foo][f]`). They are converted to inline links e.g. `[foo](http://...)`.
- setext vs hashmark headers. `foo\n=====` will get converted to `# foo`.
- markdown allows for forced-linebreak <br>s to be added with two trailing blank spaces at the end of a line. Pandoc escapes these with a trailing `\` at the end of the line.
These are only occasional nuisances, but overall the documents (at least in my experience) are not butchered.
I also occasionally go from markdown to docx for the purposes of uploading to google-docs and copy/pasting large sections into other docs. This is the only markdown-to-google-docs workflow I've found that works to preserve formatting. It's never really butchered anything, except a few times the syntax-highlighting for code-blocks gets confused and keywords get the wrong colors.
You can choose whether reference links go at the end of the paragraph or the document.
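If the inlined links are the main annoyance in a markdown-to-markdown round trip, pandoc's `--reference-links` and `--reference-location` options are worth a try. The sketch below just assembles such a command in Python; the function name and file names are hypothetical, and I haven't checked how every pandoc version behaves here.

```python
REFERENCE_LOCATIONS = ("block", "section", "document")


def roundtrip_command(source, output, location="document"):
    """Build a markdown-to-markdown pandoc command that keeps reference-style links."""
    if location not in REFERENCE_LOCATIONS:
        raise ValueError(f"location must be one of {REFERENCE_LOCATIONS}")
    return [
        "pandoc", source,
        "-f", "markdown",
        "-t", "markdown",
        "--reference-links",                  # write [foo][f] instead of inline links
        f"--reference-location={location}",   # where the link definitions are placed
        "-o", output,
    ]
```

Here "block" puts the definitions after the current block and "document" collects them at the end of the file. Passing the returned list to `subprocess.run` would run the conversion.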
Pandoc is seriously a great tool! I love the way it's designed and have found it useful off and on over the years. Truly marvelous for making information available in any needed format.
Example:
pandoc in.md -o out.html -V pagetitle="My Title" --to=html5 --template="my.html" --css "my.css"
The example converts a markdown file to HTML, using a given title, a template file, and a stylesheet file.
The pipeline is also well implemented with Haskell, which is good for writing your own fast functional transformations.
I tried creating a workflow from Asciidoc through Pandoc to MS Word but that didn't work so well. Tables being the biggest issue.
It was a little work to set up the workflow with scripts etc, but being able to write the book in markdown and still having full control over the design was definitely worth it.
[0] sample here: https://patricklouys.com/professional-php-sample.pdf
An example of how easy this is and the styles I use for my personal blog: https://curious.observer https://github.com/davnn/curiousobserver
https://users.soe.ucsc.edu/~ivo/_posts/2015-03-12-repeatable...
http://gbraad.nl/blog/document-generation-using-markdown-and...
The problem I had was that LaTeX math was turned into images, but changing the font size in the reader did not change the size of the images, so the text was readable but the maths barely so.
This is something I would love to see happen though.
You can add some CSS to the generated EPUB to change that. But if your EPUB reader supports MathML, you can do that with pandoc. See http://pandoc.org/epub.html#math
What editor do HN folks use? I wonder if there's a leaner editor out there with an equally nice distraction-free editing interface. Thanks in advance!
[0] https://github.com/euclio/vim-markdown-composer
Not free, but a real pleasure to use.
https://pandoc.org/installing.html
> We provide a binary package for amd64 architecture on the download page. This provides both pandoc and pandoc-citeproc. The executables are statically linked and have no dynamic dependencies or dependencies on external data files.
I wish I'd known about this sooner. I don't spend much time with text documents outside the web, but when I do, pandoc handles the disparate formats admirably. The only inconvenience is when I update my system, there's guaranteed to be a huge pile of Haskell libraries to download.
+ Static websites from any input to html
+ Markdown & TeX & References to pdf for academia
+ Generating manpages for new tools
+ Generating ebooks
... Let's just say I get a bit lost when it isn't available.
Do any of your tools use long options (prefixed with a double dash)? If so, make sure you disable the "smart" extension, otherwise you might end up with en dashes.
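To show what goes wrong with long options, here is a small Python sketch. `smart_dashes` is only a rough imitation of the dash handling in the "smart" extension (it is not pandoc's code), and `manpage_command` builds a command that turns the extension off on the input side with `-f markdown-smart`. Both function names are made up.

```python
EN_DASH = "\u2013"
EM_DASH = "\u2014"


def smart_dashes(text):
    """Rough imitation of smart punctuation: '---' becomes an em dash, '--' an en dash."""
    return text.replace("---", EM_DASH).replace("--", EN_DASH)


def manpage_command(source, output):
    """Build a pandoc command for a man page with the smart extension disabled."""
    return [
        "pandoc", "-s",
        "-f", "markdown-smart",   # "-smart" switches the extension off for the reader
        "-t", "man",
        source, "-o", output,
    ]
```

With the imitation above, `smart_dashes("--verbose")` comes back as an en dash followed by "verbose", which is exactly what you don't want someone copying out of a man page.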
For novels, I tend to just use Markdown, as kerning will be done in CSS.
For academics, I use LaTeX and Asciidoc together, but some paragraphs might be inserted in various other formats - whatever is easier. The build tool doesn't care what the format is, it'll take any input pandoc accepts.
I guess it's not as well-known as I thought.
A couple of questions. First, it seems that old-school .doc files are not supported, while docx is. Unfortunately I still get a lot of documents in .doc format, which seems to be Microsoft's proprietary format (docx seems to be more open).
My second question is whether there's a filter for Golang. Most of my development is in Golang, so I either need to call your CLI as a forked process or, better, have a native library. I have never worked with Haskell, so I'm not sure if I can import a Haskell library from Golang directly. I imagine there'd need to be a Golang wrapper around the CLI.
lowriter --convert-to odt some-document.doc
odt is not the only supported target, but doc --libreoffice--> odt --pandoc--> plain seems to give better results than e.g. doc --libreoffice--> txt or doc --libreoffice--> docx --pandoc--> plain.
Pages to anything else, please.
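The doc --libreoffice--> odt --pandoc--> plain chain described above is easy to wrap in a script. This is only a sketch: the function names are hypothetical, it assumes lowriter and pandoc are installed, and it relies on lowriter writing the converted file into the current working directory when no output directory is given.

```python
import subprocess
from pathlib import Path


def doc_to_plain_commands(doc_path):
    """Return the two steps: .doc to .odt via LibreOffice, then .odt to plain text via pandoc."""
    stem = Path(doc_path).stem
    odt = f"{stem}.odt"   # assumed to land in the current directory
    txt = f"{stem}.txt"
    return [
        ["lowriter", "--convert-to", "odt", doc_path],
        ["pandoc", odt, "-t", "plain", "-o", txt],
    ]


def doc_to_plain(doc_path):
    """Run both conversions in order, raising on the first failure."""
    for command in doc_to_plain_commands(doc_path):
        subprocess.run(command, check=True)
```

For example, `doc_to_plain("some-document.doc")` should leave some-document.odt and some-document.txt in the working directory.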
https://orgmode.org/worg/org-tutorials/org-spreadsheet-intro...
Sorry, I couldn't resist.
It's a conversion tool for existing formats.