[1] http://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf
[2] https://christianhaider.de/dokuwiki/lib/exe/fetch.php?media=pdf:pdf32000_2008.pdf
[3] https://web.archive.org/web/20220309040754if_/https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf
Not to mention that ISO, unsurprisingly, hosts it, which I would also consider authoritative: https://www.iso.org/obp/ui/#iso:std:iso:32000:-1:ed-1:v1:en
[0] https://www.loc.gov/preservation/digital/formats/fdd/fdd0002...
It must have been hard guessing at the spec until you could read it properly.
"Although it is an open standard, one major difference compared with prior versions of PDF is that ISO now holds the copyright to the PDF specification and thus PDF 2.0 is not freely downloadable." [0]
It looks like DMCA requests are being issued to anyone who hosted the old specification, even open source projects [1].
[0] https://www.pdfa.org/resource/iso-32000-pdf/
[1] https://github.com/Hopding/pdf-lib#git-history-rewrite
It's funny how CMSes tend to offer "clean URL" configurations (meaning that everything after the origin is 100% controlled by the CMS user) for requests served dynamically (database queries) but requests served statically (public files on disk) often end up containing implementation-specific junk (e.g., "/sites/" in the case of Drupal). The magic that makes clean dynamic URLs (rewrite everything that isn't a file to the boot script) should be expanded to make clean file URLs. Serving files would then need help from a script+db, but so what, that already happens for private files.
Obviously embedded assets that need to be fast (images, stylesheets, scripts, etc.) can't have a slow db query in the way. I'm only talking about files that are a first-class destination in the browser's address bar, like PDFs, and anything where the disposition is that it lands in your Downloads folder. Stuff that might be a search result or otherwise linked-to.
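To make the idea concrete, here is a rough sketch of the routing decision I have in mind. It is not how Drupal or any other CMS actually does it; the `file_alias` table, the `route` function, and the suffix list are all made up for illustration, and a real setup would more likely do the first check in the web server config than in a script.

```python
import sqlite3
from pathlib import PurePosixPath

# Hypothetical list of embedded-asset types that should never touch the db.
FAST_ASSET_SUFFIXES = {".css", ".js", ".png", ".jpg", ".svg"}


def make_alias_db():
    """Create an in-memory table mapping clean URLs to on-disk file paths."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE file_alias (url TEXT PRIMARY KEY, disk_path TEXT NOT NULL)"
    )
    return db


def route(db, url_path):
    """Decide how a request path is served: 'static', 'file', or 'page'."""
    suffix = PurePosixPath(url_path).suffix.lower()
    if suffix in FAST_ASSET_SUFFIXES:
        # Embedded assets: served straight from disk, no query in the way.
        return ("static", url_path)
    row = db.execute(
        "SELECT disk_path FROM file_alias WHERE url = ?", (url_path,)
    ).fetchone()
    if row is not None:
        # First-class destination (e.g. a PDF): the clean URL hides the real path.
        return ("file", row[0])
    # Everything else falls through to the CMS boot script, like clean page URLs.
    return ("page", url_path)
```

With a row mapping `/reports/annual.pdf` to something like `/sites/default/files/2022-03/annual_0.pdf`, the visitor only ever sees the first path, and the implementation-specific one stays internal.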
(I have used that document a lot to write a custom PDF generator and parser in Java, using a downloaded copy)
I wish there was an EPUB version of the document. Do PDFs support reflowable content?
Why anyone would use such a format for these situations, where the audience definitely cares way more about consuming it on an electronic device than printing it out, is... mind-boggling.
Of course, AI+ML to the rescue: Liquid Mode [0].
> Files are processed in our secure data servers and immediately deleted from our servers after the experience is generated.
[0] https://www.adobe.com/devnet-docs/acrobat/android/en/lmode.h...
[1]: Essentially a PDF with its own EPUB inside it, but unlike just having an attached EPUB, there is a map between the page layout of the PDF and the tags.
There are implementations of reconstructive reflowing that infer the layout block structure and reading order and can reflow a two column paper into a single column.
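For a feel of what the reading-order part involves, here is a toy sketch, not taken from any real tool. It assumes text blocks have already been extracted with bounding boxes, that the page has exactly two columns split down the middle, and that `top` is measured from the top of the page (raw PDF coordinates grow upward instead). The `Block` type and `reading_order` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Block:
    x0: float   # left edge of the block
    x1: float   # right edge of the block
    top: float  # distance from the top of the page (assumed convention)
    text: str


def reading_order(blocks, page_width):
    """Return block texts in single-column order: left column first, then right."""
    mid = page_width / 2

    def key(block):
        # Assign a block to a column by where its horizontal center falls.
        column = 0 if (block.x0 + block.x1) / 2 < mid else 1
        return (column, block.top)

    return [block.text for block in sorted(blocks, key=key)]
```

Real implementations have to cope with full-width titles, figures, footnotes, and pages that are not split evenly, which is where the inference gets hard.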
There are some problems with the spec, though, and navigation is not the most pressing one. The spec is huge, and support for its less-used parts is spotty across PDF readers. It also has inaccuracies (not corrected in errata) and underspecified parts.
How so? I frequently reference specific sections, tables or pages of the spec at work.