File Systems: The Original Hypermedia

(jon.work)

79 points · pilgrim0 · 1y ago · 50 comments

> if we always had hypermedia with directories and files, why hasn't the web evolved into a mesh of interconnected file systems?

It kind of has. URLs have the notion of paths, which are obviously strongly associated with the notion of file system hierarchies. People sometimes (sometimes accidentally) put their file systems directly on the web; see Apache's automatic directory listings (mod_autoindex, which can generate a listing when a directory has no DirectoryIndex file), for example.

> Why isn't a website just a remote directory on someone's computer that we can explore via a file browser?

Well, now we are getting into the meat of it. To be hypermedia requires the presence of hypermedia controls in a medium. Hypermedia controls can be as simple as links, but the web introduced more sophisticated controls such as forms, and allowed HTML authors to specify more significant interactions beyond click-to-link.

IMO the uniform interface is the most interesting aspect of hypermedia, and that really emerged post Web. I like the author's concept of a file explorer enhanced with hypermedia ideas though and would be interested to see more details on it.

grumbel · 1y ago

The directory support you get with HTTP is rather terrible. It works OK for a human browsing a directory manually, but if you want to actually download one it becomes a mess, since you can't tell if "index.html" is a file or just something Apache generated on the fly. There is no "list directory" command in HTTP (there is PROPFIND in WebDAV) and there is no "download directory" in your browser; you have to fiddle with wget and friends.
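For reference, a WebDAV directory listing is a PROPFIND request sent with a `Depth: 1` header and answered with a `207 Multi-Status` XML body. Below is a minimal Python sketch of reading such a body. It assumes a hand-written sample response (the `/pub/` paths are made up) and has not been checked against any particular server.

```python
import xml.etree.ElementTree as ET

DAV = "{DAV:}"  # WebDAV elements live in the "DAV:" XML namespace

# Hand-written sample of a 207 Multi-Status body; the paths are made up.
SAMPLE = """<D:multistatus xmlns:D="DAV:">
  <D:response>
    <D:href>/pub/</D:href>
    <D:propstat>
      <D:prop><D:resourcetype><D:collection/></D:resourcetype></D:prop>
      <D:status>HTTP/1.1 200 OK</D:status>
    </D:propstat>
  </D:response>
  <D:response>
    <D:href>/pub/index.html</D:href>
    <D:propstat>
      <D:prop><D:resourcetype/></D:prop>
      <D:status>HTTP/1.1 200 OK</D:status>
    </D:propstat>
  </D:response>
</D:multistatus>"""


def list_entries(multistatus_xml):
    """Return (href, is_directory) pairs from a PROPFIND response body."""
    root = ET.fromstring(multistatus_xml)
    entries = []
    for response in root.iter(DAV + "response"):
        href = response.findtext(DAV + "href")
        # A directory ("collection") is marked by <collection/> in <resourcetype>.
        is_dir = response.find(".//" + DAV + "collection") is not None
        entries.append((href, is_dir))
    return entries


if __name__ == "__main__":
    for href, is_dir in list_entries(SAMPLE):
        print("dir " if is_dir else "file", href)
```

Unlike an HTML index page, the response says explicitly which entries are directories, so a client does not have to guess whether "index.html" is a real file.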

It's one of the things I love about IPFS, it has native directories along with fuse services, so you can just `cd /ipfs/...` and browse around. That has a lot of beautiful side effects in that you no longer need .zip to package directories and for a lot of things you no longer even need to download anything, you just access them directly via your file system.

Especially with FTP being removed from browsers, we could really use a proper official successor that can act as an online file system.

6510 · 1y ago

How many forms do you need? It should be 10, 50 or 5 million standard forms and free-form ones that could be required to request standardization.

I once tried to make a form that works with iPhone autofill just for name and address fields. Apple insists on putting the house number at the end of the street name. House numbers may have multiple letters and streets are named after people who may have a single-letter second name. Boulevard Peng Da nr 1 vs Boulevard Peng nr Da 1. This doesn't work.

During corona I needed a document to work my night shifts. I received a PDF with some form fields. After filling those out the document changed itself into a static permit. I had never seen such a thing before.

When writing blog software I ponder organization a good bit. My conclusion was that depending on the size, quantity and nature of the writings different methods would work best. However, as it grows it is hard to tell which one is currently the best formula, and it is very easy to maintain the full set, which I define as: hierarchical tree, categories, tags and search.

How to do the hierarchical tree is obvious.

A limited number of Categories that should be defined in advance (if possible)

You should have as many tags as possible. This comment could be tagged #history #writing #organizing #ideas #search #link #interface etc. The user can be exposed to a small subset of sufficiently populated tags. The tag pages can be sorted by the amount of tags per word.

Search is just search, but it could have topical filters and use all of the aforementioned. It could also take one (or more) articles as queries (and list the results under the text).
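The tag part of that set can be sketched in a few lines of Python. This is only an illustration under assumptions: the post records, field names and the `min_posts` threshold are hypothetical, and "sufficiently populated" is read here as "used by at least `min_posts` posts".

```python
from collections import defaultdict

# Hypothetical posts: a path in the tree, one category, and free-form tags.
POSTS = [
    {"path": "2024/forms.md", "category": "web",
     "tags": ["forms", "interface"]},
    {"path": "2024/blog-layout.md", "category": "writing",
     "tags": ["organizing", "search", "interface"]},
    {"path": "2025/torrents.md", "category": "web",
     "tags": ["p2p", "search"]},
]


def build_tag_index(posts):
    """Map each tag to the list of post paths that carry it."""
    index = defaultdict(list)
    for post in posts:
        for tag in post["tags"]:
            index[tag].append(post["path"])
    return dict(index)


def populated_tags(index, min_posts=2):
    """Tags with at least `min_posts` posts, most used first, ties alphabetical."""
    kept = [tag for tag, paths in index.items() if len(paths) >= min_posts]
    return sorted(kept, key=lambda tag: (-len(index[tag]), tag))


if __name__ == "__main__":
    print(populated_tags(build_tag_index(POSTS)))
```

Every tag stays in the index; only the well-populated ones are surfaced to the reader.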

How and when to stitch on an LLM I don't know.

pilgrim0 (OP) · 1y ago

> How many forms do you need?

You just need to rethink the form, which is also an inheritance from print media, as a possibly fuzzy dialogue. Incorporating multimodal input, extracting information from uploads, etc., is a huge accessibility win. This is totally achievable right now, it’s just a matter of viewing the "form" as an interaction designed to fulfill a predetermined goal. What actually complicates forms is handling conditions, state tracking, and synchronization, which can become arbitrarily complex. This sort of dynamic behavior can never be fully standardized. So special cases are always going to need special solutions.

> Hierarchical tree, Categories, tags and search.

I think it's important to highlight the distinction between exploration and retrieval. Tags and categories are perfect for powering indices and lists, enabling opinionated exploration interfaces. However, when it comes to open-ended retrieval, arbitrary conventions don't really help. Retrieval requires the searcher to have some prior information on patterns, which are media dependent.

To make the retrieval problem more manageable, information needs to be structured in a predictable manner. Patterns are powerful tools for enabling and enforcing memorization. For example, if I want to retrieve a specific academic paper, my search model will align with the typical information pattern of academic papers. This pattern encourages me to memorize key metadata, such as the publication year, author names, institutional affiliations, and keywords in the title. These elements form the standard metadata body for this type of content.

ianburrell · 1y ago

If you think that a file system is a viable approach to a web site, how would you implement Hacker News? Hacker News is one of the old-school static-rendered sites. It may not even use a database.

But it needs to be dynamic. The ordering on the front page is dynamic. The vote counts are dynamic. What does the voting? These are all easier with a database whose values you query and then render into the page.

The other issue is adding new content. How would someone post a comment? How and where does it get written? How do you make sure they write to the right place? How do you make it as easy to use as typing in a box and hitting a button?

Dynamic responses are the special sauce of the web; they are why it is a success. Without them, it would be a good-looking Gopher.

pilgrim0 (OP) · 1y ago

> If you think that a file system is a viable approach to a web site, how would you implement Hacker News?

Hacker News is a web application, with a client-server architecture. Indeed, it would be impractical to replicate with the file system model.

Still, I think it would be useful to have standardized hypermedia documents. It would allow for content that's naturally multimedia to be much more easily handled and distributed. I find it super weird that we need to create a 'website' just to have a multimedia, responsive document. It practically must be hosted on a server because nobody sends HTML around. Mind you that I used the file system just as a metaphor for what an offline-first hypermedia document model could look like.

ianburrell · 1y ago

One of the huge advantages of the web is that there is no distinction between static web sites and web applications. The backend implementation can change per page based on what is needed.

The remote filesystem needs a server. HTTP is a much better protocol for serving multimedia documents. The remote file system protocols are slow because they are based on RPC and make lots of requests.

It is dangerous to use a filesystem for sharing because you have to carefully limit write access. None of the current filesystem protocols do that well.

If you were designing something new, it would probably end up like WebDAV or S3, an object store that can be used like a file system. There is a place for distributed storage, like IPFS but better, but that wouldn't be a file system.

There is a standard for hypermedia documents: HTML. There were attempts to make it more semantic, like XHTML 2.0 or using RDF, and they failed. There were attempts to transform XML with XSL, and those failed too.

People send HTML around, they upload it to servers like I did a few days ago. But HTML is usually generated, for me because Markdown is easier to edit and can use templates.

6510 · 1y ago

With torrents the number of seeds and leechers does something similar. It actually tells you more, as crap is deleted rather than seeded. Most of HN is frozen archives. That you can still vote on things you may not comment on isn't all that useful. HN (while huge) is a collection of things that are easy to do on a centralized platform. It could be much more complicated but simplicity is valuable.

I would have to think a bit to come up with something that could reasonably match the mature centralized architecture. First thought would be that if I like your comment I could choose to seed your most recent GB of comments. They would load faster for everyone with each person who seeds them.

pilgrim0 (OP) · 1y ago

You provided a great example of using the conscious choice to seed as a way to signal an upvote. What better way to promote something other than the commitment to redistribute the content? In the context of P2P, it occurred to me that it's also possible to design a fully decentralized comment system using a naming convention. Each comment would be a torrent named after a given scheme, making it possible to track and discover comments related to another torrent. Since seeding means upvoting, then the best comments would be the most seeded ones.
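A rough Python sketch of that idea follows. Everything in it is an assumption made for illustration: the `re:<parent>:<comment>` naming scheme is invented here and is not part of BitTorrent, and the seeder counts are passed in as a plain dict rather than fetched from a tracker or DHT.

```python
COMMENT_PREFIX = "re:"  # hypothetical naming convention, not a BitTorrent feature


def comment_name(parent_id, comment_id):
    """Name a comment torrent after the torrent it replies to."""
    return f"{COMMENT_PREFIX}{parent_id}:{comment_id}"


def parse_comment_name(name):
    """Return (parent_id, comment_id), or None if `name` is not a comment."""
    if not name.startswith(COMMENT_PREFIX):
        return None
    parent_id, sep, comment_id = name[len(COMMENT_PREFIX):].partition(":")
    if not sep or not parent_id or not comment_id:
        return None
    return parent_id, comment_id


def ranked_comments(swarm, parent_id):
    """Comment ids replying to `parent_id`, most seeded first.

    `swarm` maps torrent names to seeder counts; how those counts are
    discovered is outside this sketch.
    """
    hits = []
    for name, seeders in swarm.items():
        parsed = parse_comment_name(name)
        if parsed and parsed[0] == parent_id:
            hits.append((seeders, parsed[1]))
    hits.sort(key=lambda hit: (-hit[0], hit[1]))
    return [comment_id for _, comment_id in hits]


if __name__ == "__main__":
    swarm = {comment_name("abc", "c1"): 3, comment_name("abc", "c2"): 10}
    print(ranked_comments(swarm, "abc"))
```

The seeder count plays the role of the vote count, so ranking needs no central tally.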

pilgrim0 (OP) · 1y ago

Let me give a use case where an offline-first hypermedia document model would be useful. I used to work as an instructional designer, making art and photography courses. This sort of material requires a lot of media, practically of all kinds. You either have to know how to code or use some platform to build that kind of thing. In either case, there'll be friction in reconciling the data source and the final document/volume. If you use a managed solution like a platform, you'll be locking away the content on their servers. Exporting the raw data does nothing because it'll lose all structure; that's basically what happens when you try to export from Notion, for instance. Yes, you can export, but it's lossy, because available offline formats, even Markdown, cannot express the structure and ergonomics available within the authoring environment. There's a disconnect between source and presentation. You'll be left with a scattered set of files. The other option is not much better: coding it yourself. You'll need a CMS, and your data will be either JSON, MD, XML, or worse, tabular within an SQL database. Then you'll have to develop a build system or make some sort of SPA. And you'll need to set up a web server and configure it to distribute the content. This is absurdly complex. It makes producing such materials very expensive. You just can't distribute your course offline. You can't back it up easily, either.

Compare that with print media. Want to make a memo? Fire up whatever word processor, write it down, export to PDF, share anywhere. Done. If hypermedia was easy like this, do you have any doubt that people would put it to good use? Do you realize how PDF might be overused simply because of the durability, simplicity and fidelity traits it has?

It's not a law of nature that hypermedia should exist only in the context of browsers. Nor that it has to use HTML, CSS and JS.

ianburrell · 1y ago

You are going to need code anyway. Something needs to render the structured data into the format that the browser can understand. People tried to do that with structured markup and languages like XSL. But now we do it with JavaScript and JSON.

Your example could be done with a directory of media files and metadata in JSON. JavaScript generates pages from the JSON file. Browsers can load pages straight from the filesystem. It would need to be static, since updating the JSON safely is hard because filesystems don't provide atomic guarantees.

If you wanted it writable or larger, you would need a web server running an API for accessing something like SQLite as the database. Running a web server is super easy these days: a single command.

Instead of thinking about how you would remake hypermedia, come up with a JSON metadata schema that can be dropped in a directory of multimedia. Then have JavaScript render it. There are lots of people, me included, that would like that for our music and image libraries.
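To make the suggestion concrete, here is a minimal sketch of the "JSON metadata plus a renderer" approach. It is written in Python rather than JavaScript purely for illustration, and the schema (a `title` plus an ordered `items` list with `type`, `body`, `src` and `alt` fields) is a made-up assumption, not an existing standard.

```python
import html
import json

# Hypothetical schema: a title plus an ordered list of media items.
SAMPLE_METADATA = """
{
  "title": "Lesson 1: Exposure",
  "items": [
    {"type": "text", "body": "Aperture & shutter speed"},
    {"type": "image", "src": "media/histogram.png", "alt": "Histogram"},
    {"type": "audio", "src": "media/intro.mp3"}
  ]
}
"""


def render_item(item):
    """Turn one metadata item into an HTML fragment, escaping all values."""
    kind = item["type"]
    if kind == "text":
        return "<p>" + html.escape(item["body"]) + "</p>"
    if kind == "image":
        return '<img src="%s" alt="%s">' % (
            html.escape(item["src"]), html.escape(item.get("alt", "")))
    if kind == "audio":
        return '<audio controls src="%s"></audio>' % html.escape(item["src"])
    raise ValueError("unknown item type: %r" % kind)


def render_page(metadata_json):
    """Render the whole document as one static HTML page."""
    data = json.loads(metadata_json)
    title = html.escape(data["title"])
    parts = ["<!doctype html>", "<title>%s</title>" % title, "<h1>%s</h1>" % title]
    parts.extend(render_item(item) for item in data["items"])
    return "\n".join(parts)


if __name__ == "__main__":
    print(render_page(SAMPLE_METADATA))
```

Because the items are an ordered array, the page order comes from the metadata file and not from directory listing order.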

robertlagrant · 1y ago

> It's not a law of nature that hypermedia should exist only in the context of browsers

It is sort of a law of hypermedia that different resources are different files, though, isn't it? I guess you could make a file format that embeds images and fonts and HTML in it, but why bother when we have PDF?

moritz · 1y ago

> You would create and manage content directly from the file explorer application, in the most natural way possible. This version of the web wouldn’t require users to learn advanced computer skills in order to participate.

My students at university (Gen Z) have no concept of the “file system”.

dialup_sounds · 1y ago

That's not even a generational thing. People have been (e.g.) saving everything to the desktop for as long as there have been desktops. "Managing files" has always been a subsidiary task to the things people wanted to use computers for.

pilgrim0 (OP) · 1y ago

Inline, ordered multimedia is the backbone of all consumer information systems. So your students have internalized the archetypal equivalent of file systems through a different vocabulary, such as tweets (for files) and threads (for directories).

esafak · 1y ago

Do they not save files on their personal computers (phones, laptops)?

TheNewsIsHere · 1y ago

A lot of folks in that demographic are what you might call “cloud natives”. Their hard drives are used for storing the software that connects them to Google Drive or OneDrive or what have you.

We grew up in a time when understanding file systems in terms of “a system for organizing your files” was not optional. Gen Z has grown up in a time when their data was a Google Drive search away from their fingers.

andyferris · 1y ago

I found it took a while to get to the point.

In the end, I actually agree with this! I have also been thinking about filesystems, which are trees of dictionaries (directories) and blobs (files), and that there are many other examples of tree-like data. Data structures in our programs are tree shaped, perhaps with references/pointers to other parts of the tree. JSON is a tree of dictionaries (objects), arrays, and data (the primitive string/number/null). The arrays are ordered, much like the content inside an HTML/XML block.

I agree that adding an "array" style of directory to our file systems would be really cool. I've been toying with the idea of writing a FUSE driver that holds some structured data (possibly including arrays) and just converts the (integer) index into a string. The idea is that you could e.g. view and edit some JSON tree with the file explorer. And not just JSON - basically any piece of structured data that we have in our programs can be "viewed" as a FS this way (e.g. just convert structs into directories of fields). It could even be a pretty cool and universal debugger - a breakpoint could make the program pause and serve a FUSE driver and later continue when it is unmounted :)
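The index-to-string mapping can be shown without any FUSE code at all. The sketch below, in Python, only flattens a JSON tree into the paths such a driver might expose; the function name and the sample document are hypothetical, and a real driver would also have to deal with empty containers and keys that contain "/".

```python
import json


def tree_to_paths(node, prefix=""):
    """Flatten nested dicts/lists into {path: leaf value}.

    Dict keys become directory entries as-is; list items are named by
    their integer index converted to a string ("0", "1", ...).
    """
    if isinstance(node, dict):
        children = node.items()
    elif isinstance(node, list):
        children = ((str(i), value) for i, value in enumerate(node))
    else:
        # A primitive value plays the role of a file.
        return {prefix or "/": node}
    paths = {}
    for name, child in children:
        paths.update(tree_to_paths(child, prefix + "/" + name))
    return paths


if __name__ == "__main__":
    doc = json.loads('{"title": "notes", "items": [{"kind": "text"}, 42]}')
    for path, value in tree_to_paths(doc).items():
        print(path, "=", repr(value))
```

Note that plain string names lose the array ordering once there are ten or more entries ("10" sorts before "2"), which is one reason an explicit ordered directory type is attractive.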

And yes, exactly what follows from this is some program could be written to open some "directory" and render a "document" based on the contents. The filesystem supports links, so we have the "web" like experience.

The "document" angle does require adding a kind of directory with ordered array semantics rather than dictionary/map semantics. It's the first missing ingredient listed in the article. Though some filesystems use sorted dictionaries (b-trees or whatever) for directory maps so you could maybe hack this ordered semantics in that way.

The second missing ingredient listed in the article is the hypermedia part. I mean my computer is actually OK at inferring if a file is a movie or photo or text document, so we kind of have a way of dealing with that, too. The "blob of bytes is a narrow waist" thing is quite powerful. That said, sum types could be useful to demarcate different kinds of "stuff", and there's no reason an implementation of this idea couldn't support sum types as part of its data model.

pilgrim0 (OP) · 1y ago

> I found it took a while to get to the point.

You're just a great understandeour

> I agree that adding an "array" style of directory to our file systems would be really cool.

I think this is sort of a low-hanging fruit people have slept on. We've proved list-based systems are extremely versatile for data structures (s-expr) and programming (lisps). What about for media in general? Everywhere I look I just see lists, with very minor stylistic distinctions between them. Of course there're abysmal infrastructural differences between chats, feeds and what not, but it does not invalidate a universal list-based frontend, similar to what you developed in your comment.

andyferris · 1y ago

True.

I didn't really think about (linked) lists much, only (flat) arrays, but maybe that's possible already with files laid out like:

/head
/next/head
/next/next/head
... etc

or whatever.

I'm not saying it's _ergonomic_, mind you. I'd like my file viewer to lay out the list flat, for example. There might be technical limitations with the length of the file path. Etc.
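That layout works on an ordinary filesystem today. A small Python sketch, with hypothetical helper names, that writes a list as nested `next` directories and reads it back:

```python
import os
import tempfile


def write_list(root, items):
    """Store items as root/head, root/next/head, root/next/next/head, ..."""
    node = root
    for item in items:
        os.makedirs(node, exist_ok=True)
        with open(os.path.join(node, "head"), "w", encoding="utf-8") as f:
            f.write(item)
        node = os.path.join(node, "next")


def read_list(root):
    """Follow the chain of "next" directories until there is no "head" file."""
    items = []
    node = root
    while os.path.isfile(os.path.join(node, "head")):
        with open(os.path.join(node, "head"), encoding="utf-8") as f:
            items.append(f.read())
        node = os.path.join(node, "next")
    return items


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        write_list(root, ["first", "second", "third"])
        print(read_list(root))
```

Each element adds one path component, so the path length grows linearly with the list, which is where the path-length limits mentioned above would eventually bite.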

syberant · 1y ago

I'd argue that this concept for media is most commonly known as "playlists" and unfortunately only used within data silos, e.g. a playlist of YouTube videos, a Spotify playlist, a TV season of episodes, a series of episodes, a trilogy, etc. (Yes, I'd argue that your music library of mp3 files is also kind of a silo, although portable) Heck, even a slideshow is arguably a playlist.

I agree that putting a playlist-like concept into, say, the filesystem would be an extremely interesting idea but I think a big danger is running into the same problem as hardlinks and symlinks. This problem is that if a file is "present" in multiple places (or playlists) deleting/modifying/moving it can have unforeseen consequences and it's hard to reason about (and if you copy the file now you get to invent a way to track different versions too!). I think this is also holding tagging filesystems back.
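The hard link version of that problem is easy to reproduce. The Python sketch below uses made-up file names and assumes a filesystem that supports hard links; it shows an in-place edit through one name changing what the other name sees, and a delete through one name leaving the data alive under the other.

```python
import os
import tempfile


def hardlink_demo(workdir):
    """Return (content seen via original name, link count, other name survives)."""
    original = os.path.join(workdir, "song.txt")
    in_playlist = os.path.join(workdir, "playlist-entry.txt")

    with open(original, "w", encoding="utf-8") as f:
        f.write("version 1")
    os.link(original, in_playlist)  # a second name for the same inode

    # Editing in place through one name changes what the other name sees.
    with open(in_playlist, "w", encoding="utf-8") as f:
        f.write("edited via playlist")
    with open(original, encoding="utf-8") as f:
        seen_via_original = f.read()
    link_count = os.stat(original).st_nlink

    # Deleting one name does not delete the data; the other name still works.
    os.remove(original)
    still_there = os.path.exists(in_playlist)
    return seen_via_original, link_count, still_there


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        print(hardlink_demo(workdir))
```

An editor that saves by writing a new file and renaming it over the old one would instead break the link silently, which is the "different versions" problem in miniature.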

I'm currently writing a non-hierarchical FUSE filesystem and have been thinking about this list-directory concept but I'm still not completely sure how it would work, especially since I need to remain backwards compatible with the POSIX interfaces. Will probably have to just try it out (xattrs to the rescue?) and see what sticks I suppose...

A linkdump of interesting somewhat related stuff:

- https://newsletter.squishy.computer/p/knowledge-structures

- https://newsletter.squishy.computer/p/all-you-need-is-links

- https://thesephist.com/posts/search-vs-nav/

- https://karl-voit.at/2017/02/10/evolution-of-systems/ Especially "Information-Centric Systems"

- https://www.theatlantic.com/magazine/archive/1945/07/as-we-m... A classic 1945 article cited as inspiration by Ted Nelson, Doug Engelbart and Tim Berners-Lee.

- https://en.wikipedia.org/wiki/Content-addressable_storage

- https://www.nayuki.io/page/designing-better-file-organizatio...

andyferris · 1y ago

I should probably have mentioned that there already exist FUSE-based JSON FS mounters, for example this one is 13 years old:

https://github.com/calebcase/jsonfs

xnx · 1y ago

A usable equivalent of the file system is sorely missing from the web. Every email address should come with a place to publicly share files. It could be as easy as https://user@emaildomain.com/

mongol · 1y ago

That is an http basic auth URL.
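This is visible in how a standard URL parser splits the two forms. A short Python sketch using the standard library's `urllib.parse` (the `describe` helper is just for illustration):

```python
from urllib.parse import urlsplit


def describe(url):
    """Split a URL and report which part ends up as credentials vs. path."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "username": parts.username,  # the "userinfo" before the @, if any
        "host": parts.hostname,
        "path": parts.path,
    }


if __name__ == "__main__":
    print(describe("https://user@emaildomain.com/"))
    print(describe("https://emaildomain.com/~user/"))
```

In the first URL "user" is parsed as a login name for the host, not as a location on it; in the second it is part of the path.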

Borg3 · 1y ago

Oh really? It seems people forgot about userdir.path (or similar). So a user can expose whatever he wants via: https://emaildomain.com/~user/

xnx · 1y ago

Exactly. This convention is virtually nonexistent on the web in 2025.

jedi3335 · 1y ago

Reading this I couldn't help but imagine an alternate universe where Gopher won out in the early 90s, but with a more flexible presentation layer. Great writeup.

pilgrim0 (OP) · 1y ago

What a dream this would have been

p_ing · 1y ago

I struggle to parse this with every paragraph surrounded by a border. It feels unnatural and extremely distracting.

To the author's take of using a file system as an interconnected 'web', we have networked file systems today, typically clustered though.

We've also had the _concept_ of a media-rich file system, like WinFS [as an overlay to NTFS], which was dead before it was alive due to the WWW.

Networking file systems is _complex_. All vendors would need to agree to a common export model on top of their preferred file systems. Or users would need a specialized partition/overlay developed just for this purpose.

FSes are great, but they're not fit for WWW. Without a control plane, they lack any tooling that makes the WWW better -- redirection, access control, programmatic execution of content (ASP.NET, PHP, CGI ...), etc.

Ultimately this would be a complex solution. Just like many don't simply "open up" their web server to any and all traffic to any and all content, a file system would need to be carefully partitioned the same.

The time for file systems as the vehicle for WWW content is long since past. We have better ways to do things, better caching mechanisms, better performance [through CDNs], better security mechanisms, and so on.

...not to mention, I certainly don't want to open my personal computer's file system up to the Internet.

There would have to be a big leap in evolution of file systems across all major operating systems for the author's dream to come true. I would certainly be excited to see it, but we're talking about allocating _talented_ developers to create a new file system and certainly an open source file system. Like many file systems, it would take years to become a trusted file system to host any content of value.

In the meantime, the author can always investigate WebDAV. Slower than dog shit, but it's available with every major web server.

pilgrim0 (OP) · 1y ago

Sorry for the poor experience with the current design, still experimenting.

I cannot disagree with you, you’re on point on everything when considering the file system as an OS component.

But if we entertain the thought of file system as a document model, or as a transactional data structure, it should come naturally that we can piggyback on the modern infrastructure, at the application level, to achieve the desired qualities.

This very website is an experiment on how this could be done. The main takeaway from my research is that we have much to gain if we leave presentation and layout concerns out of hypermedia documents, letting the client software decide on them, like we do with our editors and IDEs, choosing the theme and font we like; the information is the same no matter what. To abandon the fetishism inherited from print media and to transact pure data is to make the web democratic. That’s precisely the recipe used by all social networks: standardized, systemic presentation of schematic payloads following a given ontological model. We need only to copy them with an open model.

p_ing · 1y ago

> Sorry for the poor experience with the current design, still experimenting.

Never stop experimenting.

> file system as a document model, or as a transactional data structure, it should come naturally that we can piggyback on the modern infrastructure, at the application level, to achieve the desired qualities.

I'm having a little difficulty grokking what you're trying to say here. Can you explain it like I'm stupid? My mind immediately bounced to SQL storing binary data or a document DB of some sort.

> if we leave presentation and layout concerns out of hypermedia documents, letting the client software decide on it

Every browser engine does this today. Every engine developer has their own idea of how a standard should be implemented. Granted, this isn't end-user choice, which is probably what you're after. Well, sorta, you have some customization of fonts and [link] colors.

> To abandon the fetishism inherited from print media and to transact pure data is to make the web democratic.

It sounds like you would like some simple scheme that provided you the text and binary data (images) of the website which allowed you to manipulate them into the "newspaper" layout of your choice. Am I getting that correct?

pwg · 1y ago

> Sorry for the poor experience with the current design, still experimenting.

It also is simply a blank page when browsed with uBlock Origin blocking all the JavaScript from executing.

mickael-kerjean · 1y ago

> WebDAV. Slower than dog shit, but it's available with every major web server.

You lost me there. WebDAV is nothing more than HTTP calls with some XML data, with a slightly different syntax than the S3 API. There is nothing fundamental in the spec that forces the protocol to be "slower than dog shit" as a file transfer protocol. Please prove me wrong with an argument other than: "the particular server implementation I tried was dog shit"

groby_b · 1y ago

> This version of the web wouldn’t require users to learn advanced computer skills in order to participate.

The web doesn't require "advanced computer skills". (Unless you use non-flexbox CSS alignments ;) It is fairly trivial to create basic HTML files. SSG + MD have removed a lot of the remaining obstacles. Most web sites are structured files, just with a "compiler" and possibly a database to store the files.

But what they still do require is the ability to reason about structured data and its best configuration. And that is the truly hard problem, ever since Ted Nelson first talked about it.

It also requires us to reason about how to best make that data consumable for humans. It doesn't just magically "arise from the structure", as much as I wish it did. The web site is a clear example - the lack of understanding how humans consume info, and what helps/hinders, leads to odd boxes around each paragraph.

I still agree with the fundamental idea. The more structure we can encode in an easily graspable way, the easier it becomes to impose structure.

But even then, the fundamental advantage of the web over hierarchical file systems is the non-linearity. And yes, correct, hierarchies matter, but the fundamental point the article misses is that there isn't just _one_ hierarchy. Wikipedia is a great example here - it fundamentally cannot be expressed in a meaningful way as a tree, even though it has many hierarchies.

And hierarchies alone are insufficient. We've now learned, thoroughly I think, that hierarchical taxonomies always break down. If we're given to snark, Linnaeus took a good stab, he failed. In more practical terms, the emergence of "tags" has shown that we need a way to have non-hierarchical cross-cutting data.

I think for a discussion of the subject, there's value in separating a few topics:

* Presentation. The author is right, HTML made a grave mistake including that

* Local representation. Again, agreement here, giving a file system structure that allows to infer meaning for later presentation is super helpful. (See point about SSG/MD)

* Organization/Navigation: Any sufficiently complex set of data requires several separate overlaid structures to help humans navigate.

* Human psychology: We're bad at thinking about relation schemes beyond trees & grids. That means our organization schemes need to mirror them at least partially so we don't break our heads. A corollary is that any sufficiently complex set of data needs searchability.

There's probably more. It's a topic that's been brewing in my head for a while, you're getting a very rough first draft, sorry :)

pilgrim0 (OP) · 1y ago

> It is fairly trivial to create basic HTML files. SSG + MD have removed a lot of the remaining obstacles

These are advanced computer skills IMO

> It doesn't just magically "arise from the structure", as much as I wish it did. The web site is a clear example - the lack of understanding how humans consume info, and what helps/hinders, leads to odd boxes around each paragraph

I appreciate the criticism. It's impossible to please everyone in terms of design, and I think your antipathy towards this particular style agrees with the general premises.

Regarding whether disposition arising from structure is desirable or not, I think it's a matter of culture and habit. The time and complexity savings for authoring and publishing afforded by this model, for me, satisfactorily offset whatever could be said that it misses in the aesthetic or functional department, which can always be patched and improved. The positive feedback I had from interested users, all of them tech-illiterate, is what gave me the confidence to pursue investing in the research, and also made me realize that my insecurities towards its acceptability, which stemmed from sentiments quite similar to what you put forth as criticism, were mere whims. As far as experience and perceptions can be trusted, I believe serial multimedia has been proved as a viable format.

> And hierarchies alone are insufficient.

Not disputing that. The fact the document model is hierarchical does not mean the document system has to be. In fact it was never planned to be. There are many mechanisms in place affording hypernavigation, down to the design of the in-memory representation. Just haven't been implemented for lack of resources.

> you're getting a very rough first draft, sorry :)

I'd love to hear more. Feel free to ping me anytime if this is a subject you find exciting to discuss!

groby_b · 1y ago

> These are advanced computer skills IMO

To set up? Yes. Absolutely. But that's equivalent to asking users to install a filesystem before using the machine :) Past installation, it's "write text. It shall appear"

Nobody needs to hand-edit HTML any more. It has become, for better or worse, the assembly language of information design ;)

> The time and complexity savings for authoring and publishing afforded by this model, for me, satisfactorily offset whatever could be said that it misses in the aesthetic or functional department

I'm curious. Are you saying the boxes are helping you author/publish (in which case, please say more!), or am I wildly misunderstanding?

> As far as experience and perceptions can be trusted, I believe serial multimedia has been proved as a viable format.

Absolutely. But it doesn't create meaning by itself. It's merely a well-understood and simple way of organizing information. (I should've added "linear" to "tree" and "grid")

And just having a pre-defined structure doesn't give meaning in general. You'll need to conform to it, and you need to deal with the parts that just won't conform. (soft/hard links exist to satisfy a need, if we want to go back to file systems)

> I'd love to hear more. Feel free to ping me anytime if this is a subject you find exciting to discuss!

I just might take you up on that ;) The topic's near and dear to my heart. (Alas, it is not my main occupation, so... feel free to ping as well. I might fall off the face of the earth from time to time ;)

recursivedoubts · 1y ago