The oldest work I was aware of is "The design and implementation of a log-structured file system" (1)
So it is with pleasure that I learned that these ideas were around in the 80s:
- Deletion considered harmful
- A non-deletion strategy using timestamps
- The importance of accessing past data
- A non-deletion strategy can improve both integrity and reliability
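To make the list above concrete, here is a minimal sketch of a timestamped non-deletion store. It is not taken from the paper: the `VersionedStore` class, its method names, and the logical clock standing in for real timestamps are all assumptions made for illustration. "Deleting" appends a tombstone, so earlier versions stay readable.

```python
import bisect
import itertools


class VersionedStore:
    """Append-only key/value store: every write adds a (timestamp, value) version."""

    _TOMBSTONE = object()  # marks a logical deletion; older versions remain readable

    def __init__(self):
        self._versions = {}               # key -> list of (timestamp, value), oldest first
        self._clock = itertools.count(1)  # logical clock (assumption: stands in for wall time)

    def put(self, key, value):
        ts = next(self._clock)
        self._versions.setdefault(key, []).append((ts, value))
        return ts

    def delete(self, key):
        # Nothing is destroyed: a tombstone version is appended instead.
        return self.put(key, self._TOMBSTONE)

    def get(self, key, as_of=None):
        """Return the value visible at timestamp `as_of` (the latest if None)."""
        history = self._versions.get(key, [])
        if as_of is None:
            idx = len(history)
        else:
            timestamps = [ts for ts, _ in history]
            idx = bisect.bisect_right(timestamps, as_of)
        if idx == 0:
            return None  # key did not exist yet at that time
        value = history[idx - 1][1]
        return None if value is self._TOMBSTONE else value


store = VersionedStore()
t1 = store.put("report", "draft")
t2 = store.put("report", "final")
t3 = store.delete("report")
```

After this, `store.get("report")` returns `None`, while `store.get("report", as_of=t1)` still returns the draft, which is the "accessing past data" point from the list.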
Many people were thinking about these ideas in the 88-92 timeframe, though. Tape storage systems are, roughly speaking, append-only, so lots of the ideas behind a log-structured filesystem are about the increased random reads you get from disk drives.
(in particular, "new master = old master + updates" card/tape jobs were in principle append-only but, due to the finite number of tapes, in practice overwriting)
The 1980 paper you linked is touched on briefly at the beginning of this Strange Loop talk on "Light and Adaptive Indexing for Immutable Databases (2022)": https://www.youtube.com/watch?v=Px-7TlceM5A
The idea was to reduce the cost of storage by removing long-term data from costly hard disks and storing it on cheap magneto-optical disks which, like CDs, could be stored in an automated jukebox. Write all the data you want to the cache, then commit to WORM. As the WORM fills, you just buy another disk and put it in the jukebox. The history(1) command then gives you a file's history as a set of paths you can bind over another path to use an old version of a file instead of copying it. It's really a file system for programmers. http://doc.cat-v.org/plan_9/4th_edition/papers/fs/
This idea was expanded on with Venti/Fossil, which allows you to build file systems from arbitrary Venti data sets. http://doc.cat-v.org/plan_9/4th_edition/papers/venti/
Wild view from where we sit today, but CDs were ~700MB in 1982. Seagate launched a 5MB hard drive in 1980, so... not entirely absurd to think that `just don't delete things` could be the way of the future. We sorta adopted `just don't delete things` anyway, though not with respect to RDBMSs.
Thanks for sharing!
In PC Magazine from July 1988 there is an advert for a 15MHz XT for $575 with an optional 30MB Seagate ST238 5.25" SCSI hard drive inside for an extra $295 [0]
The price hasn't dropped much since: it's now $206 for the drive [1]
[0] https://archive.org/details/PC-Mag-1988-07-01/page/22/mode/2...
[1] https://www.amazon.com/ST238R-Seagate-3600RPM-Internal-Drive...
> In the 1980's we used 14 inch drives in our DEC VAX cluster. Each 14 inch drive had a capacity of about 450MB
Those platters were definitely the 14" size, and there was more than one of them in the refrigerator-sized unit. At this temporal distance, I can't guess what the overall size was anymore, but it was clearly not 1MB.
Sorry for the misleading post. Still, it was quite a day, regardless.
Bonus: I got an RLL controller and turned it into a 30MB hard disk! Couldn't believe it. But getting the interleaving right was time-consuming.
3.5" HDs of over 20 MB for IBM PCs were around at the time.
https://i.pinimg.com/originals/0d/b3/b6/0db3b67dcdd2edbedd5c...
Classic!
Btw: You need about 12 TB for a 1 year video stream at 3 Mbit/s, so it's certainly doable, but it's not cheap.
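For anyone who wants to check the 12 TB figure, a quick back-of-the-envelope calculation, assuming decimal units (1 Mbit = 10^6 bits, 1 TB = 10^12 bytes), a constant bitrate, and a 365-day year. The function name is just for illustration.

```python
def stream_storage_bytes(bitrate_mbit_per_s, days):
    """Bytes needed to store a constant-bitrate stream for `days` days."""
    seconds = days * 24 * 60 * 60
    bits = bitrate_mbit_per_s * 1_000_000 * seconds
    return bits / 8


size_tb = stream_storage_bytes(3, 365) / 1e12
print(f"{size_tb:.1f} TB")  # prints "11.8 TB"
```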
Interestingly, Google and Facebook seem to have basically done it right with their exascale filesystems. The same with object stores.
A major benefit of append-only is that your writes are always ideal for whatever storage medium you have, especially magnetic disk or tape. Combine append-only with batching of transactions (e.g. across 1-10 milliseconds at a time), and you can write multiple txns per disk I/O operation (assuming txn size < storage block size).
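A rough sketch of that batching idea (often called group commit), under some loud assumptions: the `BatchedLog` class, the length-prefixed record format, and the explicit `flush()` call are all made up for illustration. A real system would trigger the flush from a timer every few milliseconds rather than by hand.

```python
import os
import struct


class BatchedLog:
    """Append-only log that writes many pending transactions in one I/O."""

    def __init__(self, path):
        self._file = open(path, "ab")  # append mode: existing bytes are never overwritten
        self._pending = []
        self.flush_count = 0           # number of physical write+fsync rounds

    def submit(self, payload: bytes):
        # Queue a length-prefixed record in memory; nothing touches the disk yet.
        self._pending.append(struct.pack("<I", len(payload)) + payload)

    def flush(self):
        """Write all queued records with one append and one fsync."""
        if not self._pending:
            return 0
        count = len(self._pending)
        self._file.write(b"".join(self._pending))
        self._file.flush()
        os.fsync(self._file.fileno())
        self._pending.clear()
        self.flush_count += 1
        return count

    def close(self):
        self.flush()
        self._file.close()


def read_records(path):
    """Decode the length-prefixed records back out of the log file."""
    with open(path, "rb") as f:
        data = f.read()
    records, offset = [], 0
    while offset < len(data):
        (length,) = struct.unpack_from("<I", data, offset)
        offset += 4
        records.append(data[offset:offset + length])
        offset += length
    return records
```

The point of the design is that the expensive part (the `fsync`) is paid once per batch, not once per transaction, and every write lands at the end of the file.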
You can search for some comments I made recently about an append-only database scheme.
> everything would be append-only by default
doesn't mean everything is permanently etched into stone or written on the blockchain, it just means that "by default" everything you write is written to a new block[1], instead of having to free up old blocks to reuse and keep track of which blocks of storage are available
[1]"block" is just meant as a generic unit of storage, I'm not trying to say anything about actual drive blocks and implementation details
Do you know why ASCII 0x7f is DEL? Paper tape is write-once. To indicate a deleted character, it was conventional to punch holes in all bit positions -- 0x7f on a 7-bit punch.
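A tiny illustration of why that works, assuming 7-bit frames modeled as integers (the helper names are made up). Punching can only add holes, which is a bitwise OR, so any character can be turned into 0x7f but never back.

```python
DEL = 0x7F  # all seven holes punched


def rub_out(code):
    """Punch every remaining hole in a 7-bit frame; holes can only be added."""
    return code | DEL


def read_tape(frames):
    """A reader simply skips frames that were rubbed out."""
    return "".join(chr(c) for c in frames if c != DEL)


tape = [ord(c) for c in "HELXLO"]  # typo punched at position 3
tape[3] = rub_out(tape[3])         # overpunch the bad character
print(read_tape(tape))             # prints "HELLO"
```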