
bob1029 9mo ago

I am also assuming that Amazon intends for the Deep Archive tier to be a profitable offering. At $0.00099/GB-month, I don't see how it could be anything other than tape.
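A quick back-of-the-envelope on what that price brings in per terabyte. Only the $0.00099/GB-month figure comes from the comment above; the helper name and the 1 TB = 1000 GB convention are assumptions for illustration.

```python
# Hypothetical helper: only the price itself is taken from the thread.
PRICE_PER_GB_MONTH = 0.00099  # USD, Deep Archive price quoted above

def revenue_per_tb_year(price_per_gb_month: float, gb_per_tb: int = 1000) -> float:
    """Gross storage revenue for one TB held for twelve months."""
    return price_per_gb_month * gb_per_tb * 12

print(f"${revenue_per_tb_year(PRICE_PER_GB_MONTH):.2f} per TB-year")
```

That is gross revenue only; hardware, power, floor space and redundancy overhead all have to fit under it.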


simonw 9mo ago

I wonder if it's where old S3 hard drives go to die? Presumably AWS have the world's single largest collection of used storage devices - if you RAID them up you can probably get reliable performance out of them for Glacier?

donavanm 9mo ago

Not quite. Hardware with customer data or corp IP (e.g. any sort of storage or NVRAM) doesn't leave the "red zone"[1] without being destroyed. And reusing EOL hardware is a nightmare of failure rates and consistency issues. It's usually more cost effective to scrap the entire rack once it's depreciated, or potentially at the 4-5 year mark at most. More revenue is generated by replacing the entire rack with new hardware that will make better use of the monthly recurring cost (MRC) of that rack position/power whips/etc.

[1] https://www.aboutamazon.com/news/aws/aws-data-center-inside

bob1029 (OP) 9mo ago

I still don't know if it's possible to make it profitable with old drives in this kind of arrangement, especially if we intend to hit their crazy durability figures. The cost of keeping drives spinning is low, but at this price it is a double-digit percentage of the margin. You can't leave drives unpowered in a warehouse for years on end and say you have 11+ nines of durability.
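To get a feel for what "11+ nines" demands, here is a rough sketch under loud assumptions: an object is split into n shards, any m of them are enough to rebuild it, and each shard is lost independently with probability p before it can be repaired. The function names, the (n, m) layouts and p = 0.01 are all hypothetical, and independence is optimistic for a pile of aging drives.

```python
from math import comb, log10

def loss_probability(n: int, m: int, p: float) -> float:
    """P(object lost) = P(more than n - m of the n shards are lost).

    Assumes shards fail independently with probability p (an assumption,
    not something stated in the thread).
    """
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k)
               for k in range(n - m + 1, n + 1))

def nines(loss: float) -> float:
    """Durability expressed as a count of nines, e.g. 1e-11 -> 11."""
    return -log10(loss)

# Hypothetical layouts: 3 full replicas, then two erasure-coded schemes.
for n, m in [(3, 1), (9, 6), (14, 10)]:
    loss = loss_probability(n, m, p=0.01)
    print(f"n={n:2d} m={m:2d} loss={loss:.3e} nines={nines(loss):.1f}")
```

The point of the sketch is only that durability depends on p staying small, which means detecting and repairing losses promptly. Drives that sit unpowered cannot be scrubbed, so p is both larger and unknown.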

hinkley 9mo ago

Unpowered in a warehouse is a huge latency problem.

For storage especially we now build enough redundancy into systems that we don't have to jump on every fault. That reduces the chance of human error when trying to address it, and of pushing the hardware harder during recovery (resilvering, catching up in a distributed consensus system, etc).

When the entire box gets taken out of the rack due to hitting max faults, then you can piece out the machine and recycle parts that are still good.

You could in theory ship them all off to the backend of nowhere, but it seems that Glacier is in all the places where AWS data centers are, so it's not that. But Glacier being durable storage, with a low expectation of data out versus data in, they could be, and probably are, cutting the aggregate bandwidth to the bone.

How good do your power backups have to be to power a pure Glacier server room? Can you use much cheaper in-rack switches? Can you use old in-rack switches from the m5i era?

Also most of the use cases they mention involve linear reads, which have their own recipe book for optimization, including caching just enough of each file on fast media to hide the slow lookup time for the rest of the stream.
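As a rough sketch of the "cache just enough" idea: if the client consumes the stream at a steady rate, the prefix kept on fast media has to cover however long the slow tier takes to start delivering. The function name and the example numbers are hypothetical.

```python
def prefix_bytes_to_cache(stream_bytes_per_s: float, slow_lookup_s: float) -> float:
    """Bytes of each file to keep on fast media so playback never stalls.

    Assumes a constant consumption rate and that the slow tier can keep up
    with the stream once it has located the file (both are assumptions).
    """
    return stream_bytes_per_s * slow_lookup_s

# Hypothetical example: 1 MB/s video stream, 30 s to locate and spin up.
needed = prefix_bytes_to_cache(1_000_000, 30)
print(f"{needed / 1e6:.0f} MB of prefix per file")
```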

Little's Law would absolutely kill you in any other context, but here we have linear writes and orders of magnitude fewer reads. You have hardware sitting around waiting for a request. "Orders of magnitude" is the space where interesting solutions can live.
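For reference, Little's Law says the average number of requests in a system equals the arrival rate times the average time each request spends there (L = λW). A small sketch with made-up workloads shows why long retrieval times are only tolerable when the read rate is tiny; every number below is hypothetical.

```python
def requests_in_system(arrival_rate_per_s: float, time_in_system_s: float) -> float:
    """Little's Law: L = lambda * W."""
    return arrival_rate_per_s * time_in_system_s

# Hypothetical workloads, chosen only to contrast the two regimes.
hot = requests_in_system(arrival_rate_per_s=5000.0, time_in_system_s=0.05)
cold = requests_in_system(arrival_rate_per_s=5.0, time_in_system_s=12 * 3600.0)
print(f"hot tier in flight:  {hot:.0f}")
print(f"cold tier in flight: {cold:.0f}")
```

Run a 12-hour retrieval time against the hot tier's arrival rate and the backlog becomes enormous, which is the "would absolutely kill you" case.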

lijok 9mo ago

You don’t RAID old drives: it creates cascading failures, because recovering from a failed drive adds major wear to the other drives.

hinkley 9mo ago

Only if you have low redundancy. RAIDZ is better about this, isn’t it? And Backblaze goes a lot further. They just decommission the rack when it hits the limit for failed disks, and the files on the cluster are stored on m of n racks, so adding a rack and “resilvering” doesn’t even require scanning the entire cluster, just m/n of it.
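A minimal sketch of the m-of-n idea, to show where the m/n figure comes from. The placement scheme here is rendezvous (highest-random-weight) hashing, which is my stand-in and not a description of what Backblaze actually does; rack names, file names and the helper functions are all hypothetical.

```python
import hashlib

def racks_for(file_id: str, racks: list[str], m: int) -> list[str]:
    """Pick m distinct racks for a file via rendezvous hashing (an assumption)."""
    def score(rack: str) -> int:
        digest = hashlib.sha256(f"{file_id}:{rack}".encode()).digest()
        return int.from_bytes(digest[:8], "big")
    return sorted(racks, key=score, reverse=True)[:m]

def fraction_touching(rack: str, file_ids: list[str], racks: list[str], m: int) -> float:
    """Share of files that have a piece on the given rack."""
    hits = sum(rack in racks_for(f, racks, m) for f in file_ids)
    return hits / len(file_ids)

racks = [f"rack-{i}" for i in range(10)]      # n = 10
files = [f"file-{i}" for i in range(2000)]
# With m = 3 the share should land near m/n = 0.3.
print(fraction_touching("rack-3", files, racks, m=3))
```

When a rack is retired, only the files that had a piece on it need repair, which is roughly m/n of the cluster instead of all of it.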

pcthrowaway 9mo ago

This is less of a concern with RAID 6, and especially in Glacier's use case where reading any piece of data happens seldom, I'd expect it to be fine.

mappu 9mo ago

My understanding is some AWS products (e.g. RDS) need very fast disks with lots of IOPS. To get the IOPS, though, you have to buy +++X TB sized SSDs, far more storage space than RDS actually needs. This doesn't fully utilize the underlying hardware: you are left with lots of remaining storage space but no IOPS. It's perfect for Glacier.

The disks for Glacier cost $0 because you already have them.

donavanm 9mo ago

Since ~2014 or so the constraint on all HDD based storage has been IOPS/throughput/queue time. Shortly after that we started seeing "minimum" device sizes that were so large as to be challenging to productively use their total capacity. Glacier type retrieval is also nice in that you have much more room for "best effort" scheduling and queuing compared to "real time" requests like S3:PutObject.

Last I was aware flash/NVMe storage didn't have quite the same problem, due to orders of magnitude improved access times and parallelism. But you can combine the two in a kind of distributed reimplementation of access tiering (behind a single consistent API or block interface).

hinkley 9mo ago

There’s a really old trick with HDDs where you buy a big disk and then allocate less than half of it. There’s more throughput on the first half of the disk, more tracks per cylinder so fewer seeks, and never having to read half the disk reduces the worst case seek time. All increase IOPS.

But then what do you do with the other half of the disk? If you access it when the machine isn’t dormant you lose most of these benefits.

For deep storage you have two problems. Time to access the files, and resources to locate the files. In a distributed file store there’s the potential for chatty access or large memory footprints for directory structures. You might need an elaborate system to locate file 54325 if you’re doing some consistent hashing thing, but the customer has no clue what 54325 is. They want the birthday party video. So they still need a directory structure even if you can avoid it.
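A minimal sketch of the two layers described above, assuming a plain dictionary as the customer-facing directory and a small consistent-hash ring for placement. The class, the node names and the file name are hypothetical; only the object id 54325 comes from the comment.

```python
import bisect
import hashlib

def _h(key: str) -> int:
    """Stable 64-bit hash of a string."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

class HashRing:
    """Tiny consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes: list[str], vnodes: int = 64) -> None:
        self._points = sorted(
            (_h(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._keys = [point for point, _ in self._points]

    def node_for(self, object_id: str) -> str:
        # First ring point clockwise from the object's hash, wrapping around.
        idx = bisect.bisect_right(self._keys, _h(object_id)) % len(self._keys)
        return self._points[idx][1]

# The directory maps what the customer knows to the opaque object id.
directory = {"birthday-party.mp4": "54325"}
ring = HashRing(["node-a", "node-b", "node-c"])

object_id = directory["birthday-party.mp4"]
print(object_id, "->", ring.node_for(object_id))
```

The ring needs no per-object state, but the directory does, so the memory footprint concern moves to the directory; it does not go away.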

cavisne 9mo ago

http://www.patentbuddy.com/Patent/20140047261

Is tape even cost competitive anymore? The market would be tiny.

topspin 9mo ago

One way to know is to see if new tape products exist, indicating ongoing development. As of May 2025, LTO-10 is available, offering 30TB/75TB (raw/compressed) storage per cartridge. Street price is a bit under $300 each. Two manufacturers are extant: Fujifilm and IBM.
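Putting those cartridge numbers next to the $0.00099/GB-month Deep Archive price from the top of the thread gives a rough media-only comparison. This is a sketch under stated assumptions: a full cartridge, a single copy, raw capacity, 1 TB = 1000 GB, and nothing for drives, libraries, robotics, power or redundancy. The function names are hypothetical.

```python
CARTRIDGE_PRICE_USD = 300.0        # "a bit under $300", so an upper bound
CARTRIDGE_RAW_TB = 30.0            # LTO-10 raw capacity quoted above
DEEP_ARCHIVE_PER_GB_MONTH = 0.00099

def media_cost_per_tb(price_usd: float, raw_tb: float) -> float:
    """One-time media cost per raw terabyte."""
    return price_usd / raw_tb

def months_to_pay_back(price_usd: float, raw_tb: float,
                       per_gb_month: float, gb_per_tb: int = 1000) -> float:
    """Months of storage revenue needed to cover one full cartridge."""
    return price_usd / (raw_tb * gb_per_tb * per_gb_month)

print(f"media: ${media_cost_per_tb(CARTRIDGE_PRICE_USD, CARTRIDGE_RAW_TB):.2f}/TB")
months = months_to_pay_back(CARTRIDGE_PRICE_USD, CARTRIDGE_RAW_TB,
                            DEEP_ARCHIVE_PER_GB_MONTH)
print(f"payback: {months:.1f} months per cartridge")
```

Real deployments store redundant copies and carry the fixed costs left out here, so the true payback period is longer.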

hinkley 9mo ago

It's gone in cycles for as long as I can recall, and older devs around 2010 said it had been going on for as long as they could recall.
