undefined | Better HN

0 pointspaulmd3y ago0 comments

> p.s. Scans are ridiculously fast.

this is really the lesson of SOLR. full-scan all the things, aggregate as you go, broadcast disk IO to multiple listeners.

why do a bunch of 4K random IO when you could full-scan at bus speed? yeah you can make the 4K random IO super fast but that's not where hardware is going, and it's also scalable/clusterable/shardable where RDBMS caps out at one machine and clustering is kinda ugly.

0 comments

yxhuvud3y ago

Huh? That is exactly where hardware is going. What has been missing is the parts in between, the ability to emit enough random IO i parallel to saturate the interfaces.

paulmdOP3y ago

"huh? high clockrates is exactly where hardware is going, we just haven't figured out how to get the silicon to work"

no, that's the opposite of where hardware is going. when was the last time flash latency or DRAM latency significantly improved? literally 15 years ago. Optane is the only improvement that has been made on that front and optane is effectively dead at this point. so actually latency and random IO is going backwards right now.

hardware is going in the direction of sustained block transfers - that is where NVMe is still improving today.

issuing random requests still basically sucks just as much as it did 15 years ago, and doing a lot of them in parallel is just a shitty bandaid patch on the problem. some workloads are irreducibly single-threaded, you simply can't proceed on the logic until you know the last bit of data. being able to do lots of those in parallel is nice, but it's the consolation prize on latency no longer scaling anymore.

so, stop doing random IO and just stream your workset in large blocks and have listeners pick off their bit (retrieve individual records, or perform their aggregations) as it streams by. Effectively, do your disk IO in big sustained transfers and then do the random IO in memory.

Not that memory has improved over the last 15 years either but it's better than doing it on disk.

SOLR is behind most of the big webscale search and commerce systems nowadays. Nobody is doing Amazon on RDBMS-style random IO systems, not even on database-style document systems like mongodb (which is really more pseudo-RDBMS than a true document search system).

Or on the flipside: if you want to do random IO on your SSDs, do random IO on your SSDs - forget the whole filesystem layer, and use a key-value SSD (which do exist). That's what your database provides right now, after all. But RDBMS (or, again, quasi-RDBMS random-IO document stores like Mongo) doesn't play to the strength of SSDs anymore, that's not where they're getting better, and if you want to treat it as a block store then you might as well stream big blocks and not do random IO on your device layer.

https://www.mydistributed.systems/2020/07/towards-building-h...

j / k navigate · click thread line to collapse