eloff 11y ago

It's that merge (edit: GC is a better term) step that's difficult to get right. Google screwed this up badly with LevelDB, which had (still has?) horrible performance issues caused by compaction. Even with concurrent compaction it can be difficult, because it needs additional disk space, adds read and write pressure to the storage subsystem, and both of those affect latency. I'm not sure what RethinkDB's approach was there, but I'm very curious to know.

coffeemug 11y ago

> Even with concurrent compaction it can be difficult due to needing additional disk space, adding additional read and write pressure to the storage subsystem and the effects that has on latency. I'm not sure what RethinkDB's approach was there, but I'm very curious to know.

Yes, we ran into all these issues with the RethinkDB storage engine. Unfortunately I can't summarize the solution, because there are no silver bullets. It took a long time to perfect the engine, and there was an enormous amount of tuning work to get everything right.

For example, we have a "young blocks" subsystem that treats recently updated blocks differently (since, empirically, recently written blocks are dramatically more likely to be updated again, so we hold off on trying to collect them). How long should you wait? How many young blocks should you consider?

Working out solutions to these questions takes a lot of trial and error, and that's where the bulk of the work is (and that's just one subsystem!)
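To make the "young blocks" idea concrete, here is a minimal sketch in Python. It is not RethinkDB's implementation: the class name `YoungBlockTracker` and the two knobs `max_young_blocks` and `min_age_writes` are hypothetical, chosen only to mirror the two questions above (how many young blocks to consider, and how long to wait). "Age" is measured in subsequent writes purely for simplicity.

```python
from collections import OrderedDict


class YoungBlockTracker:
    """Hypothetical sketch: remember recently written blocks so a
    collector can hold off on them for a while."""

    def __init__(self, max_young_blocks, min_age_writes):
        # Assumed tuning knobs, not taken from any real engine.
        self.max_young_blocks = max_young_blocks  # how many blocks to consider young
        self.min_age_writes = min_age_writes      # how long (in writes) to wait
        self._young = OrderedDict()               # block_id -> sequence number of last write
        self._seq = 0                             # global write counter

    def record_write(self, block_id):
        """Note that block_id was just written (or rewritten)."""
        self._seq += 1
        self._young.pop(block_id, None)           # move to the most-recent end
        self._young[block_id] = self._seq
        # Forget the oldest entries once the young set is full.
        while len(self._young) > self.max_young_blocks:
            self._young.popitem(last=False)

    def is_young(self, block_id):
        """True if the block was written recently enough to skip collection."""
        written_at = self._young.get(block_id)
        if written_at is None:
            return False
        return self._seq - written_at < self.min_age_writes

    def collectable(self, candidate_block_ids):
        """Filter GC candidates down to those that are no longer young."""
        return [b for b in candidate_block_ids if not self.is_young(b)]
```

Even in a toy like this, both knobs interact with the workload: too small and the collector keeps moving blocks that are about to be overwritten anyway, too large and garbage piles up on disk.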

I'd love to write about it in depth, I'll try to make it a priority.

pas 11y ago

How much similarity is there between the JVM's G1GC, CMS or other collectors and Rethink's compaction? It looks like the heuristics and trade-offs are very much the same (latency, hard space constraints, usage patterns). Okay, you don't have to do pointer/object graph chasing, but queries, consistency and whatnot have to do something similar.

coffeemug 11y ago

Remember -- this is an on-disk compactor, so it's not quite the same as collecting garbage in memory. There are other differences -- database dependencies are typically trees (or in the worst case DAGs, as there aren't any circular references). So our compactor can be much simpler (in fact, it's closer to a purely functional collector like Haskell's).

But overall it's very similar to a programming language GC. The devil, as usual, is in the details.
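As an illustration of why the absence of cycles simplifies things: when references form a tree or DAG, plain reference counting is enough to reclaim every dead block, with no tracing pass. The Python sketch below shows that general property only. It is a swapped-in technique for illustration, not a description of RethinkDB's compactor, and `BlockStore` and its methods are hypothetical names.

```python
class BlockStore:
    """Hypothetical sketch: reference-counted blocks in an acyclic graph."""

    def __init__(self):
        self.children = {}   # block_id -> list of child block ids
        self.refcount = {}   # block_id -> number of parents or roots pointing at it

    def add_block(self, block_id, children=()):
        """Add a block whose children already exist.

        Requiring children to exist first means a block can never point
        at itself or at a later block, so the graph stays acyclic.
        """
        if block_id in self.refcount:
            raise ValueError(f"block {block_id!r} already exists")
        for child in children:
            if child not in self.refcount:
                raise KeyError(f"unknown child {child!r}")
        self.children[block_id] = list(children)
        self.refcount[block_id] = 0
        for child in children:
            self.refcount[child] += 1

    def add_root(self, block_id):
        """Pin a block, e.g. as the root of a live tree version."""
        self.refcount[block_id] += 1

    def release(self, block_id):
        """Drop one reference and return every block freed as a result."""
        freed = []
        stack = [block_id]
        while stack:
            current = stack.pop()
            self.refcount[current] -= 1
            if self.refcount[current] == 0:
                freed.append(current)
                del self.refcount[current]
                # Each child loses the reference held by the freed block.
                stack.extend(self.children.pop(current))
        return freed


# Two tree versions sharing one subtree, as in a copy-on-write update.
store = BlockStore()
for leaf in ("leaf1", "leaf2", "shared"):
    store.add_block(leaf)
store.add_block("old_root", ["leaf1", "shared"])
store.add_block("new_root", ["leaf2", "shared"])
store.add_root("old_root")
store.add_root("new_root")
freed = store.release("old_root")   # frees old_root and leaf1; shared survives
```

With cycles this would leak (a cycle keeps its own counts above zero), which is why in-memory collectors for general object graphs need tracing or cycle detection.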

eloff (OP) 11y ago

If you do, please ping me at dan.eloff @ populargooglemailservice.com, I don't want to miss reading that one!
