In our own workloads, writers are always going after the same pages in their index updates, which inevitably leads to deadlocks in BerkeleyDB. As a result, we get much higher throughput with fully serialized writers than with "concurrent" writers. A microbenchmark might show greater concurrency on simple write tasks, but in a real live system with an elaborate schema, there's no payoff for us.
As always, you have to profile your workload and see where the delays and bottlenecks really are. Taking a single mutex instead of continuously locking/unlocking all over the place was a win for us.