On Monday I added some slightly more proper benchmark code; you can find it at https://github.com/spotify/sparkey/blob/master/src/bench.c
I didn't add the LevelDB code to this benchmark, however, since I 1) didn't want to manage that dependency and 2) didn't know how to write optimized code for it.
I'm using very small records: a couple of bytes each for key and value. The insert order is strictly increasing (key_0, key_1, ...), though that doesn't really matter for Sparkey, since it uses a hash for lookups instead of ordered lists or trees.
As for the Symas MDB microbench, I only looked at it briefly, but it seems like it's not actually reading the value it fetches, only looking up where it is. Is that correct?
"MDB's zero-memcpy reads mean its read rate is essentially independent of the size of the data items being fetched; it is only affected by the total number of keys in the database."
Doing a lookup and never using the value seems like a very unrealistic use case.
Here's the part of the benchmark I'm referring to:

    for (int i = 0; i < reads_; i++) {
      const int k = rand_.Next() % reads_;
      key.mv_size = snprintf(ckey, sizeof(ckey), "%016d", k);
      mdb_cursor_get(cursor, &key, &data, MDB_SET);
      FinishedSingleOp();
    }