This might explain some of the "why yet another RocksDB-based KV store?" line of questioning.
Aaah, there was a super informative talk about the different databases at Facebook, most of them built on RocksDB, with different trade-offs. (And I can't find the video :((((( )
Anyway, it makes sense to have yet another one if it serves a different purpose. E.g. for read-heavy workloads (caches, serving user feeds, whatever) or write-heavy ones (monitoring, storing that sweet sweet tracking juice that then gets read once or twice while building the recommendation models), small or large blobs, latency requirements, HA/consistency requirements, how complicated the queries are going to be, whether it supports secondary indices, etc.
Source: https://www.zhihu.com/question/66719537/answer/245270169 (in Chinese)
Kafka, Cassandra, Zookeeper, Spark, Tomcat, Superset, Storm, Lucene, Log4j2, Hadoop, etc. The list goes on, but I would safely say that a majority of the world's systems run on Apache projects, which are for the most part actively developed.
There are a lot of KV engines that use RocksDB now, like CockroachDB (forked into PebbleDB, though), YugabyteDB, and TiDB.
Those are all many times slower than Redis, though, so having a middle ground aimed at being similar to Redis without eating all your RAM is very exciting!
This might seem like a bit of a facetious comment, but it does have genuine use cases, e.g. unit tests, or for data that you can easily restore after a shutdown.
I'd appreciate it if there are any links/docs I could look into to learn more about this.
https://github.com/apache/incubator-pegasus-website/tree/mas...
Is the cognitive load this produces still worth the consideration? At what scale do you have to operate for the gains to actually make viable business sense?
Sometimes I think, "Well, surely at Google scale" – and then I load up one of their interfaces, for example Google Ads, and I have to sit around for 10s before anything even shows up.