My Year of Riak (opens in new tab)

(inakanetworks.com)

92 pointsinaka14y ago19 comments

19 comments

The killer feature I'm looking for is Riak's "it just works" -- especially in the case of nodes failing, soft failing, going offline, timing out, whatever.

In my situation, I don't care about the performance at all, because I don't have many keys at any given moment. The few that I have matter greatly.

I care that when I store a key, it's reliably, durably, stored and replicated, and that when nodes fail I don't have to do anything special to keep running. (This is in contrast to PostgreSQL, MySQL, or Mongo replication, where you have to fail over, then switch back eventually, and it takes special effort.)

AFAICT, It's not provided by Redis or CouchDB either, because their replication is async -- keys can get lost.

Having looked at a bunch of options in the last couple weeks, it seems like only Riak and Cassandra truly offer durable, synced replication that isn't difficult to admin. (...and of the two of them, Riak's documentation gives much more confidence about the ongoing admin efforts.)

Has anyone used any solid options I've perhaps overlooked?

btilly14y ago

Be warned, shortly after nodes join/leave, Riak has a real possibility of rearranging which nodes will respond to requests for which keys without making sure that those nodes actually have those keys. The result is that key/value pairs can become inaccessible for some time until data migrates under the hood.

This is unlikely to be a problem in practice. But it is a possibility to be aware of.

bretthoerner14y ago

This changed a lot recently in master. I believe they wait until the new node has all of the data until sending it requests. I'm sure it will be part of 1.0.

1 more reply

rbranson14y ago

If performance isn't your absolute highest priority and you can easily fail IPs between hosts with ARP packets, I'd check out an RDBMS warm slave setup using Pacemaker and DRBD. Pacemaker+Heartbeat works stupid-well for MySQL & haproxy failover on Linode. DRBD supports synchronous replication and has all kinds of tuning options to get it perfect for any environmental craziness.

mathias_10gen14y ago

I don't want to turn this into a MongoDB vs Riak thread, but you may want to take another look at Mongo as the points you've mentioned are now handled. As of 1.6 we support automatic fail-over[1] and synchronous replication[2]. In 1.8 we added a journal for durability[3] which will be enabled by default in 2.0 (due out this month - rc2 released today). Optional automatic fail-back[4], for when your preferred primary (if any) comes back online, is also coming in 2.0.

[1] http://www.mongodb.org/display/DOCS/Why+Replica+Sets

[2] http://www.mongodb.org/display/DOCS/Verifying+Propagation+of...

[3] http://www.mongodb.org/display/DOCS/Journaling

[4] http://www.mongodb.org/display/DOCS/Replica+Sets+-+Priority

willbmoss14y ago

Mathias, what Riak (or other distributed databases provide) that Mongo doesn't is "just-works" scaling characteristics beyond one machine.

* If I want to ensure that my data is written to three machines, all my writes will stop working in MongoDB if one machine goes down. With Riak, it will start issuing the writes to another node in the cluster and rebalance when the missing node comes back online.

* If I'm running MongoDB in a sharded configuration, if one of the shards cannot be reached, all writes will stop. With Riak, any node will accept the writes and, once the network issues are resolved, move them to the appropriate node.

That said, conflict resolution is hard and there's no real way to get around it when you're using a distributed database like Riak. With Riak, as Chad says you get "increased development complexity for massively decreased deployment complexity." There's no silver bullet, it's important to look at the trade-offs of each option.

1 more reply

devongall14y ago

Definitely agree on the expense of a list-keys operation, be sure to avoid at all costs.

Some of the Riak documentation was incomplete/incorrect which made implementation a little sticky, but the mailing list is extremely responsive and helpful.

Otherwise, have had a great experience with Riak thus far. Looking forward to the ease of scaling as well!

willbmoss14y ago

It's worth noting that not only is the list keys operation expensive, but since it uses Bloom filters, it's not guaranteed to returns all keys.

My sources at Basho tell me that this is fixed in 1.0, but until that's officially released, basically don't try to list keys.

seancribbs14y ago

Yes, the problem was not necessarily using a Bloom filter, but that it was too small. However, 1.0 is smarter about which vnodes it sends list-keys requests to and thus obviates the need for the Bloom filter (at least for that operation).

spahl14y ago

Another operation to avoid is map/reduce over a whole bucket (so called "bucket scans"). It is extremely heavy and can often be replaced with a more clever schema.

lobster_johnson14y ago

Yes, one thing that is not apparent from the documentation is that a "bucket" is merely a namespacing thing: Internally, all data is stored in one big bucket, with bucket names prefixing keys.

This means that if you have buckets A (1 million items) and B (5 items), sequentially scanning bucket B will take just as long as scanning bucket A -- because Riak has to scan through the entire store. In other words, it's not enough to say that one should avoid scanning a bucket because it's slow when you have lots of stuff in a bucket; it's always too slow to be practically usable in any situation where you have more than a few hundred keys.

I think calling buckets is a big mistake because they create the expectation that they really are separate things. "Namespace" or "keyspace" would have been more appropriate. (Can buckets have different replication semantics? If not, that's even worse.)

Cassandra is loosely based on the same tech as Riak, and supports sequential range queries very well, I hear.

metabrew14y ago

I'd love to use riak for all the reasons mentioned in this article, and more.

The single missing 'feature' (design decision) that I can't live without, is that you can't efficiently do range queries/order-by on the key in riak today.

Hopefully this will get easier with secondary indexes / riak-search integration. Not clear yet.

samg_14y ago

It's so important to evaluate whether you need range queries before picking a tech like Riak.

They are coming, though. That's what I hear, at least.

rzezeski14y ago

If we are talking about performing range queries on an index then Riak already has it in the form of Riak Search. In 1.0 this is also supported by secondary indices.

If we are talking about performing a range operation on the primary key which returns the matching objects, then no, Riak doesn't currently offer that. However, given it's support for an ordered data store such as leveldb in 1.0 it should only be a matter of time before that is possible.

Just to try it out I already implemented this for fun on my fork.

https://github.com/rzezeski/riak_kv/tree/native-range

https://github.com/rzezeski/riak-erlang-client/tree/native-r...

1 more reply

dsl14y ago

I love Riak. The one minor complaint that I have is having a small development/test cluster is pretty painful. If you have <N nodes, you end up with duplicated data in memory and on disk. Sucks for developers who would like a local instance to test with on their virtualized dev boxes.

siculars14y ago

I change n to 1 in the default_bucket_props[0] when I just want a quick instance to test against.

[0] http://wiki.basho.com/Configuration-Files.html#app.config

j / k navigate · click thread line to collapse

19 comments

rarrrrrr14y ago

The killer feature I'm looking for is Riak's "it just works" -- especially in the case of nodes failing, soft failing, going offline, timing out, whatever.

In my situation, I don't care about the performance at all, because I don't have many keys at any given moment. The few that I have matter greatly.

AFAICT, It's not provided by Redis or CouchDB either, because their replication is async -- keys can get lost.

Has anyone used any solid options I've perhaps overlooked?

btilly14y ago

This is unlikely to be a problem in practice. But it is a possibility to be aware of.

bretthoerner14y ago

This changed a lot recently in master. I believe they wait until the new node has all of the data until sending it requests. I'm sure it will be part of 1.0.

1 more reply

rbranson14y ago

mathias_10gen14y ago

[1] http://www.mongodb.org/display/DOCS/Why+Replica+Sets

[2] http://www.mongodb.org/display/DOCS/Verifying+Propagation+of...

[3] http://www.mongodb.org/display/DOCS/Journaling

[4] http://www.mongodb.org/display/DOCS/Replica+Sets+-+Priority

willbmoss14y ago

Mathias, what Riak (or other distributed databases provide) that Mongo doesn't is "just-works" scaling characteristics beyond one machine.

1 more reply

devongall14y ago

Definitely agree on the expense of a list-keys operation, be sure to avoid at all costs.

Some of the Riak documentation was incomplete/incorrect which made implementation a little sticky, but the mailing list is extremely responsive and helpful.

Otherwise, have had a great experience with Riak thus far. Looking forward to the ease of scaling as well!

willbmoss14y ago

It's worth noting that not only is the list keys operation expensive, but since it uses Bloom filters, it's not guaranteed to returns all keys.

My sources at Basho tell me that this is fixed in 1.0, but until that's officially released, basically don't try to list keys.

seancribbs14y ago

spahl14y ago

Another operation to avoid is map/reduce over a whole bucket (so called "bucket scans"). It is extremely heavy and can often be replaced with a more clever schema.

lobster_johnson14y ago

Yes, one thing that is not apparent from the documentation is that a "bucket" is merely a namespacing thing: Internally, all data is stored in one big bucket, with bucket names prefixing keys.

Cassandra is loosely based on the same tech as Riak, and supports sequential range queries very well, I hear.

metabrew14y ago

I'd love to use riak for all the reasons mentioned in this article, and more.

The single missing 'feature' (design decision) that I can't live without, is that you can't efficiently do range queries/order-by on the key in riak today.

Hopefully this will get easier with secondary indexes / riak-search integration. Not clear yet.

samg_14y ago

It's so important to evaluate whether you need range queries before picking a tech like Riak.

They are coming, though. That's what I hear, at least.

rzezeski14y ago

If we are talking about performing range queries on an index then Riak already has it in the form of Riak Search. In 1.0 this is also supported by secondary indices.

Just to try it out I already implemented this for fun on my fork.

https://github.com/rzezeski/riak_kv/tree/native-range

https://github.com/rzezeski/riak-erlang-client/tree/native-r...

1 more reply

dsl14y ago

siculars14y ago

I change n to 1 in the default_bucket_props[0] when I just want a quick instance to test against.

[0] http://wiki.basho.com/Configuration-Files.html#app.config

j / k navigate · click thread line to collapse