CopyCat: Protocol-agnostic implementation of the Raft consensus algorithm

(github.com)

61 points | kuujo | 11y ago | 15 comments

mavelikara 11y ago

I am writing a program which has to durably store a large number of small files. My current implementation stores these on the local disk as files in a particular folder structure. I am considering designing some approach to make this program run on a cluster of machines for high availability, fault tolerance and scalability.

One approach I can think of is to run a distributed document database and use that for storage. I don't need most features of such database products in the foreseeable future, so I fear that they will add operational overhead for not much benefit.

Another approach I can think of is to run my processing nodes against a network file system, and rely on that to do the replication.

Yet another approach I am considering is to use something like CopyCat to implement the file replication in my application code. Is this a good use case for CopyCat?

ww520 11y ago

CopyCat looks really good for HA and fault tolerance. I'm not so sure about its scalability, since all writes go through the single leader node. It's better suited to maintaining the metadata of a distributed system than to storing the data itself.
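To make the single-leader point concrete, here is a toy sketch in Python. It is not CopyCat's API: `Node`, `ToyCluster` and `leader_writes` are made-up names for illustration, and elections, terms, retries and failures are all left out. It only shows that every write is ordered by one node, whichever node the client happens to contact.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A replica holding an ordered log of writes (hypothetical type)."""
    name: str
    log: list = field(default_factory=list)


class ToyCluster:
    """Illustrative only: assumes a leader already exists and nothing fails."""

    def __init__(self, names):
        self.nodes = [Node(n) for n in names]
        self.leader = self.nodes[0]  # assumption: first node is the leader
        self.leader_writes = 0       # how many writes the leader has ordered

    def write(self, entry, contacted):
        """Accept a write sent to node `contacted`; return True if committed."""
        # `contacted` is where the client connected. It does not matter:
        # the entry is always forwarded to the leader, which appends it first.
        self.leader.log.append(entry)
        self.leader_writes += 1
        acks = 1  # the leader counts as one copy
        for node in self.nodes:
            if node is not self.leader:
                node.log.append(entry)  # replicate in the leader's order
                acks += 1
        # Committed once a majority of nodes hold the entry.
        return acks > len(self.nodes) // 2


cluster = ToyCluster(["a", "b", "c"])
results = [cluster.write(f"file-{i}", contacted)
           for i, contacted in enumerate(["a", "b", "c", "b"])]
```

Adding more nodes to this toy cluster adds more copies of each entry, but the leader still handles every write, which is why the write path does not scale out.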

For your requirement, can you use S3 or something similar?

mavelikara 11y ago

Thanks! The software I am writing needs to be (easily) installed at the customer site, so S3 won't work for me. Are there any open source S3 clones that are easy to set up?

SEJeff 11y ago

That is a "design feature" of Raft, which allows it to lack the unbelievable complexity of Paxos, Multi-Paxos, or something like ZAB (yet another Paxos variant).

lmm 11y ago

Sounds like the kind of problem OrientDB is supposed to solve (I haven't tried it though). I would stay away from network filesystems: they're fiddly, they'll add much more operational overhead than a database product, and there is no mature distributed one. OpenAFS is probably your best bet if you do want to go that route.

mavelikara 11y ago

OrientDB's clustering is, IIRC, built on top of Hazelcast. Storing my files in a clustered Hazelcast is another option I am considering, although I forgot to include it in my first comment.

enigmo 11y ago

Have you considered using a pre-existing distributed filesystem like Ceph or even HDFS instead?

mavelikara 11y ago

I did consider both. HDFS, from what I read, is designed for storing large files - my files are about 10 KB in size each.

I am aware of Ceph but have not tried installing it to see how easy or hard it is to set up. Also, although this is not a hard requirement, I'd like to be able to support Windows; from what I have read so far, Ceph does not support Windows.

michaelmior 11y ago

This looks fantastic! As a DB researcher, I've found in the past I need some type of consensus algorithm for a system I'm building, but I don't want to spend a lot of time implementing something myself. I can see this being very useful in academia for scenarios like this.
