Having read the pg_shard readme, it's not clear to me how it addresses that issue. I'd need to have a really clear idea how to handle scaling my cluster before committing to a sharding solution.
As a summary, the user specifies the shard count and replication factor when sharding their table. For example, if you have 4 nodes, you may pick 256 as the initial shard count. That way, you'll have ample room to grow as you add new nodes to your cluster.
When you pick 256 shards over 4 worker nodes, pg_shard will create them in a round-robin fashion. Shard #1 will go to nodes A and B, shard #2 will go to nodes B and C, and so forth. This has the advantage that when one of the worker nodes fails, the remaining 3 nodes evenly take on the additional work. Also, when you add a new node to the cluster, you can gradually rebalance some of the shards by moving them to the new node.
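The placement scheme described above can be modeled in a few lines of Python. This is a simplified illustration of the round-robin idea with a replication factor of 2, not pg_shard's actual code:

```python
# Simplified model of round-robin shard placement with a replication
# factor of 2, as described above -- an illustration, not pg_shard's
# actual implementation.

def place_shards(shard_count, nodes, replication_factor=2):
    """Assign each shard to `replication_factor` consecutive nodes,
    rotating through the node list round-robin."""
    return {
        shard: [nodes[(shard + r) % len(nodes)]
                for r in range(replication_factor)]
        for shard in range(shard_count)
    }

placements = place_shards(8, ["A", "B", "C", "D"])
print(placements[0])  # ['A', 'B']
print(placements[1])  # ['B', 'C']

# With many more shards than nodes, placements stay balanced: here
# each node holds 8 * 2 / 4 = 4 shard copies.
copies = {}
for shard_nodes in placements.values():
    for n in shard_nodes:
        copies[n] = copies.get(n, 0) + 1
print(copies)  # {'A': 4, 'B': 4, 'C': 4, 'D': 4}
```

Picking a high initial shard count (like 256 over 4 nodes) matters because rebalancing then only requires moving whole shards, never splitting them.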
We implemented application-level sharding a few years ago when we had 1 TB of data. We now have over 20 TB of data, and the application-level business logic is killing us.
Our pain is not DB speed, but application complexity. All of our applications (backup, web, maintenance, archiving, etc.) need the business logic to traverse the shard set.
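To illustrate the kind of duplication being described, here is a hypothetical sketch (not the poster's actual code): with application-level sharding, every application that touches the data must embed the same routing logic, and every full-dataset job must loop over shards itself. The shard count, hosts, and mapping below are made up for illustration.

```python
# Hypothetical sketch of application-level sharding. Every app (web,
# backup, maintenance, archiving, ...) must carry this same routing
# logic and keep it in sync across codebases.

SHARD_COUNT = 16
# Hypothetical shard hosts, made up for illustration.
SHARD_DSNS = {i: f"postgresql://db{i}.internal/app"
              for i in range(SHARD_COUNT)}

def shard_for(customer_id: int) -> int:
    # Every application must agree on this exact mapping, or reads
    # and writes land on the wrong database.
    return customer_id % SHARD_COUNT

def dsn_for(customer_id: int) -> str:
    return SHARD_DSNS[shard_for(customer_id)]

def all_shard_dsns():
    # Maintenance and backup jobs can't issue one query; they must
    # visit every shard -- the "traverse the shard set" pain above.
    return [SHARD_DSNS[i] for i in range(SHARD_COUNT)]

print(dsn_for(17))            # postgresql://db1.internal/app
print(len(all_shard_dsns()))  # 16
```

Moving this mapping into the database layer is exactly what removes the per-application duplication.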
We are planning to move to DB level sharding in January (as part of a larger application and infrastructure refresh).
I've always been a fan of using PostgreSQL wherever possible, and tools like this extend where "possible" reaches. As a YC S11 batchmate especially, I'm really proud of all the great contributions Citus Data has made, and what a useful tool CitusDB is (a bunch of other YC companies use it).
There is definitely a different approach being taken: Postgres-XL has a supervisor/load balancer, whereas with pg_shard it seems every node is capable of performing all actions.
Excited to see it evolve.
Yes: if you have high availability needs, PostgreSQL's streaming replication feature can mitigate this failure risk by providing a hot standby for your master; otherwise, regular backups of the master can suffice (pg_shard's metadata is stored in regular PostgreSQL tables).
See our documentation page for more answers to questions like this: http://citusdata.com/docs/pg-shard
http://www.databasesoup.com/2014/12/whats-this-pgshard-thing...
It explains a bit more about what pg_shard does and doesn't do.
Another difference with Postgres-XL is that pg_shard is an extension to PostgreSQL, rather than a separate product with different components. As such, you can just compile and load it into standard Postgres, and you're ready to go.
At the moment, our customers who do need table alterations accomplish them by using a script that propagates such changes to all worker nodes.
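A minimal sketch of what such a propagation script builds, assuming shard tables are named with the shard id appended (e.g. customer_reviews_10001, pg_shard's naming convention); the shard ids and DDL below are illustrative:

```python
# Sketch of the core of a DDL propagation script: turn one logical
# ALTER TABLE into per-shard statements to run on the worker nodes.
# Assumes shard tables are named "<table>_<shardid>"; the ids and
# column change here are illustrative.

def shard_ddl(table, shard_ids, ddl_suffix):
    """Build the per-shard statements to execute on each worker."""
    return [f"ALTER TABLE {table}_{sid} {ddl_suffix}"
            for sid in shard_ids]

stmts = shard_ddl("customer_reviews", [10001, 10002],
                  "ADD COLUMN rating int")
for s in stmts:
    print(s)
# ALTER TABLE customer_reviews_10001 ADD COLUMN rating int
# ALTER TABLE customer_reviews_10002 ADD COLUMN rating int
```

Each generated statement would then be executed on every worker holding a placement of that shard (e.g. over one database connection per node), alongside the matching ALTER TABLE on the master.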
One of the issues I can see already is supporting existing applications, especially ones with transaction-heavy workflows. I have the same issue with Postgres-XC: it supports transactions, but not savepoints.
But this looks like a completely different use case for Postgres, as a sort of pseudo-NoSQL type of DB.
Any advice on migration process? Transferring a high write-throughput postgres instance to a multi-pg deployment with pg_shard feels pretty daunting.