Principles of Sharding for Relational Databases (opens in new tab)

(citusdata.com)

292 pointstikhon8y ago47 comments

47 comments

I find the "you don't want to shard" camp quite annoying. Of course, I don't want to shard! Who does?! It adds complexity, both implementation-wise and operational.

But if you got 5 TB of data, that needs to be in a SSD drive, then please tell me how I can get that into 1 single physical database.

jchanimal8y ago

There is a new generation of relational databases that are native to multi-node operation, and don't require sharding. I'm speaking of tech like Google Spanner and my employer, FaunaDB.

Now you don't have to shard. More info on how we accomplish distributed transactions. https://fauna.com/blog/distributed-consistency-at-scale-span...

brianwawok8y ago

P.S. Google Spanner and FaunaDB both shard. They can call it something else. But unless every node has all data on it, it is sharded.

1 more reply

berns8y ago

The title tag of Spanner home page reads: Cloud Spanner | Automatic Sharding with Transactional Consistency at Scale

dboreham8y ago

_You_ don't have to, but someone does.

1 more reply

sigstoat8y ago

http://www.fixstars.com/en/ssd/ ?

if all you need is a lot of data in a single database, there's basically nothing except for money between you and your goal. JBODs full of SSDs coming into a single machine via SAS will get you into petabytes, just with commodity hardware you can order from amazon.

i'm expect IBM could sell you a mainframe that'll do it for whatever capacity you care to name.

user59944618y ago

The thing is that 5TB of company data cannot reasonably be kept in JBOD on the cheapest drives you could find on Amazon.

2 more replies

zbobet20128y ago

You can _easily_ buy a box with 60+TB of SSD...

http://www.dell.com/en-us/work/shop/povw/poweredge-r930

Some of us do need to shard for sure though (I have multi petabyte data sets).

kakwa_8y ago

Storing 60+TB of data is different than searching and doing complex computation on 60TB of data.

Also, operations on a such huge data set can be really painful. Think how to backup a DB like that safely, or how to update the engine.

Some slides (little old, 2014) about a huge postgres instance serving as a backend for leboncoin.fr (main classified advertising website in France).

https://fr.slideshare.net/jlb666/pgday-fr-2014-presentation-...

Basically, they bought the best hardware money could buy at the time to scale vertically, they, in the end, run in some issues and started thinking about sharding this huge DB.

1 more reply

morgo8y ago

I'm the original author of the linked post you refer to.

Perhaps the title is click bait, but at the time I was meeting with a lot of users looking for someone else's problems.

5TB could still easily be single server territory. It depends more on the queries.

My point is just that some workloads are better solved with (some) vertical scaling first.

grw_8y ago

Sorry if this is obvious, but why not use multiple drives?

jajern8y ago

Technically you can. You can run a SSD array in RAID5, 6 or 10, and lose a little bit of performance but get the capacity needed.

metaphorm8y ago

because that is nearly a synonym for sharding?

or did you mean just use a large capacity RAID setup? that will probably work fine for a lot of situations but it can expensive and introduce more latency for certain types of operations (but that might not matter, depends on context).

1 more reply

PaulHoule8y ago

You get one of these

https://petapixel.com/2015/08/15/samsung-16tb-ssd-is-the-wor...

kabdib8y ago

Um, we run many databases of 20-30TB, some well over 100TB. We use SQL Server, and it just allocates more files. It's not zero touch, but with the right storage technology it's not bad, either.

qaq8y ago

did you loose a 0 somewhere? Even something crappy like RDS will be able to help you with 5TB database. you could do more than 10 on i3.16xlarge or if you are ok running on own/leased hardware you could 20-30 for PG or 100+ for commercial DBs.

gnaritas8y ago

> But if you got 5 TB of data, that needs to be in a SSD drive, then please tell me how I can get that into 1 single physical database.

Drive capacity in a server is not limited to the size of a single drive. You can build a raid array any size you like by simply adding more drives.

ozgune8y ago

Hey everyone, it's Ozgun. When I first wrote this blog post, it was much longer. Based on initial feedback, I edited out parts of it to keep the post focused.

If you have any questions that aren't covered in the post, happy to answer them here!

IpV88y ago

Thats funny, because I was dying for it to be longer. I felt like the post was just an introduction. I'd love to see a part 2 with a more detailed description that touches more of the implementation of a sharding plan.

For me a major question I have as I consider sharding is what my application code will look like. Let's say I have a query like:

'select products.name from vendor inner join products on vendor.id = products.vendor where vendor.location = "USA"'

If I shard such that there are many products table (1 per vendor), what would my query look like?

ves8y ago

Your application code shouldn't have sharding concerns in its logic. To achieve this, you should introduce an abstraction layer. One such example is vitess[0], which is used at YouTube.

If that's too much work, then an easy preliminary step is to add the abstraction layer in your application code. That gets you most of the benefits of a proxy for the purpose of having clean application logic, and makes it easy to switch over later, but is less powerful and feature complete.

[0]: http://vitess.io/overview/#features

ozgune8y ago

Reading through your comment again, I realize I completely missed the mark on your question.

If you use Citus, you don't have to make any changes in your application. You just need to remodel your data and define your tables' sharding column(s). Citus will take care of the rest. [1]

In other words, your app thinks it's talking to Postgres. Behind the covers, Citus shards the tables, routes and parallelizes queries. Citus also provides transactions, joins, and foreign keys in a distributed environment.

[1] Almost. Over the past two years, we've been adding features to make app integration seamless. With our upcoming release, we'll get there: https://github.com/citusdata/citus/issues/595

ozgune8y ago

Thanks for your input (also the_duke)! If time permits, we may come up with a second blog post on this topic.

If I understood your example query, your application serves vendors and each vendor has different products. Is that correct?

You can approach this sharding question in one of two ways.

1. Merge different product tables into one large product table and add a vendor column

2. Model product tables as "reference tables". This will replicate the product tables to all nodes in the cluster

Without knowing more about your application / table schemas, I'd recommend the first approach. I'd also be happy to chat more if you drop us a line.

the_duke8y ago

Same here.

To me it read like just a basic introductory post to a longer series.

ttt1112223338y ago

> On the benefits side, when you separate your data into groups this way, you can’t rely on your database to join data from different sources or provide transactions and constraints across data groups.

How is it a benefit that you are no longer able to join data in your separate tables? Is this sentence a mistake?

rsolari8y ago

Thanks for writing the post. Sharding is something I’m consudering at my current job.

How long do these sharding projects usually take? Do you know of any posts that break down the steps in more detail?

ozgune8y ago

Timeframes for sharding projects vary quite a bit. If you have a B2B database, we find that sharding projects usually take between one to eight weeks of engineering (not clock) time. Most take two to three weeks.

A good way to tell is by looking at your database schema. If you have a dozen tables, you'll likely migrate with one week's of effort. If your database has 250+ tables, then you'll take about eight weeks.

When you're looking to shard your B2B database, you usually need to take the following steps:

1. Find tables that don't have a customer / tenant column, and add that column. Change primary and foreign key definitions to include this column. (You'll have a few tables that can't have a customer column, and these will be reference tables)

2. Backfill data to tables that don't didn't have customer_id / tenant_id

3. Change your application to talk to this new model. For Rails/Django, we have libraries available that make the app changes simpler (100-150 lines). For example: https://github.com/citusdata/activerecord-multi-tenant

4. Migrate your data over to a distributed database. Fortunately, online data migrations are starting to become possible with logical decoding in Postgres.

If you have a B2C app, these estimates and steps will be different. In particular, you'll need to figure out how many dimensions (columns) are central to your application. From there on, you'll need to separate out the data and shard each data group separately.

1 more reply

dcosson8y ago

Interesting that sharding by customer for a sass business is the example of the best use of sharding. That can also go very wrong - what if you get a huge customer that's as big as everyone else combined? You're effectively maxed out at 2 shards.

Definitely depends on the workload, but often the "micro service" approach (whether or not it's a true micro service in its own runtime) of sharding just one type of data/small set of related tables that you can shard by a primary key or user id or something seems like the only reasonable option for sharding. If your data is becoming unwieldy there's often a bottleneck data set that's bigger than everything else so you don't necessarily have to share everything all at once.

megamindbrian8y ago

I laugh every time I read that word.

dboreham8y ago

Pottery, right?

CamperBob28y ago

The term comes from Ultima Online, one of the first graphical MMORPGs that gained mass-market acceptance. The rationale behind the architecture that Origin used for geographical load balancing was that the independent copy of the game world that resided on each server represented a "shard" of the shattered gem of Mondain the Wizard.

http://www.uoguide.com/Mondain

1 more reply

0xc0018y ago

She shard on a turtle!

j / k navigate · click thread line to collapse

47 comments

AznHisoka8y ago

I find the "you don't want to shard" camp quite annoying. Of course, I don't want to shard! Who does?! It adds complexity, both implementation-wise and operational.

But if you got 5 TB of data, that needs to be in a SSD drive, then please tell me how I can get that into 1 single physical database.

jchanimal8y ago

There is a new generation of relational databases that are native to multi-node operation, and don't require sharding. I'm speaking of tech like Google Spanner and my employer, FaunaDB.

Now you don't have to shard. More info on how we accomplish distributed transactions. https://fauna.com/blog/distributed-consistency-at-scale-span...

brianwawok8y ago

P.S. Google Spanner and FaunaDB both shard. They can call it something else. But unless every node has all data on it, it is sharded.

1 more reply

berns8y ago

The title tag of Spanner home page reads: Cloud Spanner | Automatic Sharding with Transactional Consistency at Scale

dboreham8y ago

_You_ don't have to, but someone does.

1 more reply

sigstoat8y ago

http://www.fixstars.com/en/ssd/ ?

i'm expect IBM could sell you a mainframe that'll do it for whatever capacity you care to name.

user59944618y ago

The thing is that 5TB of company data cannot reasonably be kept in JBOD on the cheapest drives you could find on Amazon.

2 more replies

zbobet20128y ago

You can _easily_ buy a box with 60+TB of SSD...

http://www.dell.com/en-us/work/shop/povw/poweredge-r930

Some of us do need to shard for sure though (I have multi petabyte data sets).

kakwa_8y ago

Storing 60+TB of data is different than searching and doing complex computation on 60TB of data.

Also, operations on a such huge data set can be really painful. Think how to backup a DB like that safely, or how to update the engine.

Some slides (little old, 2014) about a huge postgres instance serving as a backend for leboncoin.fr (main classified advertising website in France).

https://fr.slideshare.net/jlb666/pgday-fr-2014-presentation-...

Basically, they bought the best hardware money could buy at the time to scale vertically, they, in the end, run in some issues and started thinking about sharding this huge DB.

1 more reply

morgo8y ago

I'm the original author of the linked post you refer to.

Perhaps the title is click bait, but at the time I was meeting with a lot of users looking for someone else's problems.

5TB could still easily be single server territory. It depends more on the queries.

My point is just that some workloads are better solved with (some) vertical scaling first.

grw_8y ago

Sorry if this is obvious, but why not use multiple drives?

jajern8y ago

Technically you can. You can run a SSD array in RAID5, 6 or 10, and lose a little bit of performance but get the capacity needed.

metaphorm8y ago

because that is nearly a synonym for sharding?

1 more reply

PaulHoule8y ago

You get one of these

https://petapixel.com/2015/08/15/samsung-16tb-ssd-is-the-wor...

kabdib8y ago

Um, we run many databases of 20-30TB, some well over 100TB. We use SQL Server, and it just allocates more files. It's not zero touch, but with the right storage technology it's not bad, either.

qaq8y ago

gnaritas8y ago

> But if you got 5 TB of data, that needs to be in a SSD drive, then please tell me how I can get that into 1 single physical database.

Drive capacity in a server is not limited to the size of a single drive. You can build a raid array any size you like by simply adding more drives.

ozgune8y ago

Hey everyone, it's Ozgun. When I first wrote this blog post, it was much longer. Based on initial feedback, I edited out parts of it to keep the post focused.

If you have any questions that aren't covered in the post, happy to answer them here!

IpV88y ago

For me a major question I have as I consider sharding is what my application code will look like. Let's say I have a query like:

'select products.name from vendor inner join products on vendor.id = products.vendor where vendor.location = "USA"'

If I shard such that there are many products table (1 per vendor), what would my query look like?

ves8y ago

Your application code shouldn't have sharding concerns in its logic. To achieve this, you should introduce an abstraction layer. One such example is vitess[0], which is used at YouTube.

[0]: http://vitess.io/overview/#features

ozgune8y ago

Reading through your comment again, I realize I completely missed the mark on your question.

If you use Citus, you don't have to make any changes in your application. You just need to remodel your data and define your tables' sharding column(s). Citus will take care of the rest. [1]

[1] Almost. Over the past two years, we've been adding features to make app integration seamless. With our upcoming release, we'll get there: https://github.com/citusdata/citus/issues/595

ozgune8y ago

Thanks for your input (also the_duke)! If time permits, we may come up with a second blog post on this topic.

If I understood your example query, your application serves vendors and each vendor has different products. Is that correct?

You can approach this sharding question in one of two ways.

1. Merge different product tables into one large product table and add a vendor column

2. Model product tables as "reference tables". This will replicate the product tables to all nodes in the cluster

Without knowing more about your application / table schemas, I'd recommend the first approach. I'd also be happy to chat more if you drop us a line.

the_duke8y ago

Same here.

To me it read like just a basic introductory post to a longer series.

ttt1112223338y ago

How is it a benefit that you are no longer able to join data in your separate tables? Is this sentence a mistake?

rsolari8y ago

Thanks for writing the post. Sharding is something I’m consudering at my current job.

How long do these sharding projects usually take? Do you know of any posts that break down the steps in more detail?

ozgune8y ago

When you're looking to shard your B2B database, you usually need to take the following steps:

2. Backfill data to tables that don't didn't have customer_id / tenant_id

4. Migrate your data over to a distributed database. Fortunately, online data migrations are starting to become possible with logical decoding in Postgres.

1 more reply

dcosson8y ago

megamindbrian8y ago

I laugh every time I read that word.

dboreham8y ago

Pottery, right?

CamperBob28y ago

http://www.uoguide.com/Mondain

1 more reply

0xc0018y ago

She shard on a turtle!

j / k navigate · click thread line to collapse