As in, "we have a PHP monolith used by all of 12 people in the accounting department, and for some reason we've been tasked with making it run on multiple machines ("for redundancy" or something) by next month.
The original developers left to start a Bitcoin scam.
Some exec read about the "cloud", but we'll probably get just enough budget to buy a coffee for an AWS salesman.
Don't even dream of hiring a "DevOps" to deploy a Kubernetes cluster to orchestrate anything. Don't dream of hiring anyone, actually. Or paying for anything, for that matter.
You had one machine; here is a second machine. That's a 100% increase in your budget; now go get us some value with that!
And don't come back in three months to ask for another budget to 'upgrade'."
Where would someone start?
(EDIT: To clarify, this is a tongue-in-cheek, hyperbolic scenario, not a cry for immediate help. Thanks to all who offered help ;)
Yet, I'm curious about any resource on how to attack such problems, because I can only find material on how to handle large-scale, multi-million-user, high-availability stuff.)
Usually, your monolith has these components: a web server (Apache/nginx + PHP), a database, and other custom tooling.
> Where would someone start?
I think a first step is to move the database to something managed, like AWS RDS or Azure Managed Databases. Herein lies the basis for scaling out your web tier later. And here you will find the most pain because there are likely: custom backup scripts, cron jobs, and other tools that access the DB in unforeseen ways.
If you get over that hump, you have taken your first big step towards a more robust model. Your DB will have automated backups, managed updates, failover, read replicas, etc. You may or may not see a performance increase, because you have effectively split your workload across two machines.
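For the mechanics of the move itself, a minimal sketch, assuming the app runs on MySQL and the managed instance is already provisioned (the endpoint, user, and database names below are placeholders; for Postgres it's the same dance with pg_dump/pg_restore):

    # Dump schema + data from the old box without locking everything.
    mysqldump --single-transaction --routines --triggers \
        -u appuser -p appdb > appdb.sql

    # Load the dump into the managed instance (placeholder RDS endpoint).
    mysql -h appdb.abc123.eu-west-1.rds.amazonaws.com \
        -u appuser -p appdb < appdb.sql

    # Then point the PHP config at the new host, and go hunting for every
    # cron job and backup script that still talks to localhost.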
_THEN_ you can front your web tier with a load balancer, i.e. you load balance to one machine. This gives you: better networking, custom error pages, support for sticky sessions (you likely need them later), and better/more monitoring.
From there on you can start removing those custom scripts from the web-tier machine and start splitting this into an _actual_ load-balanced infrastructure, going to two web-tier machines, where traffic is routed using sticky sessions.
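For the sticky-session piece, a minimal sketch of what the front config might look like, assuming plain nginx doing the balancing; the IPs are made up, and ip_hash is just the simplest form of stickiness (a cookie-based method or a cloud load balancer's own session affinity would do the same job):

    # Minimal load-balancer config for two PHP backends (example IPs/paths).
    cat > /etc/nginx/conf.d/app.conf <<'EOF'
    upstream php_app {
        ip_hash;                 # crude stickiness: same client IP -> same backend
        server 10.0.0.11:80;     # web-tier machine 1
        server 10.0.0.12:80;     # web-tier machine 2
    }
    server {
        listen 80;
        location / {
            proxy_pass http://php_app;
            proxy_set_header Host $host;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }
    }
    EOF
    nginx -t && systemctl reload nginx    # validate, then reload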
Depending on the application design you can start introducing containers.
Now, this approach will not give you a _cloud-native awesome microservice architecture_ with CI/CD and devops. But it will be enough to have higher availability and more robust handling of the (predictable) load in the near future. And along the way, you will remove bad patterns, which will eventually allow you to move to a better approach.
I would be interested in hearing if more people face this challenge. I don't know if guides exist around this on the webs.
Here's the CliffsNotes version for your situation:
1. Build a server. Make an image/snapshot of it.
2. Build a second server from the snapshot.
3. Use rsync to copy files your PHP app writes from one machine ('primary') to another ('secondary'). (A rough sketch of this step and the step-5 cutover follows the list.)
4. To make a "safe" change, change the secondary server, test it.
5. To "deploy" the change, snapshot the secondary, build a new third server, stop writes on the primary, sync over the files to the third server one last time, point the primary hostname at the third server IP, test this new primary server, destroy the old primary server.
6. If you ever need to "roll back" a change, you can do that while there's still three servers up (blue/green), or deploy a new server with the last working snapshot.
7. Set up PagerDuty to wake you up if the primary dies. When it does, point the primary hostname at the second box's IP.
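A minimal sketch of steps 3 and 5, assuming the app writes its files under /var/www/app/uploads and that 'secondary' and 'newbox' are SSH-reachable names (all of these are made-up examples):

    # Step 3: from the primary, keep the secondary's copy of app-written files
    # fresh (run this from cron every few minutes).
    rsync -az --delete /var/www/app/uploads/ secondary:/var/www/app/uploads/

    # Step 5 (cutover), also run on the primary: stop writes, sync one last time,
    # then repoint the primary hostname's DNS record at the new box's IP.
    systemctl stop php-fpm        # or flip the app into maintenance mode
    rsync -az --delete /var/www/app/uploads/ newbox:/var/www/app/uploads/
    # Update the DNS A record (console or your provider's CLI), test the new
    # primary, then destroy the old box.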
That's just one way that is very simple. It is a redundant active/passive distributed system with redundant storage and immutable blue/green deployments. It can be considered high availability, although that term is somewhat loaded; ideally you'd make as much of the system HA as possible, such as independent network connections to the backbone, independent power drops, UPS, etc. (both for bare metal and VMs).
You can get much more complicated but that's good enough for what they want (redundancy) and it buys you a lot of other benefits.
Having said that, I have done something very similar for large pools of terminal services session hosts. (Think of a Windows box with a special license that allows multiple remote connected desktop users, and 100 pre-installed GUI applications.)
For web apps, you almost always want either of the following:
- A central file share or NFS mount of some sort, with the servers mounting it directly. Ideally with a local cache that can tolerate file server outages and continue in read-only mode. These days I use zone-redundant Azure File Shares for that. They're fully managed and scale to crazy levels. On a small scale they're so cheap that they're practically free, but have the same high availability as a cluster of file servers in multiple data centres! This is a good approach if your web app writes files locally in normal operation. If you need to distribute an app like this without rewriting that aspect, a central file share is the easy way.
- An automated deployment from something like Azure DevOps pipelines or GitHub Actions that builds VMs one at a time. Both are free in most small-scale scenarios. (For PHP, deployment is just a file copy, so a bash script triggered from a management box is sufficient!) The problem with the "sync stuff around" approach is that corruption gets copied around too. Small one-time mistakes become "sticky" and never undo themselves. Junk files accumulate, eventually causing problems. This method solves that.
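As a concrete (hypothetical) sketch of the "deployment is just a file copy" idea, assuming a management box with SSH access to the web hosts and a clean checkout of the release in ./release (all names are examples):

    #!/usr/bin/env bash
    # Naive rolling deploy for a PHP app: copy a clean checkout to each host in turn.
    set -euo pipefail

    RELEASE_DIR=./release      # fresh checkout/build of the app
    HOSTS="web1 web2"          # web-tier machines

    for host in $HOSTS; do
        # --delete keeps hosts identical to the release, so junk files never accumulate
        rsync -az --delete "$RELEASE_DIR"/ "$host":/var/www/app/
        ssh "$host" 'systemctl reload php-fpm'   # pick up the new code/config
        # (optionally: curl a health-check URL here before moving to the next host)
    done

This is the same idea the pipelines automate, just without the pipeline.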
Additionally, in all modern clouds you can run "plain" virtual machines in scale sets, where the instances can be scaled out. The scaling part is actually not so important! The key bit is that this will force you to fully automate the VM deployment process, including base OS image updates. Rolling upgrades become easy. Similarly, you can undo the damage done by a malware attack by simply scaling to zero, and then scaling back up. This approach is totally stateless, so you don't need to worry about backing up the VMs. Just rebuild on demand.
But all of that is just a lot of manual labour. It's much easier to host simple apps on a managed platform like Azure App Service, which takes care of all of this. The low-end tiers are cheaper than a pair of VMs.
I have to admit, there's something about this comment that makes me sad in a way. Not to say that there's anything inherently wrong with this question or to say that I disagree with you exactly. It's just that I like the idea of computing / hacking being centered more around a mindset of limitless possibilities, exploration, questioning the boundaries of what can be done (as opposed to what should be done?), and not something that's caught up in drudgery like budgets, schedules, and "business stuff."
Sorry, guess I'm just feeling nostalgic for a minute or something (maybe because I've been watching that 5 hour long interview Lex Fridman did with Carmack) and am flashing back to what computing was to me when I first got involved. Back in those days, a paper / book like this would have evoked a "WOW, HOW F%#@NG COOL IS THIS!??!!????" reaction from me. And I guess it still kinda does in a weird sort of way, even though I also have to deal with budgets, schedules, and the drudgery of the business world. sigh
Then you can follow along with parts 2, 3 & 4 to scale up by factors of ~10 or more:
https://aws.amazon.com/blogs/startups/scaling-on-aws-part-2-... https://aws.amazon.com/blogs/startups/scaling-on-aws-part-3-...
My pet peeve with distributed and ops books is that they usually start by laying out all those problems, but then move on to either:
- explain how Big Tech has even bigger problems, before explaining how you can fix Big Tech problems with Big Tech budgets and headcount by deploying just one more layer of distributed cache or queue that virtually ensures your app is never going to work again (That's "Designing Data-Intensive Applications", in bad faith.)
- or, not really explain anything, wave their hands chanting "trade-offs, trade-offs", and start telling kids' stories about Byzantine Generals.
Like a bonsai tree. There’s a point where you’ve written enough helpers (complete with tests) and abstracted enough logic away from the views that you’re suddenly able to rapidly refactor all of the crap that’s left, and when you’re done the resulting codebase can be easily distributed or scaled.
So I’d start by just breaking the data away from the logic, and then breaking that data away from the database, with the idea being to use a Redis server as your app’s data model, which you can call some function to sync to the database from time to time.
Then build an event logger that encompasses everything (every interaction, at least) that happens on the front end (this is trivial with JavaScript on* event handlers.)
Then spin up two nodes of it and write some function that merges two of these event trees (sorting by timestamp, plus a bias you pick for when two events happen at the same time).
It won’t scale to 1000 users, and you’ll find kinks to work out along the way. But this is a good start.
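For the merge step, a minimal sketch of the idea, assuming each node appends events as one line per event in the form "unix_timestamp node_id payload" (a format made up here for illustration); sorting by timestamp and using the node id as the tie-break bias means both nodes arrive at the same final order:

    # Merge two already-sorted event logs: order by timestamp (field 1),
    # break ties by node id (field 2) so every node agrees on the result.
    sort -m -k1,1n -k2,2 node_a.log node_b.log > merged.log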
For your problem, you can start by configuring nginx to work as a load balancer and spinning up a second VM with the PHP app.
Also, philosophically, I guess, a "distributed" system starts at "two machines". (And you actually get most of the "fun" of distributed systems with "two processes on the same machine".)
We're taught how to deal with "N=1" in school, and "N=all fans of Taylor Swift in the same second" at FAANGs.
Yet I suspect most people will be working on "N=12, 5 hours a day during office hours, except twice a year." And I'm not sure what the reference techniques for that are.
You have it backwards. Salesmen will usually buy you the coffee. Even if you don't have the budget today, they still have an expense account and will usually buy you coffee.
Seriously. Most of the difficulty of distributed systems is because you're actually having to manage the flow of information between distinct members of a networked composite. Every time someone is out of the loop, what do you do?
Can you tell if someone is out of the loop? What happens if your detector breaks?
Try it with your coworkers. You have to be super serious about running down the "but how did you know" parts.
Once you have a handle on the ways you trip, go hit the books, and learn all the names for the SNAFUs you just acted out.
I find this comment highly ignorant. The need to deploy a distributed system is not always tied to performance or scalability or reliability.
Sometimes all it takes is having to reuse a system developed by a third party, or consume an API.
Do you believe you'll always have the luxury of having a single process working on a single machine that does zero communication over a network?
Hell, even a SPA calling your backend is a distributed system. Is this not a terribly common use case?
Enough about these ignorant comments. They add nothing to the discussion and are completely detached from reality.
My point is precisely that transitioning from a single app on a single machine is a natural and necessary part of a system's life, but that I can't find satisfying resources on how to handle this phase, as opposed to handling much higher load.
Sorry for the missed joke.
Step 2: There is no step 2.
Notes on CPSC 465/565: Theory of Distributed Systems [pdf] - https://news.ycombinator.com/item?id=11911402 - June 2016 (9 comments)
> These are notes for the Fall 2022 semester version of the Yale course CPSC 465/565 Theory of Distributed Systems
There are a lot of algorithms, but I don't see CRDTs mentioned by name. Perhaps it's most closely related to "19.3 Faster snapshots using lattice agreement"?
Wrong level of abstraction. This is clearly a lower level course than that and discusses more fundamental ideas.
A quickie look through chapter 6 reminds me of CRDTs, at least the vector clock concept. Other bits from other parts of this course probably need to be combined into what would be called a CRDT.