• 9 web servers
• 4 SQL servers
• 2 Redis servers
• 3 tag engine servers
• 3 Elasticsearch servers
• 2 HAProxy servers
That comes to 23. I know “a couple” is sometimes used to mean more than two, but… not that much more than two.
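The tally above is easy to sanity-check by summing the listed counts (all numbers taken directly from the list):

```python
# Server counts as listed in the parent comment
servers = {
    "web": 9,
    "SQL": 4,
    "Redis": 2,
    "tag engine": 3,
    "Elasticsearch": 3,
    "HAProxy": 2,
}

total = sum(servers.values())
print(total)  # 23
```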
“A couple” is just flat-out wrong; my guess is that he’s misinterpreting old figures, taking numbers from no later than about 2013 on how many web servers (ignoring the other types, which now make up more than half) they needed to cope with the load, and ignoring the many additional servers they keep for headroom, redundancy, and future-readiness.
If someone says SO runs on a couple of servers, this might refer to the number actually necessary to run it under full traffic, not the number of servers they use in production. That's a more useful comparison if the question is purely about performance, but less useful if you're comparing the cost of operating the entire thing.
Then there are support services (IIRC, Elasticsearch was all non-functional-requirements stuff, and the site could technically run without it?) and HA.
That is still doable with mid-90s-era hand management of servers (all named after characters in Lord of the Rings).
Not that you should, but you could.
And the growth rate must be very low, which makes it pretty easy to plan out your OS and hardware upgrade tempo.
And it was actually possible to manage tens of thousands of servers before containers. The only thing you really need is what's now called a "cattle, not pets" mentality.
What you lose is the flexibility of programmatically shifting software onto other hardware for scaling and failover, and you'll need to overprovision somewhat, but even if half of SO's infrastructure is "wasted," that isn't a lot of money.
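As a back-of-envelope sketch of that claim, with all figures being hypothetical assumptions for illustration rather than Stack Overflow's actual costs:

```python
# Rough overprovisioning cost estimate.
# ALL figures are hypothetical assumptions, not SO's real numbers.
servers = 23
monthly_cost_per_server = 500  # assumed colo lease + amortized hardware, USD

# Suppose fully half the fleet is "wasted" headroom/redundancy.
wasted_monthly = (servers / 2) * monthly_cost_per_server
wasted_yearly = wasted_monthly * 12
print(f"${wasted_yearly:,.0f} per year")  # $69,000 per year at these assumptions
```

Even under generous per-server cost assumptions, the "waste" stays in the tens of thousands of dollars a year, which is small next to the engineering cost of a large distributed deployment.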
And if they're running that hardware lean, in racks in a leased datacenter, and they're not writing large checks to VMware/EMC/NetApp for anything, then they'd probably spend 10x the money microservicing everything and shoving it all into someone's Kubernetes cloud.
In most places, though, this approach fails due to resume-driven design, and you wind up with a lot of sprawl because managers don't say no to overengineering. So at SO there must be at least one person in management with a frugal vision of how to engineer software and hardware. Once they leave, or that culture changes, the footprint will eventually start to explode.
Also, 23 is very much "a couple" for a company and application of that size. It's not uncommon to see several hundred or even thousands of nodes deployed by comparable sites.
They aren't magically more efficient than other sites. They just chose to scale vertically instead of horizontally.
It's certainly not magic, but rather good architecture decisions and solid engineering. This includes choosing SQL Server over other databases (especially when they started), using ASP.NET server-side as a monolithic app with a focus on fast rendering, and yes, scaling vertically on their own colo hardware. The overall footprint for the scale they serve is very small.
It's the sum of all these factors together, and it absolutely makes them more efficient than many other sites.
* Maybe a bit more or less, since it's not clear to me whether the DB RAM figure is per server or per cluster. Likely per server, as with the other server types. There is also no data on how big their HAProxy boxes are.
They have a typical vertically scaled infrastructure: most services have just two nodes, one active. The biggest machines are the databases, which many companies handle "the classic way" anyway. Clearly it's not designed as microservices and doesn't need dynamic automation at all. Why on earth would they even bring k8s up in their plans?
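That "two nodes, one active" pattern can be expressed directly in HAProxy, which they already run; a minimal sketch, where the backend name, hostnames, IPs, ports, and health-check path are all hypothetical examples:

```
# haproxy.cfg fragment: two backend nodes, one active, one standby.
# Names, addresses, and the /health path are illustrative, not SO's config.
backend tag_engine
    option httpchk GET /health
    server tag1 10.0.0.11:8080 check         # active node, takes all traffic
    server tag2 10.0.0.12:8080 check backup  # standby, used only if tag1 fails its checks
```

The `backup` keyword means the second server receives traffic only when every non-backup server in the backend is down, which is exactly the active/passive failover described, with no orchestration layer required.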