I remember we had a power outage in 2006 that actually took one of my services off air. That has since been rectified, of course, and the loss of a building wouldn't impact any of the critical, essential or important services I provide.
Source? Has there ever been an industry-wide survey that compares availability at "insert average colo/data center operation" with the cloud providers?
And I'm not talking about "we have 12 SREs who are based in Cupertino and are all paid top dollar to support a colo"... I'm talking about the average.
I worked through the ranks at a large enterprise that ran a “big” datacenter for a decade. The facilities team was about 6 people, average salary around $90k. I can only remember one power interruption affecting more than a rack, caused by a failure during a maintenance event that required a shutdown for safety reasons. The rest is like any other industrial facility: you have service contracts for the equipment, and you maintain things.
There’s a cost/capability curve that you need to plan around for these decisions, and you need to make business and engineering calls based on your actual circumstances. If the answer is automatically “AWS <whatever>”, you’re making a decision to burn dollars for convenience.
Ok, so that's 6 × $90k = $540k in salaries, plus benefits, so ~$700k fully loaded (rough sketch of that arithmetic after this list). Then you have transaction costs:
- Annual salary increases
- Any cost associated with people leaving (severance, hiring, recruiters, HR, HR systems)
- Systems that run in the data center (logging, monitoring, etc.)
- Procurement overhead as hardware costs shift (silicon shortages, etc.)
- Security compliance overhead and associated risks
- Finance resources required to capitalize and manage asset allocation
- etc. etc.
Versus
- Click a button and, voilà, it works.
- Hire far fewer engineers to handle the system administration side
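For concreteness, here's a minimal back-of-envelope sketch of just the staffing line in that comparison. The headcount and salary come from the parent comment; the benefits multiplier is an assumption for illustration, and none of the other line items (attrition, procurement, compliance) are modeled.

```python
# Back-of-envelope staffing cost for a small facilities team.
# HEADCOUNT and AVG_SALARY come from the parent comment; the
# benefits multiplier is an illustrative assumption, not real data.

HEADCOUNT = 6
AVG_SALARY = 90_000
BENEFITS_MULTIPLIER = 1.3  # assume ~30% overhead for benefits/payroll taxes

def fully_loaded_cost(headcount: int, avg_salary: float,
                      multiplier: float = BENEFITS_MULTIPLIER) -> float:
    """Rough annual cost of keeping the team on payroll."""
    return headcount * avg_salary * multiplier

base = HEADCOUNT * AVG_SALARY
loaded = fully_loaded_cost(HEADCOUNT, AVG_SALARY)
print(f"Salaries:     ${base:,.0f}")     # Salaries:     $540,000
print(f"Fully loaded: ~${loaded:,.0f}")  # Fully loaded: ~$702,000
```

Either way the point stands: the sticker salary number understates the real cost, and everything in the first list piles on top of that ~$700k.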
> If the answer is automatically “AWS <whatever>”, you’re making a decision to burn dollars for convenience.
100% AGREE. The answer is always "it depends", but just as "just put it in the cloud" is an oversimplification, the opposite, "well, it worked for us with a data center", isn't that simple either.
I’ve been deploying to AWS for years and can’t remember an outage on their side in my region. But this is anecdotal and doesn’t necessarily reflect the statistics.
It is as if the software industry has collectively forgotten how to run basic data center operations. Something that used to be a blue-collar skill is now treated like arcane magic.
It's not arcane magic. It's undifferentiated toil that requires hiring for a different skill set than tech companies generally want to hire for. Of course when you get to a certain size it may make sense to take on this cost.
I want us to stop pretending that individual actors lack the agency to make their own decisions, or that they're all blind to AWS charging them a fortune for simple things they could do themselves. Either you get value from AWS or you stop using AWS.
And what rate is this, exactly? Cloud outages get attention because they impact more people at once, but AWS/GCP/Azure uptime is still better than what I've seen from small and mid-size businesses trying to manage their own infrastructure.
So again, we're talking about cloud providers because of their scope and size, and they're still doing better than MidSizedBank managing its own infrastructure.
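One way to make "rate" concrete: convert availability percentages into downtime budgets, which is the number you'd actually need from both sides to settle this. This is standard arithmetic; the percentages below are examples, not any provider's published SLA.

```python
# Convert an availability percentage ("nines") into allowed downtime.
# Standard arithmetic; the example percentages are not any provider's SLA.

MINUTES_PER_YEAR = 365.25 * 24 * 60  # ~525,960

def downtime_minutes_per_year(availability_pct: float) -> float:
    """Minutes of downtime per year implied by a given availability."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)

for pct in (99.0, 99.9, 99.99, 99.999):
    print(f"{pct:>7}% -> {downtime_minutes_per_year(pct):8.1f} min/year")

#    99.0% ->   5259.6 min/year (~3.7 days)
#    99.9% ->    526.0 min/year (~8.8 hours)
#   99.99% ->     52.6 min/year
#  99.999% ->      5.3 min/year
```

Until someone publishes those numbers measured the same way for the average colo and for the big clouds, both sides of this thread are comparing anecdotes.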