The AWS team keeps touting the rock-solid reliability of AWS as a reason why we shouldn’t diversify our cloud. Should be a fun meeting!
I'd suggest to ++double the cost. Compare:
++double: spoken as "triple" -> team says that double++ was a joke, we can obviously only double the cost -> embarrassingly you quickly agree -> team laughs -> team approves doubling -> you double the cost -> team goes out for beers -> everyone is happy
double++: spoken as "double" -> team quickly agrees and signs off -> you consequently triple the cost per C precedence rules -> manager goes ballistic -> you blithely recount the history of C precedence in a long monotone style -> job returns EINVAL -> beers = 0
This is all very, very hand-wavy. And if one says "golly gee, all our config is too cloud specific to do multi-cloud" then you've figured out why cloud blows and that there is no inherent reason not to have API standards for certain mature cloud services like serverless functions, VMs and networks.
Edit to add: I know how grossly simplified this is, and that most places have massively complex systems.
If you have an app that experiences 1000x demand spikes at unpredictable times then sure, go with the cloud. But there are a lot of companies that would be better off if they seriously considered their options before choosing the cloud for everything.
Yep. Although it's just anecdata, it's what we do where I work - haven't had the slightest issue in years.
It's amazing how few problems we have. Honestly, I don't think we have to worry about configuration issues as often as people who rely on the cloud.
Yes, mostly.
Maybe those who have been around longer have seen this before, but it's the first time for me.
I found this summary:
https://fortune.com/2025/07/31/amazon-aws-ai-andy-jassy-earn...
And the transcript (there’s an annoying modal obscuring a bit of the page, but it’s still readable):
https://seekingalpha.com/article/4807281-amazon-com-inc-amzn...
(search for the word “tough”)
The best advice I can give to any org in AWS is to get out of us-east-1. If you use a service whose management layer is based there, make sure you have break-glass processes in place or, better yet, diversify to other services entirely to reduce/eliminate single points of failure.
This is not a new issue caused by improper investment, it's always been this way.
It's both the oldest and largest (most EC2 hosts, most objects in S3, etc.) AWS region, and because of that it's the region most likely to encounter an edge case in prod.
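For what it's worth, the "break-glass" idea can be as simple as an ordered list of regions and a health probe. Here is a minimal sketch, not anyone's real setup: `first_healthy_region`, `PREFERRED_REGIONS` and `fake_probe` are made-up names for illustration, and the probe is a stand-in for whatever real check you would run against each region.

```python
from typing import Callable, Iterable, Optional


def first_healthy_region(
    regions: Iterable[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first region whose health probe succeeds, or None."""
    for region in regions:
        try:
            if is_healthy(region):
                return region
        except Exception:
            # A probe that raises is treated the same as an unhealthy region.
            continue
    return None


# Hypothetical preference order: primary outside us-east-1, with fallbacks.
PREFERRED_REGIONS = ["us-west-2", "eu-west-1", "us-east-1"]


def fake_probe(region: str) -> bool:
    # Stand-in for a real health check; pretends us-west-2 is down.
    return region != "us-west-2"


print(first_healthy_region(PREFERRED_REGIONS, fake_probe))  # eu-west-1
```

The point is only that the fallback order is decided ahead of time rather than during the outage. It does nothing for services whose management layer lives in a single region.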
This is not and never was true. I've done setups in the past where monitoring ran "multi-cloud", along with multiple dedicated servers. It was pretty broad, so you could actually see where things broke.
It was quite some time ago so I don't have the data, but AWS never came out on top.
It actually matched largely with what netcraft.com put out. Not sure if they still do that and release those things to the public.
It really is a single point of failure for the majority of the Internet.
Why would a third party be in your product's critical path? It's like the old business school thing about "don't build your business on the back of another".
The reason third-party things are in the critical path is that most of the time, they are still more reliable than self-hosting everything; they're cheaper than anything you can engineer in-house; and no app is an island.
It's been decades since I worked on something that was completely isolated from external integrations. We do the best we can with redundancy, fault tolerance, auto-recovery, and balance that with cost and engineering time.
If you think this is bad, take a look at the uptime of complicated systems that are 100% self-hosted. Without a Fortune 500 level IT staff, you can't beat AWS's uptime.
I bet only 1-2% of AI startups are running their own models and the rest are just bouncing off OpenAI, Azure, or some other API.
Good luck naming a large company, bank, even utility that doesn't have some kind of dependency like this somewhere, even if they have mostly on-prem services.
If it's an internal "AWS team", then this translates to "I am comfortable using this tool, and am uninterested in having to learn an entirely new stack."
If you have to diversify your cloud workloads, give your devops team more money to do so.