https://ocistatus.oraclecloud.com/#/
I had to confirm the outage using community reports on Downdetector.
https://downdetector.com/status/oracle-cloud/
All of our services, instances and backups for https://searchadsoptimization.com are in Oracle cloud.
This shows a critical issue with relying on a single cloud provider. It's time to build a cross-cloud infrastructure design to handle these issues.
Update: It looks like they updated their “real time” status page after a good 25 minutes of severe outage. My trust in, and assumptions about, real-time status pages have changed completely.
I don't understand the point of real time status pages if they are clearly not real time and not accurate.
My error notifications were blowing up my phone. The first thing I did was check their status page, and I assumed the issue was within my application, but I couldn't even access my backend application. Out of desperation, I had to check Downdetector to confirm the issue. I have formed a new respect for Downdetector.
My shitty, snarky comment aside, I am genuinely curious about why someone would choose Oracle as a cloud provider. If you look at their capex spend, it's undeniable they have so vastly underinvested in their cloud compared to AWS, Azure and GCP, that even if you were an "Oracle shop" I'm genuinely curious what benefits their cloud would offer.
Edit: Just want to say I really do appreciate the responses, lots of good info! I didn't know Oracle cloud offered a decent free tier, will take a look.
But I don't think that's the actual Oracle Cloud play, for the most part. To the best of my awareness, their cloud is realistically focused on hosted applications and SaaS - HCM Cloud, PeopleSoft Cloud, etc.
As such, their customers are not so much folks and small companies whose client-managed VMs may go down. Their customers are more likely to be large corporations whose Enterprise Resource Planning applications are fully hosted and may be impacted - Financials Management, Human Capital Management, Customer Relationships Management, etc.
I think for the most part Oracle Cloud does not have the same sales pitch and does not, for all practical purposes, compete with AWS/Azure/GCP IaaS.
I could be wrong! There might be tons of clients who are renting bare VMs from Oracle! But to your point, I don't know why :P
I'm no fan of Oracle, but that's a good amount of free stuff for my hobby projects.
Engineer attrition rose. Services took longer to build and maintain. Company's stock went down. Layoffs ensued.
But the VP got promoted to SVP.
It's more than their free tier[1]. There are a number of nice things in OCI. For example: Redundant control plane hosts for their non-"free tier" k8s clusters are free. The equivalent of AWS's cross-AZ traffic is free (as opposed to $0.02/GB at AWS); a huge win for certain use cases. They're using an open, platform agnostic "specification" (the Fn Project[2]) for serverless cloud functions, which is wonderful for local dev and test. Terraform is tier 1 with OCI; from documentation through support Terraform is the reference "infrastructure as code" solution on OCI, always comprehensive and robust. Oracle Linux is pretty good; better than Amazon Linux has been, although with AL2023 Amazon is starting to close the gap. OCI instance shapes are very flexible. Overall costs are lower; Oracle is aggressively competing on price. Instance live migration (à la KVM live migrate) is a thing at OCI, so Oracle can live migrate running instances to isolate failing hardware.
I could go on.
Yes, I'd say OCI isn't as stable as AWS. Anecdotally: I get occasional "event notification" in my inbox; perhaps 4 in roughly 2 years, which is fewer (1) than I've seen from AWS in the same time. All but 1 were tangential, didn't actually impact anything that matters, and were quickly resolved. I actually received the OCI notice in my email today before it popped up on HN, which is "different" than how it usually goes with AWS.
[1] My experience is limited to AWS and OCI. [2] https://fnproject.io/
p.s. I appreciate that Oracle has earned the hate it receives from most, and I too have been a victim in my prehistoric past. OCI is, however, different; it's a pay-as-you-go platform that provides Oracle with no customer-abuse opportunities given the strength of the competitors, and my experience with it has been entirely cromulent.
Doesn't make it better, just makes for Vaseline to make the shagging less painful.
It's not something I'd choose right away, but there are use cases where it can be cheaper/free compared to other options.
Bandwidth alone is an order of magnitude cheaper than AWS.
Edit: adding link [1] sorry for the paywall
[1] https://www.wsj.com/articles/the-ai-boom-runs-on-chips-but-i...
Maybe you should consider moving to a major cloud provider that has better services.
* Manage your own VPN and don't depend on the vendor's solution.
* Only use base level services that are available on every cloud you want to leverage. Which probably means you're on a container based system and not using anything like Lambda.
Then a service being in another cloud just doesn't matter. You're always making requests via your VPN anyway, if it's in a local cloud or remote cloud it doesn't matter.
You'll feel the egress pricing if you integrate cross cloud services that are chatty. So it really depends on your immediate goals.
If you just want redundancy then you need to keep the resources standing in the backup cloud/AZ and just move your entry point via DNS, in which case you don't really need "cross cloud", just the ability to provision elsewhere.
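As a rough illustration of the "move your entry point" idea, here is a minimal Python sketch that picks the first healthy endpoint from a priority-ordered list. The endpoint URLs, the `/health` path, and the function names are all hypothetical, and the actual DNS record update is provider-specific and deliberately left out.

```python
import urllib.error
import urllib.request
from typing import Callable, Optional, Sequence


def http_healthy(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        # Connection errors, timeouts, HTTP errors and malformed URLs all count as "down".
        return False


def choose_entry_point(
    endpoints: Sequence[str],
    probe: Callable[[str], bool] = http_healthy,
) -> Optional[str]:
    """Return the first endpoint (in priority order) that passes the probe, else None."""
    for url in endpoints:
        if probe(url):
            return url
    return None


# Hypothetical endpoints: primary cloud first, standby second.
ENDPOINTS = [
    "https://primary.example.invalid/health",
    "https://standby.example.invalid/health",
]
```

Whatever `choose_entry_point(ENDPOINTS)` returns is the target your DNS record would be pointed at; keep the record's TTL short, or the switch will not take effect quickly.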
If you're operating at large scale you should have enough control over your own infrastructure to distribute load to multiple providers. Then if one of them is down, spin up more instances on another one.
If you're operating at small scale, store your backups on another provider, and periodically test that you can quickly restore them to that provider.
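A restore drill along these lines could be sketched as below. This is only an assumption-laden outline: the local file copy stands in for downloading the backup from the second provider, and `restore_drill`, `sha256_of`, and the time budget are hypothetical names and values, not anything from a specific tool.

```python
import hashlib
import shutil
import time
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_drill(
    backup: Path,
    restore_to: Path,
    expected_sha256: str,
    max_seconds: float,
) -> bool:
    """Restore a backup and check that it is intact and arrived within the time budget."""
    start = time.monotonic()
    # Stand-in for "download from the other provider"; swap in the real transfer here.
    shutil.copyfile(backup, restore_to)
    elapsed = time.monotonic() - start
    intact = sha256_of(restore_to) == expected_sha256
    return intact and elapsed <= max_seconds
```

Recording the checksum at backup time and running the drill on a schedule tests both halves of the advice: that the backup is usable, and that restoring it is quick.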
This isn't just about redundancy. Doing this is necessary to keep you from getting locked in.
Sure, but that's not what people mean when they say cross cloud. It usually implies running active workloads.
Orchestrating hundreds of workloads that each depend on one another across several clouds, and having them reorganize upon a failure, introduces a lot of new failure modes.
FYI this is why we show real-time status on https://heiioncall.com/status including the time of the last inbound check-in or last HTTP probe.
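To illustrate why the timestamp matters, here is a small Python sketch of a status label that always shows the age of the last check-in and flags it as stale past a threshold. This is not how any particular status page is implemented; `status_label` and the two-minute threshold are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone


def status_label(
    last_check_in: datetime,
    now: datetime,
    max_age: timedelta = timedelta(minutes=2),  # assumed freshness threshold
) -> str:
    """Describe a status reading together with how old its last check-in is."""
    age_seconds = int((now - last_check_in).total_seconds())
    state = "OK" if now - last_check_in <= max_age else "STALE"
    return f"{state} (last check-in {age_seconds}s ago)"


now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
print(status_label(now - timedelta(seconds=30), now))   # OK (last check-in 30s ago)
print(status_label(now - timedelta(minutes=25), now))   # STALE (last check-in 1500s ago)
```

A page that had not been updated for 25 minutes would show as stale here instead of silently reporting green.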
I'm a fan of sticking with one provider, but going with something bigger that has a good track record. AWS, GCP, and Azure aren't immune to outages, but I think for almost all companies, having redundant stacks in separate regions is enough to maintain high availability.
I don't know enough about Oracle Cloud to comment on them, but my general take is these companies all inevitably hit a "showstopper" global outage, realize they aren't investing enough in separation of regional stacks, and put a ton of energy into making their platforms more fault tolerant.
Thinking that Johnny dev shop is going to be able to do better than a major player is, IMO, wishful thinking.
I know that at GCP at least, they actually have monitoring set up for things like tweets, Downdetector, etc. Ideally they catch every issue with their own monitoring, but they do their best to know if anyone is having an issue, whether they can detect it or not.
We've identified a cooling system issue affecting multiple services in the US East (Ashburn) region. Our engineers are actively working to mitigate the issue.
My cousin mentioned their ERP was down mid-day, and I laughed citing HN like "oh yeah, forgot you're a poor bastard Oracle user." It was entirely dead, like everything apparently, most of the day. Sadly the financial people don't care, they will still cut a check to daddy Ellison monthly.
At least one large California municipality I worked with made a multi-year concerted effort to abandon the misery that is Oracle ERP. That said, I never heard how that venture panned out with the replacement. Something about out of the frying pan and into the fire comes to mind.
> Oracle Cloud Infrastructure Customer,
> Engineers and the colocation partner have successfully installed additional cooling systems to reduce ambient temperatures and mitigate the issue affecting multiple Oracle Cloud Infrastructure (OCI) services in the US East (Ashburn) region. We will continue to closely monitor this situation.
2. The previous point is very often ignored, and an outage is only made public when major clients or news organizations take notice and inquire.
It was before Oracle cloud but I literally was told to do things like that.