We originally left GitLab for GitHub after being bitten by a major outage that resulted in data loss. Our code was saved, but we lost everything else.
But that was almost 10 years ago at this point.
My only real current complaint is that the webhooks that are supposed to fire on repo activity have been a little flaky for us over the past 6-8 months. We have a pretty robust chatops system in place, so these things are highly noticeable to our team. Delivery is generally consistent, but we’ve had hooks fail to post to our systems on a few different occasions, which forced us to chase down threads until we determined that our operator ingestion service never even received the hooks.
That aside, we’re relatively happy customers.
They are pretty good, in my experience, at *eventually* delivering all updates. The outages take the form of a "pause" in delivery, every so often... maybe once every 5 weeks?
Usually the outages are pretty brief but sometimes it can be up to a few hours. Basically I'm unaware of any provider whose webhooks are as reliable as their primary API. If you're obsessive about maintaining SLAs around timely state, you can't really get around maintaining some sort of fall-back poll.
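For what it's worth, a fall-back poll doesn't have to be much. Here's a rough sketch of the idea, not anyone's actual implementation: it assumes every event carries a unique "id" field and that you have some way to list recent events from the provider's regular API. `fetch_recent` and `handle` are hypothetical callables you'd supply yourself.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable


@dataclass
class Reconciler:
    # Hypothetical: returns recent events from the provider's regular API.
    fetch_recent: Callable[[], Iterable[dict]]
    # Hypothetical: whatever your system does with an event.
    handle: Callable[[dict], None]
    # IDs of events already processed (assumes each event has a unique "id").
    seen: set = field(default_factory=set)

    def on_webhook(self, event: dict) -> None:
        """Process an event once, whether it arrived by webhook or by poll."""
        if event["id"] in self.seen:
            return  # duplicate or late delivery, already handled
        self.seen.add(event["id"])
        self.handle(event)

    def poll_once(self) -> int:
        """Catch up on anything the webhooks missed; returns how many were recovered."""
        recovered = 0
        for event in self.fetch_recent():
            if event["id"] not in self.seen:
                self.on_webhook(event)
                recovered += 1
        return recovered
```

You'd call `poll_once()` on a timer. Because both paths go through the same dedup check, a webhook that shows up after the "pause" ends is simply ignored. In real use the `seen` set would need to live in a persistent store and be pruned, which this sketch leaves out.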
[0] https://status.gitlab.com/pages/history/5b36dc6502d06804c083...
Haven't seen any outage from GitLab in like, ever.
To GitLab's credit their observability seems to be good, and they do a good job communicating and resolving incidents quickly.
Some companies that shall not be named have status pages that always show green and might as well be a static picture. Some use words like "some customers may have experienced partial service degradation" to mean "complete downtime". GitLab also has incidents, but they're a lot more trustworthy. You can just open the issue tracker and there's the full incident, complete with diagnosis.
PS: None of our 40+ engineers felt anything, our self-hosted Forgejo is as snappy as ever.
Or whatever else, software services going down is going to happen in some capacity, eventually. The real question is what is acceptable.
There were so many severe GitHub Actions outages (10+ ?) in the past year. Cause: Migration to the disaster zone also known as Azure, I assume. Most of them happened during (morning) CET working hours, so as not to inconvenience the Americans and/or make headlines.
Money doesn't buy competency. It's a long-term culture thing. You can never let up on maintaining competency in your organization. It rots if you do. I guess Microsoft did let up.
GitHub as a whole, including the previously non-Azure bits, does seem flakier than a few years ago though, for sure.
“Referring sites and popular content are temporarily unavailable or may not display accurately. We're actively working to resolve the issue.”
It’s been like that for months now with no sign of anyone working on it. They just don’t care about user experience anymore.
[1] https://thenewstack.io/github-will-prioritize-migrating-to-a...
Honestly, I don't know half the features they have added because the surface is huge at this point. Everyone seems to be using a (different) subset of them anyway.
So a feature freeze isn't likely to have much impact on me.
EDIT: went and checked - https://github.blog/news-insights/github-is-moving-to-racksp... not sure if they moved again before the MS acquisition though.
> GitHub is currently hosted on the company’s own hardware, centrally located in Virginia
I imagine this predates their acquisition by Microsoft. Honestly, given how often GitHub seems to be down compared to the level of dependency people have on it, this might be one of the few cases where I might have understood if Microsoft embraced and extended a bit harder.
[1]: https://www.theverge.com/tech/796119/microsoft-github-azure-...
Absolutely not.
and/or has to pay the SLA out of their budget
Are they using AI agents this time to resolve the outage? Probably not.
But this time, there is no CEO of GitHub to contact and good luck contacting Satya to solve the outage.
Any time their startup competitors are making too much progress they can just push the "GitHub incident" button and slow everyone down.
I'm not sure the new school cares nearly as much. But then again, this is how companies change as they mature. I saw this with StubHub as well. The people who care the most are the initial employees; employee #7291 usually dgaf.
I simply want to survive. I'll kiss ass where I have to, but not to people I don't work on behalf of.
Though yeah, for startups who depend on GitHub for CI and CD, it's been noticeable how absurdly unreliable GitHub has become over the years.