They've had a long history of this kind of stability issue when migrating or trying to migrate acquisitions from their previous stack to an MS one. This happened with Hotmail (Unix server -> Windows server), LinkedIn (custom cloud -> MS cloud) and others since.
Has it?
I’ve had hardly any problems. Occasional issues, but rarely have I been impacted to the extent that I notice for more than, say, an hour. Maybe I notice it a couple of times a year.
My internet access at home is more likely the issue when I hit GitHub issues.
Probably because there have been more high-profile stories of companies migrating off Ruby on Rails to something else (e.g. Java, Go, etc.) than of companies migrating to it.
E.g. the high-profile story of Twitter's previous "fail whale" scaling problems supposedly being partially solved by switching from Ruby to Java/Scala/JVM: https://www.google.com/search?q=twitter+whale+fail+ruby+rail...
Ruby may be unfairly blamed but nevertheless, the narrative is already out there even though other big sites like Shopify, etc still use it.
Source: <-- OPs
Because you designed and implemented it poorly, that's why. Alternatively: How should I know, you wrote it.
If you're ever bored as a developer, switch to operations, you get to be the person developers turn to when they can't code, debug, do logging or security.
Calling it a "legacy rails" stack is incredibly disingenuous as well. It's not like they're running a 5-year-old unsupported version of Rails/MySQL. GitHub runs from the Rails main branch - the latest stable version they possibly can - and they update several times per month.[^1] They're one of the largest known Rails code bases and contributors to the framework. Outside of maybe 37signals and Shopify, they employ more experts in the framework and Ruby itself than any other company.
It's far more likely the issue is elsewhere in their stack. Despite running a Rails monolith, GitHub is still a complex distributed system with many moving parts.
I feel like it's usually configuration changes and infra/platform issues, not code changes, that cause most outages these days. We're all doing A/B testing and canary deployments, and using feature flags to test actual code changes...
[^1]: https://github.blog/2023-04-06-building-github-with-ruby-and...
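For anyone unfamiliar with the feature-flag / canary idea mentioned above, here is a minimal sketch of a percentage rollout. The names (`in_rollout`, the flag and user strings) are hypothetical and this is not how GitHub or any particular library does it; it only assumes that a stable hash of the flag and user decides who sees a change.

```python
import hashlib


def in_rollout(flag: str, user_id: str, percent: int) -> bool:
    """Return True if this user falls inside the first `percent` buckets.

    Hypothetical helper: hashes flag + user into a stable bucket 0..99,
    so the same user always gets the same answer for the same flag.
    """
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent


# Example: expose a (hypothetical) new code path to roughly 5% of users.
if in_rollout("new-merge-queue", "user-1234", 5):
    pass  # new code path
else:
    pass  # existing code path
```

Because the bucket is stable, raising the percentage only ever adds users, which is what makes a gradual rollout (and a quick rollback to 0) possible for code changes. Config and infra changes often don't get this treatment.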
The culprit is change. Infra changes, config changes, new features, system state (OS updates, building new images, rebooting, etc.), even fixing existing bugs: all are larger changes to the system than most people think. It's really remarkable at this point that GitHub is as stable as it is, and that is a testament to the GitHub team. It's not "rot", it's just a huge system.
It's neither Rails nor MySQL; both have been proven for years.
"What do you mean the database is down after I loaded 500 TiB and indexed all columns? It's MySQL, Facebook has used MySQL at high scale for years without incident!"
Hugs to the GitHub ops team.
We just hope SMTP keeps ticking along somehow or we're fcuked.
External dependencies are always a problem, but do you have the capacity and resources required to manage those dependencies internally? Most don't and will still get a better product/service by using an external service.
Local also means you can orchestrate maintenance windows to avoid outages at critical phases.
Define "most". There is a surprisingly high number of small/mid-sized companies which have dedicated people for this kind of thing.
Sounds like all in good order then ...
This way your git repo could be located on:
- GitHub
- Your Closet (...)
- UCLA's supercomputer
- JBOD in Max Planck Institute (...)
- GitLab
Doing this with a simple file with "[ipfs, github, gitlab]" in it would be revolutionary, especially for data version control, like NN weights or databases that are too large for git and cost too much on other services, as they would be free on ipfs/torrent.
Then no one is fazed by the inevitable failure of various companies.
If "ipfs" can be added as a remote, and it automatically pulls from thousands of different devices without having to specify them, that's a pretty big win for redundancy, right?
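A rough sketch of what that "[ipfs, github, gitlab]" file could drive: read the list, then try each mirror in order until one answers. Everything here is an assumption for illustration. `parse_mirror_list`, `first_available` and the `fetch` callback are hypothetical names, and git has no built-in "ipfs" remote, so the actual fetching is left to whatever function you pass in.

```python
from typing import Callable, List, Tuple, TypeVar

T = TypeVar("T")


def parse_mirror_list(text: str) -> List[str]:
    """Parse a line like "[ipfs, github, gitlab]" into ["ipfs", "github", "gitlab"]."""
    inner = text.strip().strip("[]")
    return [name.strip() for name in inner.split(",") if name.strip()]


def first_available(mirrors: List[str], fetch: Callable[[str], T]) -> Tuple[str, T]:
    """Try each mirror in order and return (mirror, result) for the first success.

    `fetch` is a caller-supplied function (hypothetical) that raises on failure.
    """
    errors = {}
    for name in mirrors:
        try:
            return name, fetch(name)
        except Exception as exc:  # any failure just means "try the next mirror"
            errors[name] = exc
    raise RuntimeError(f"all mirrors failed: {errors}")
```

With something like this, one host being down only changes which mirror serves the data; it doesn't stop the pull.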
Just expect GitHub to go down at least once every month, since it is that unreliable.
This certainly has aged well: [0]
People really like avoiding ops
_Maybe it’s time for rewriting it in Rust._
Edit: RIIR was said in jest. I forgot HN doesn’t support markdown.
edit: fair enough
Sorry everyone!