I think that what is going on there is also a bit political. They grew their Engineering department so fast that they now need to justify the headcount. So each team is trying to invent new projects all the time. Anecdotally, this was partially confirmed to me by a friend working there.
I said this before, but I still cannot understand why a service like Uber needs so many engineers on the backend (multiple thousands). It is a complex distributed application, but nowhere near the scale or complexity of a Facebook or Google.
Thank you so much, I thought I was going crazy. I understand the demands of running a service on the level Uber has, but well, for instance I can't imagine what kind of computational workload / infrastructure requirements would make developing your own resource scheduler a reasonable option - for a Taxi app? With non-essential (to the core product) machine learning?
Forgive me if I'm ignorant, but what exactly does the Uber engineering team do?
edit: On their blog I was able to find that they "forecast rider demand", in a relatively short article [0] - short, that is, compared to the article [1] about what is essentially "just" data visualization, which doesn't help my confusion much.
0 - https://eng.uber.com/neural-networks/
1 - https://eng.uber.com/maze/
Sure, they just reinvented Dapper from Google... but unlike Dapper I can download and use Jaeger. That counts for a lot. Do I use their ride sharing service? Nope. But I do like their open source projects.
* Lots of realtime
* Resource scheduling problems
* Route optimization problems (especially with shared uber or shared lyft rides)
Something similar happened at LinkedIn too, I guess: multiple teams building very similar tools for closely related problems.
_________________
I'm assuming that, like all mapping solutions, it'll get better, but for now it's just full of bad routes, over-optimized turns, out-of-date detours (for MONTHS!) and nonsensical U-turns.
Maybe the rewrites are risk management?
Many internal projects that eventually become open source are not NIH projects, because when the project was proposed there may have been no public open source alternative, or at least none that was mature enough. Even if something exists but is still in its early stages, it presents a lot of risk because your company isn't in the driver's seat building and maintaining it.
Claiming something is NIH based on when it first became polished enough to be open-sourced ignores the state of the world when the project was first started.
I'm a little sad that this is the top comment here. I mean, maybe you're right. But so what? Some people find this useful, and some won't. Same as anything else.
At the end of the day, every line of code added to the world's pool of OSS code is a Good Thing™ as far as I'm concerned. Even if it's something I personally don't have a use for.
I think we should encourage companies to release code as open source, and give Uber at least some small measure of "props" for the stuff they release. Maybe none of their stuff is a game changer like Linux, but it doesn't need to be.
- Route Optimization
- Demand Forecasting
- Rider Hotspot Prediction
This post doesn't exactly tell us the true nature of their workloads (other than the crude categorization - batch, stateful, stateless), nor does it talk about the inflection points where off-the-shelf solutions don't cut it anymore and such customization is required. I mean some before & after numbers / graphs on resource utilization would have really helped.
Is this thing/Apache Mesos abstract enough to allow for such a use?
> Added notice that the project is dead
https://github.com/cmu-db/peloton/commit/484d76df9344cb5c153...
Does Uber get held to the same standard or do we just assume all names are overloaded now?
I have no idea if or why they would've used that or if they're just referring to the cycling thing, but I guess "fearless" could also be kind of fitting for this project.
In the OP blog post though, they assert "to our knowledge, there is no other open source scheduler which combines all types of workloads for web-scale companies like Uber."
And then, when you dig...it's just Mesos. They built a framework for Mesos. So, that's cool. But man, the puff piecery borders on dishonesty. I mean--Singularity has existed, and is implemented at very large scales, for a while. I'm sure Peloton is a fine scheduler, but there's a lot of huffing-one's-own-farts in the documentation here.
edit: clearly I'm thinking of a different Singularity than you.
EDIT: People get so up in arms about Google and Microsoft working with China and the military, but Uber has done some horrendous stuff on their own. Just curious where people think the line should be.