For example, they suggest ROS as robust, industry-ready software, which absolutely hasn't been my experience: you hire a bunch of domain experts to solve the various [hardware, controls, perception, imaging, systems] problems, but once you use ROS as your middleware you end up needing a bunch of ROS experts instead. This is due to the horrible build system, odd choice of defaults, instability under constrained resources, and the way it inserts itself into everything. You end up needing more fine-grained control than ROS gives you to make an actually robust system, but by the time you discover this you'll be so invested in ROS that switching away will involve a full rewrite.
The same goes further downstream: OpenCV images are basically a void* with a bunch of helper functions. (4.x tried to help with this, but it got sideswiped by the DNN push before anything concrete could happen.)
I guess it's the same rant the FreeBSD people have about the Linux ecosystem and its reliability. However, I'd hope we raise our standards when it comes to mobile robots that have the potential to seriously hurt people by accident. And who knows, maybe one day OpenCV and ROS will pleasantly surprise me the way Linux has with its progress.
Also BY FAR not my experience.
Personally I see ROS as a cool thing for prototyping and research, but I certainly don't see it as a serious solution for industry.
> This is due to the horrible build system, odd choice of defaults, instability under constrained resources, and how it inserts itself into everything.
This is an excellent short description of the main pain points. I would add that the need to work with five different languages (YAML, XML, MSG, CMake, and C++ and/or Python) from scratch makes it difficult for subject-matter experts who are not software people to become productive in a short time.
So true. The worst part is that many companies will end up writing a ROS clone that does what they need, instead of getting rid of this awful programming paradigm altogether.
With regard to startups in general, though: having worked at a few, I've noticed that at the earliest stages the goal is for a few individuals to build quickly. Often this means framework choices that may not be suitable for scaling. As one scales, one then has to evolve the architecture to maintain developer velocity, which may mean rewriting everything. I'm not surprised that people are rewriting ROS internally. At the end of the day there are a few good ideas in there, but at some point one has to acknowledge that the implementations were lacking.
Personally, if I were to write a middleware framework in 2024, I'd go with Rust, MCAP, Zenoh, and Rerun, and possibly use an ECS instead of topics.
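To make the "ECS instead of topics" idea concrete, here's a minimal, dependency-free Rust sketch. The `World` layout and the component names (`Pose`, `LidarScan`) are purely illustrative and don't come from any of those crates; the point is just that producers write components into a shared store and consumers query whatever combination they need, instead of everything going through named pub/sub topics.

```rust
// Minimal "ECS instead of topics" sketch (no external crates).
// Component names and the World layout are illustrative only.

#[derive(Clone, Copy, Debug)]
struct Pose { x: f64, y: f64, theta: f64 }

#[derive(Clone, Debug)]
struct LidarScan { ranges: Vec<f32> }

// Instead of publishing messages on named topics, each driver writes its
// latest data into the component store for its entity; consumers query the
// combination of components they need, at their own rate.
#[derive(Default)]
struct World {
    poses: Vec<Option<Pose>>,       // index = entity id
    scans: Vec<Option<LidarScan>>,
}

impl World {
    fn spawn(&mut self) -> usize {
        self.poses.push(None);
        self.scans.push(None);
        self.poses.len() - 1
    }
}

// A "system" is just a function over the world; scheduling stays explicit.
fn obstacle_check(world: &World) {
    for (id, (pose, scan)) in world.poses.iter().zip(&world.scans).enumerate() {
        if let (Some(pose), Some(scan)) = (pose, scan) {
            let min = scan.ranges.iter().cloned().fold(f32::INFINITY, f32::min);
            println!("entity {id}: pose {:?}, closest obstacle {min:.2} m", pose);
        }
    }
}

fn main() {
    let mut world = World::default();
    let robot = world.spawn();

    // Drivers update components in place instead of publishing.
    world.poses[robot] = Some(Pose { x: 1.0, y: 2.0, theta: 0.3 });
    world.scans[robot] = Some(LidarScan { ranges: vec![4.2, 3.9, 5.1] });

    obstacle_check(&world);
}
```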
- We have moved to standard sockets for IPC, or protocol buffers.
- For logging, just trivial printing.
- For configuration, libconfig.
- For the launch system, simple shell scripts.
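As a rough illustration of the "plain sockets instead of middleware" approach, here is a minimal Rust sketch using a Unix-domain socket with length-prefixed payloads (the payload bytes could just as well be protobuf-encoded). The socket path and framing are made up for the example; it's Unix-only and handles a single message.

```rust
// Minimal sketch: one process listens on a Unix-domain socket, another
// connects and sends a length-prefixed payload. Path and framing are
// illustrative, not from any particular project.

use std::io::{Read, Write};
use std::os::unix::net::{UnixListener, UnixStream};
use std::thread;

const SOCKET_PATH: &str = "/tmp/robot_ipc.sock"; // hypothetical path

fn serve() -> std::io::Result<()> {
    let _ = std::fs::remove_file(SOCKET_PATH); // remove a stale socket, if any
    let listener = UnixListener::bind(SOCKET_PATH)?;
    for stream in listener.incoming() {
        let mut stream = stream?;
        let mut len_buf = [0u8; 4];
        stream.read_exact(&mut len_buf)?;
        let len = u32::from_le_bytes(len_buf) as usize;
        let mut payload = vec![0u8; len];
        stream.read_exact(&mut payload)?;
        println!("received {} bytes: {:?}", len, String::from_utf8_lossy(&payload));
        return Ok(()); // handle one message and exit, just for the demo
    }
    Ok(())
}

fn send(msg: &[u8]) -> std::io::Result<()> {
    let mut stream = UnixStream::connect(SOCKET_PATH)?;
    stream.write_all(&(msg.len() as u32).to_le_bytes())?;
    stream.write_all(msg)
}

fn main() -> std::io::Result<()> {
    let server = thread::spawn(serve);
    thread::sleep(std::time::Duration::from_millis(100)); // let the listener bind
    send(b"wheel_odom: x=1.0 y=2.0")?;
    server.join().unwrap()
}
```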
Another phrase I see spreading rapidly is "double-click that" replacing "drill into that". Seems like every podcast instantly adopted "double-click" lingo in the last 2 years.
Because if you're not using the newest lingo, you might not be using the newest approaches, and might be falling behind! (gasp!)
In my corporate experience, the main thing about lessons was whether they were 'identified' or actually 'learned', e.g. following a corporate post-mortem of some cockup/fiasco/disaster.
...until a few years ago when:
> Computer vision has been consumed by AI.
...but "AI" is an unsatisfying reduction. What does it even mean? (and c'mon, plenty of non-NN CV techniques going back decades can be called "AI" today with a straight-face (for example, an adaptive pixel+contour histogram model for classifying very specific things).
My point is that computer vision, as a field, *is* (an) artificial intelligence: it has not been "consumed by AI". I don't want ephemeral fad terminology (y'know... buzzwords) getting in the way of what could have been a much better article.
Today the word "AI" has itself been hijacked by marketers of ANN-based techniques, so when the article uses that term, it confuses people who don't know any better.
A lot of our visual perception happens in the retina and in the 'processing pipeline' before the signal ever reaches the brain.
Margaret Livingstone provides an excellent overview in her book "Vision and Art" and she takes a view similar to yours.
The benefits of all this spy tech are great - if we manage it right. I mean, telephone tapping is an example.
Example: "I have made this technology that makes me (and only me) earn 1 cent per day when managed right. Bad actors can use it to hack into hospitals and ransom them".
Wouldn't you agree that this hypothetical technology is probably not worth it?
In my opinion, it's not acceptable to say "I created a technology that helps bad guys be even worse, but I don't have any responsibility at all because I don't personally ask them to be bad guys".
Of course, sometimes it's harder than it may seem in hindsight: it was not always clear while developing the technology that it would have bad applications.