Is this a trend I’ve missed … some sort of post-D3 nadir of a datavis hype curve or something, where graphs are a cringey thing that SEO’d clickbait articles or news pages do?
But now that you mention it, it does seem a little odd.
The world has enough quantity; let’s focus on quality.
Federated privacy preserving learning, local models, etc. all help keep your private data on your devices. Good stuff.
Suppose I had one or two cameras attached to a computer and ran software that would detect which object I'm pointing at and name it. How much power would that use?
The human brain would probably need around 0.5s - 1s to come up with an answer, consuming around 5 milliwatt hours of energy in that time.
How much power would the computer need to at least give it a fair shot compared to the human?
If we assume that a human is pretty close to the best theoretically achievable limit of overall usefulness vs energy usage (while, unlike current AI, having the ability to learn ad-hoc, self-correct and maintain itself), "work per watt" may give us an idea of how advanced our current technology really is compared to what already existed, and how far we can still go.
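The ~5 mWh figure above is easy to sanity-check. A minimal sketch, assuming (this number is not from the thread) the brain draws roughly 20 W on average:

```python
# Back-of-envelope check of the ~5 mWh-per-answer figure.
# Assumption: the brain draws roughly 20 W (a commonly cited rough average).
BRAIN_POWER_W = 20.0

def energy_mwh(power_w: float, seconds: float) -> float:
    """Energy in milliwatt-hours for a given power draw and duration."""
    return power_w * seconds / 3600.0 * 1000.0

low = energy_mwh(BRAIN_POWER_W, 0.5)    # ~2.8 mWh for a 0.5 s answer
high = energy_mwh(BRAIN_POWER_W, 1.0)   # ~5.6 mWh for a 1 s answer
print(f"{low:.1f} - {high:.1f} mWh per answer")
```

So "around 5 mWh" corresponds to roughly a second of thinking at 20 W.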
A camera with an Nvidia Jetson might consume 0.5 kWh per day and run nonstop.
Ultimately it is apples to oranges. A human brain can do a lot more than simply classify an object. Security guards watching cameras are evaluating the situation, not annotating images.
A human body can extract up to ~95% of the energy in that food (depends on the food), which is pretty damn efficient. You may have seen the number 20% thrown around, but that refers to how much of that can be turned into useful mechanical energy.
> and requires ~2-3kwh (depending on weight, sex etc.) of energy per day and cannot operate 24/7/365.
More like a fifth of that energy (which is what the brain uses). If you're going to look at the entire body, you're going to have to match those features in your hardware. I don't think there's current-gen hardware that could conceivably repair itself and take care of its own needs while being as energy efficient as a human body.
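Spelling out "a fifth of that energy" with the thread's own numbers (body ~2-3 kWh/day, brain ~20% of that; 2.5 kWh/day taken as a midpoint):

```python
# Rough numbers from the thread: the body uses ~2-3 kWh/day,
# and the brain accounts for roughly a fifth of that.
body_kwh_per_day = 2.5   # midpoint of the 2-3 kWh/day range
brain_share = 0.20       # "a fifth of that energy"

brain_kwh_per_day = body_kwh_per_day * brain_share        # ~0.5 kWh/day
brain_avg_watts = brain_kwh_per_day * 1000.0 / 24.0       # ~21 W average
print(f"brain: ~{brain_kwh_per_day:.2f} kWh/day, ~{brain_avg_watts:.0f} W average")
```

Which, incidentally, lands right around the 0.5 kWh/day quoted for the Jetson setup.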
> A camera with a nvidia jetson might consume 0.5 kwh per day and run nonstop.
The nano?
If I had trained a model on the full Open Images Dataset (so we can get a number of categories that at least approaches what a human could do) are you sure that's going to cut it?
YOLOv3 doesn't even reach 2 fps on the nano (YOLOv3-tiny gets more, but using a crippled version won't win us any prizes), and that one only has 80 categories. The Open Images Dataset has five times that - which is still absolutely nothing compared to what a human can do (and the dataset is also a bit odd: the only specific street sign it knows is "stop sign" and there's weird one-offs like "facial tissue holder" but it can't tell a ferris wheel from a car wheel or steering wheel).
Even if you somehow managed to fit something with such a number of categories and acceptable accuracy on a nano, it would probably blow its energy budget, which is about 2 seconds of operation if it wants to match a human.
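The "about 2 seconds" energy budget can be derived from the ~5 mWh-per-answer figure earlier in the thread. A sketch, assuming (this number is mine, not the thread's) a Jetson Nano draws roughly 10 W under load:

```python
# How long a Jetson Nano can run on a human's per-answer energy budget.
# Assumption: the Nano draws ~10 W under sustained inference load.
human_budget_mwh = 5.0   # energy a human spends on one answer (from the thread)
nano_power_w = 10.0      # assumed Jetson Nano draw under load

budget_s = human_budget_mwh / 1000.0 * 3600.0 / nano_power_w
print(f"{budget_s:.1f} s")   # ~1.8 s, i.e. "about 2 seconds"
```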
> Ultimatly it is Apples to Oranges. A human brain can do a lot more than simply classify an object.
Sure, but it's also not going to perform a lot of tasks at the same time. If you ask a human to keep classifying anything you're pointing at, they'll be mostly busy watching you and what you're pointing at, trying to conjure up the appropriate word to name the thing. If not, you're not pointing fast enough.
Though I suppose we also have some sort of passive classification mode that we're using most of the time while we do other things. This mode just deals with concepts - it doesn't bother to inform us that the thing flying at us is called "ball".
You'd want to be running a huge state-of-the-art network trained on large datasets to approach human capabilities, and I don't think 2.5 TFLOPS will cut it.
I had a look around and this thing is probably more in the right ballpark: https://www.nvidia.com/en-us/autonomous-machines/embedded-sy...
It uses up to 60 W for 270 TFLOPS at full power, which should give it enough processing power to at least do decently with something trained on the best datasets there are.
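For a rough efficiency comparison, here's the 60 W module against the brain's ~20 W average (the 20 W figure is a common rough estimate, not from the linked page), both running nonstop:

```python
# Daily energy of the 60 W module vs a ~20 W brain, both running 24/7.
module_w = 60.0   # full-power draw quoted above
brain_w = 20.0    # assumed rough brain average

module_kwh_day = module_w * 24 / 1000.0   # 1.44 kWh/day
brain_kwh_day = brain_w * 24 / 1000.0     # 0.48 kWh/day
ratio = module_kwh_day / brain_kwh_day
print(f"module uses {ratio:.0f}x the brain's energy")
```

So even this hardware sits at roughly 3x the brain's energy draw, before accounting for the capability gap.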
There's a chance much smaller hardware would do if only our software were advanced enough, but it's probably not. I'm not sure where we really are, hence my original question. You'd need to somehow work out Watts/HumanPerformance.