People obviously still see value in discussing it
Google Photos in 2015: https://www.wired.com/story/when-it-comes-to-gorillas-google...
Flickr in 2015: https://www.independent.co.uk/life-style/gadgets-and-tech/ne...
Facebook, like a lot of tech companies, has long had problems with diversity in engineering. Here's an article from April that discusses specific incidents and the broader background: https://www.washingtonpost.com/technology/2021/04/06/faceboo...
As a person of color myself, I've struggled with people telling me that these FAANG companies have "diversity problems." A majority of software engineers are female and male immigrants from East Asia and South Asia, and those population centers are some of the most diverse regions of the world. The engineers who were hired by preparing for and passing these companies' selective, merit-based coding tests had to overcome adverse conditions in their home countries as well, including extreme poverty, starvation, and totalitarian regimes.
Why do they not count toward diversity, to some white and white-adjacent critics? What message are we sending to people who are ethnic minorities from certain groups, who earned their spots through merit and have also been targeted in recent newsworthy attacks, just as others have, when we make these kinds of accusations? What does a non-problematic ethnic composition look like? What are these companies doing right toward some minority groups and wrong toward others?
If that is the case, why is it that Google voice nav routinely butchers the names of places and roads in India in spite of having thousands of Indian engineers on staff?
Could we blame the intractability of the problem, or just plain old incompetence, before we blame every single problem in the world on racism and lack of 'diversity'?
Silly Google TTS, the proper pronunciation is obviously "Malcolm the Tenth" there.
Once you search for these:
https://www.google.com/search?q=human+female+face&tbm=isch
https://www.google.com/search?q=human+male+face&tbm=isch
You can see that 'human face' has a bit of post-hoc tuning.
There's no super reliable way to prevent this (with current tech) other than forbidding that output entirely.
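For what it's worth, "forbidding that output entirely" usually just means a post-hoc filter sitting between the model and the user. A minimal sketch, assuming a classifier that returns (label, confidence) pairs (the label set here is hypothetical, not Google's actual list):

    # Hypothetical post-hoc guard: strip sensitive labels outright
    # rather than trusting the model's confidence on them.
    BLOCKED_LABELS = {"gorilla", "chimpanzee", "monkey", "primate"}

    def filter_labels(predictions):
        """predictions: iterable of (label, confidence) pairs."""
        return [(label, conf) for label, conf in predictions
                if label.lower() not in BLOCKED_LABELS]

The model still makes the mistake internally; the filter just makes sure nobody ever sees it.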
https://i.ibb.co/Mf6rVdf/Screenshot-20210907-002516-Photos.j...
Nobody who has traveled at all would mistake my wife and child for Japanese. And doing so is especially insidious considering the Bataan Death March.
They probably are, but not good enough. These things can be surprisingly hard to detect. Post hoc, it is easy to see the bias; it isn't so easy before you deploy the models.
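One sketch of the kind of pre-deployment check that can surface this, assuming you have demographic annotations on a held-out set (which is itself often the hard part):

    import numpy as np

    def subgroup_error_rates(y_true, y_pred, groups):
        """Error rate per annotated subgroup on a held-out set."""
        return {g: float(np.mean(y_true[groups == g] != y_pred[groups == g]))
                for g in np.unique(groups)}

    # A large gap between subgroups is a red flag before you ship,
    # even if the aggregate accuracy looks fine.

Aggregate accuracy hides exactly the failures that matter here.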
If we take the racial connotations out of it, then we could say the algorithm is doing quite well, because it got the larger hierarchical class, primate, correct. The algorithm doesn't know the racial connotations; it just knows the data and the metric you were optimizing. BUT considering the racial and historical context this is NOT an acceptable answer (not even close).
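To make that "larger hierarchical class" point concrete, here's a toy sketch with a made-up label tree (the hierarchy and labels are illustrative, not any real system's taxonomy):

    # Toy label hierarchy: a hierarchy-aware metric scores "primate"
    # for a human as coarse-but-consistent; a flat metric scores it wrong.
    PARENT = {"human": "primate", "gorilla": "primate", "primate": "animal"}

    def hierarchical_match(pred, truth):
        """True if pred equals truth or any ancestor of truth."""
        node = truth
        while node is not None:
            if pred == node:
                return True
            node = PARENT.get(node)
        return False

    print(hierarchical_match("primate", "human"))  # True (a flat metric says wrong)

Which is exactly how a purely technical metric can look fine while the output is socially unacceptable.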
I've made a few comments in the past about bias and how many machine learning people are deploying models without understanding them. This is what happens when you don't try to understand statistics, and particularly long-tail distributions. gumboshoes mentioned that Google just removed the primate-type labels. That's a solution, but honestly not a great one (technically speaking). But this solution is far easier than technically fixing the problem (I'd wager that putting a strong loss penalty on misclassifying a Black person as an ape is not enough). If you follow the links from jcims then you might notice that a lot of those faces are white. Would it be all that surprising if Google trained on the FFHQ (Flickr) dataset?[0] A dataset known to have a strong bias towards white faces. We actually saw this when PULSE[1] turned Obama white (do note that if you didn't know the left picture was of a Black person, and who it was, the output is a decent (key word) representation). So it is pretty likely that _some_ problems could simply be fixed by better datasets (this was part of the LeCun controversy last year).
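As a sketch of what that "strong loss penalty" would even look like (the class indices and penalty value are made up; per the above, I'm not claiming this fixes anything):

    import torch

    NUM_CLASSES = 1000
    HUMAN, GORILLA = 0, 1  # hypothetical class indices

    # cost[i][j] = price of predicting class j when the truth is class i.
    cost = torch.ones(NUM_CLASSES, NUM_CLASSES)
    cost.fill_diagonal_(0.0)
    cost[HUMAN, GORILLA] = 100.0  # make this particular confusion very expensive

    def cost_sensitive_loss(logits, targets):
        """Expected misclassification cost under the predicted distribution."""
        probs = torch.softmax(logits, dim=-1)
        return (probs * cost[targets]).sum(dim=-1).mean()

Even with this, the model only avoids the confusions you anticipated and enumerated, which is the whole problem with long tails.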
Though datasets aren't the only problem here. ML can algorithmically highlight bias in datasets. Often research papers are metric hacking, going for the highest accuracy that they can get[2]. This leaderboardism undermines some of the usage, and often there's a disconnect between researchers and those in production. With large and complex datasets we might be chasing leaderboard scores until we reach a sufficient accuracy on that dataset before we start focusing on bias in that dataset (or, more often, we sadly just move to a more complex dataset and start the whole process over again). There are not many people working on the bias aspects of ML systems (both data bias and algorithmic bias), but as more people put these tools into production we're running into walls. Many of these people are not thinking about how these models were trained or the bias that they contain. They go to the leaderboard, pick the best pre-trained model, and hit go, maybe tuning on their own dataset. Tuning doesn't eliminate the bias from the pre-training (it can actually amplify it!). ~~Money~~Scale is NOT all you need, as GAMF often tries to sell (and some try to sell augmentation as all you need).
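That "pick the best pre-trained model and hit go" workflow, sketched with torchvision for concreteness (the point being that a frozen backbone keeps whatever the pre-training data baked in):

    import torch.nn as nn
    import torchvision.models as models

    # The typical production shortcut:
    model = models.resnet50(pretrained=True)   # features learned on someone else's data
    for p in model.parameters():
        p.requires_grad = False                # backbone frozen: its biases come along for free
    model.fc = nn.Linear(model.fc.in_features, 10)  # only this new head gets tuned

Nothing in that loop ever examines what the backbone learned, which is how pre-training bias rides into production unexamined.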
These problems won't be solved without significant research into both data and algorithmic bias. They won't be solved until those in production also understand these principles and robust testing methods are created to find these biases, and until people understand that a good ImageNet (or even JFT-300M) score doesn't mean your model will generalize well to real-world data (though there is a correlation).
So with that in mind, I'll make a prediction: rather than seeing fewer of these mistakes, we're going to see more (I'd actually argue that a lot of this is already happening that you just don't see). The AI hype isn't dying down, and more people are entering the field who don't want to learn the math. "Throw a neural net at it" is not and never will be the answer. Anyone saying that is selling snake oil.
I don't want people to think I'm anti-ML. In fact I'm a ML researcher. But there's a hard reality we need to face in our field. We've made a lot of progress in the last decade that is very exciting, but we've got a long way to go as well. We can't just have everyone focusing on leaderboard scores and expect to solve our problems.
[0] https://github.com/NVlabs/ffhq-dataset
[1] https://twitter.com/Chicken3gg/status/1274314622447820801
[2] https://twitter.com/emilymbender/status/1434874728682901507
I wonder how testing for that looks and sounds in a corporate environment. It may well be an area similar to patents: you pretend that you never heard of it, never discussed it, and God forbid there's any mention of it in corporate email/chat/etc., or a click on such a link from inside the corporate network...
Have we considered that AI and ML as a general brain replacement is a failed idea? That we humans feel we are so smart we can recreate or exceed millions of years of evolution of the human brain?
I'd never call AI a waste, it's not. But getting it to do human things just may be.
Even a child can tell the difference between a human of any color and an ape. How many billions have been spent trying, and failing, to exceed the bar of the thoughts of a human child?
I took a photo of the water pump from a car windscreen wiper and google was able to correctly identify what it was. I took a photo of a generic PCB which showed the back of a driver board for an LCD and google was able to bring up the exact type of board it was with the names of the ICs on it.
In these examples, Google Photos' AI has far exceeded what the average human can achieve. We just have to keep in mind that these systems are not perfect and only make a best guess, which should be verified by a person later.
The problem here is not that the mistake was very costly or disruptive to the function of the feature, but that the mistake was highly offensive, which is something very hard to avoid.
The problem it's solving is that it can do things that somebody with zero experience cannot. If you had an auto parts pro, or an EE, they probably could have done the same for you.
So, in general, AI is helpful because it has a much larger breadth of knowledge. Granted.
But I want examples of it doing depth, too.
My wife uses Lens when we fish. It's way, way worse than a fisherman with any experience at all.
Yes. It is currently known to fail at this prospect. It is an open research question as to whether current methods can be merely "scaled up" using more compute to achieve "general brain replacement". I personally am skeptical about that considering basic problems such as concept drift (but I am by no means an expert).
You define what counts as valuable to be arbitrarily difficult or inconceivable with current methods (because it's an area of open research), and then say we should divert course merely because we don't know it's possible?
> I'd never call AI a waste, it's not. But getting it to do human things just may be.
It already can do things previously thought to be exclusively "human" (such as beating top players at Go). Recently it also helped make significant advances in protein folding, which are sure to yield benefits to medical science at least indirectly. I believe this statement is either incorrect, or you're expecting people to have some strange definition of "exclusively human", which is of course also open research and unanswered.
Humans and machines are so different today. Of course machines beat us at number calculations and such. But we have organs that computers don't and can't have, and our brains are much more in tune with using those than with power-of-2 bit twiddling.
As we ourselves don't understand how it works, how can we ever write a machine that does?
Taken to the extreme, AI code is essentially something like:
    #include <stdlib.h>

    int add(int m, int n) {
        return m + n + rand();  /* the "right" answer, plus noise */
    }
In addition, it's being tested with a very small set of input data (relative to the complete set).

Maybe to your typical SGD-trained model, working off a dataset filled with mostly light-skinned people, skin tone just looks like a real solid first-order way to distinguish humans from primates, and picking up the Black people / primate distinction seems much more marginal and second-order in terms of impact on the cost function.
If most of the people in the dataset were black, I predict you wouldn't see this.
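A toy version of that intuition, with purely synthetic "features" (nothing here resembles a real vision pipeline; it just shows how imbalance steers the decision boundary):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n = 10_000

    # Made-up features [skin_tone, face_shape]; labels: 1 = human, 0 = primate.
    # 95% of the human examples are light-skinned, so skin tone alone
    # separates most of the training data.
    humans = np.column_stack([rng.random(n) < 0.95, np.ones(n)])
    primates = np.column_stack([np.zeros(n), rng.random(n) < 0.7])
    X = np.vstack([humans, primates]).astype(float)
    y = np.concatenate([np.ones(n), np.zeros(n)])

    clf = LogisticRegression().fit(X, y)
    print(clf.coef_)  # skin tone carries most of the weight

    # A dark-skinned human (skin_tone=0, face_shape=1) now lands on the
    # wrong side of the boundary, despite the human-like second feature.
    print(clf.predict_proba([[0.0, 1.0]]))

Flip the base rates and, as the parent says, the failure mode flips with them.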
I don't know Facebook's TOS sufficiently to know whether they are using private groups as source material, but if you're utilizing bigoted content to train pattern recognition, you will replicate bigoted content.
The AI is not that smart and these examples show it.
Humans are primates. It's weird that it selected such a broad label, but it didn't select an incorrect label.
e: I assume something similar has been done before by training a model on brown/black bears then throwing polar bears at it. Anyone know the outcome?
When I was quite young, I referred to some firefighters as robots.
which says a lot about the state of our allegedly human-outperforming AI
And I'd like to see a gorilla in any pose that's really hard (for a human) to differentiate from a person.
The truth is: the recognition algorithm is not very sophisticated after all.
Primates and humans are similar labels. This was almost certainly not intentional. Video classifiers are going to make mistakes - sometimes crude or offensive ones. I don't get outrage over labeling errors like this. Facebook should fix the issue - but they shouldn't apologize. It only encourages grievance seekers.
In every aspect of your life
No, I think it's racist because racists have a long history of calling black people primates, and because an automated system doesn't get to escape scrutiny and critique just because someone didn't specifically put in a line of code that emulates the actions of racists.
I understand that FB operates at a much bigger scale, but that's all the more reason to have a much more diverse set of eyes test their models before they go live.
If you want to avoid this, hire more black people, seriously.
I guess first step might be to "hire more black QA people".
"Oh, maybe we should look into that"
AI models are deterministic in a purely technical sense, but practically speaking, they are non-deterministic black boxes. It’s not as if you can write a unit test which generates all possible videos of black people and makes sure it never outputs “gorilla”.
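The closest practical substitute is probably a regression gate over a curated set of known-sensitive examples, something like this sketch (the file paths and the model.classify interface are hypothetical):

    # Curated cases where certain labels must never appear, checked
    # before every deploy. Not a proof of correctness, just a tripwire.
    SENSITIVE_CASES = [
        ("fixtures/person_001.jpg", {"gorilla", "chimpanzee", "primate"}),
        ("fixtures/person_002.jpg", {"gorilla", "chimpanzee", "primate"}),
    ]

    def check_forbidden_labels(model):
        failures = []
        for path, forbidden in SENSITIVE_CASES:
            predicted = {label for label, _ in model.classify(path)}
            if predicted & forbidden:
                failures.append((path, sorted(predicted & forbidden)))
        return failures  # an empty list means the gate passes

It only catches the failures you thought to enumerate, which is rather the parent's point.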
On the other hand, imagine a world where these labels were applied by a massive team of humans instead of a deep learning algorithm. At Facebook's scale, would the photos end up with more or less racist labels on average over time? My guess is that the model does a better job, but this is just another example of why we should be wary about trusting ML systems with important work.
One worries that the corporate overlords are preparing the legal system for complete impunity for manufacturers of self-driving cars. "Sorry your child is dead; the car did it, so there's no one to sue or convict."
I would say it's both. It's embarrassing for Facebook because it looks racist even though it really isn't. The system might be emotionless but the people who interact with it aren't, and we don't expect them to be.
Instead, I want to talk about pareidolia. Humans are social creatures. We have evolved to identify others of our kind and read their expressions. This was important to us, as we evolved alongside gorilla analogues as well, and the few of us that couldn't discern one face from another didn't usually last long.
I think we're trying to place too much of a human expectation onto these machines. I think that human features and primate features are strikingly similar, and it's our specialized brains that let us so easily discern them. Yes, with enough data and training we could have more accurate models, but we can't cry foul every time an algorithm doesn't behave like a human does.
Reference: https://www.reddit.com/r/Pareidolia/
So, this is going to happen.
Humans with a lot of experience are. Would kids be? I once referred to firefighters as robots as a kid.
Please do not trivialize acts that have the potential to cut humans so deeply with handwavy substantiations. Facebook should have known better, and done better.
When you have an automated system that has irregular behavior for a given input, we call that a bug. Bugs exist in all software, not always unique, but always present. This software is no different from any other. It will have errors. Because the software is categorizing faces, its errors will result in miscategorized faces. The only relevant questions are how frequent these errors are and how disparate they are across racial lines.
Another reference: this one is a Tool-Assisted Speedrun of a game that relies on basic image recognition software. While not entirely related, it does show how error-prone these algorithms can be. It's also fun to watch. https://youtu.be/mSFHKAvTGNk
Nobody likes the stories. No reasonable person is celebrating them. You’re not in disagreement with anyone.
These stories are about how we also deeply care about labels and categorization. Aren't we just looking at the natural selection (making them not "last long") of these way-too-rough AIs that step on boundaries that are pretty important to a lot of people?
Oh well, it's the times we live in.
If people simply laughed at the results and fixed the problems they'd miss all the endorphin rush of outrage.
From what I can tell the only fix here is a hardcoded workaround outside the net, or a substantially more powerful architecture.
I think the conversation can be made a lot simpler.
AI isn't ready for anything important. Done. That's it. If one of the pioneers in the field can't distinguish Black people from primates, it isn't ready for driving or war or legal matters or really anything of importance.
I think we (colloquial) made something kinda cool and jumped the gun on when and where to use it.
Facebook disabled Thai-to-English translation back in April because it translated the queen as “slut” and it’s been disabled since.
Maybe we should learn to accept non-fatal errors from applications instead of forcing things to stop entirely.
I find it ridiculous that my Photos app suggests I change monkey to “lemur” while I have plenty of photos of monkeys and zero of lemurs.
If you shine enough light on it, apparently the brand does. If a human were to do this, the company would immediately fire the employee and cut all ties with them. But as the article points out, 'fixing' an AI mistake isn't really a fix at all:
> [Google] said it was "appalled and genuinely sorry", though its fix, Wired reported in 2018, was simply to censor photo searches and tags for the word "gorilla".