My suspicion is that the concern about racism in machine learning is rooted in two things. The first is just the general modern trend of accusing anything you don't like of being racist, because everybody hates racism and wants to fight it. And the second is the fear on the part of people who make a living fighting racism that machine learning might actually put them out of a job.
Because machine learning is basically a paperclip optimizer. You tell it to maximize a thing, and it maximizes that thing and minimizes everything else. Racism isn't paperclips, so if racism gets in the way of making more paperclips, the optimizer will smash it. And then they're out of business.
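A minimal sketch of that point, with made-up toy data: fit y = w1*x1 + w2*x2 by ordinary least squares, where y actually depends only on x1. The optimizer drives the irrelevant feature's weight to zero, because keeping it buys nothing toward the objective.

```python
# Toy data (hypothetical): y depends only on x1; x2 is unrelated noise.
x1 = [0, 1, 2, 3]
x2 = [0, 1, 0, 1]          # carries no information about y
y  = [2 * v for v in x1]   # y = 2 * x1 exactly

# Solve the two-feature normal equations by Cramer's rule (no libraries).
a = sum(v * v for v in x1)
b = sum(u * v for u, v in zip(x1, x2))
c = sum(v * v for v in x2)
p = sum(u * v for u, v in zip(x1, y))
q = sum(u * v for u, v in zip(x2, y))
det = a * c - b * b
w1 = (p * c - q * b) / det   # weight on the feature that predicts y
w2 = (a * q - b * p) / det   # weight on the feature that doesn't
# w1 comes out 2.0 and w2 comes out 0.0: the optimizer gives the
# irrelevant feature no weight at all.
```

The same logic is why, in the argument above, a pure outcome-optimizer has no reason to hang on to a feature that doesn't improve its predictions.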
Because when you look at the criticism of this stuff, it generally looks like this: ~12% of the population is black, but only ~5% of the selected applicants are black, so the algorithm is accused of racism.
But nothing is that simple, because all kinds of things like income and education level and so on correlate with race, so you have to take all of those things into account before you can tell what's going on. And taking into account all of the available data is how machine learning works.
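To make the "you have to take correlated things into account" point concrete, here's a toy calculation with entirely hypothetical numbers: suppose selection depends only on having a degree, but degree rates differ between two groups. The raw selection rates then differ too, even though the rule itself never looks at group membership.

```python
# Hypothetical base rates: fraction of each group holding a degree.
P_DEGREE = {"group_a": 0.8, "group_b": 0.3}
# Selection probability depends only on degree status, same for both groups.
P_SELECT = {True: 0.5, False: 0.1}

def selection_rate(group):
    """Overall (marginal) selection rate for a group."""
    p = P_DEGREE[group]
    return p * P_SELECT[True] + (1 - p) * P_SELECT[False]

# group_a: 0.8*0.5 + 0.2*0.1 = 0.42
# group_b: 0.3*0.5 + 0.7*0.1 = 0.22
# The marginal rates differ by almost 2x, yet conditional on degree
# status the two groups are treated identically.
```

So a raw comparison like the ~12% vs. ~5% one above can't by itself distinguish a biased rule from a neutral rule applied to groups with different base rates; you have to condition on the confounders first.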
Which isn't to say that you couldn't make an algorithm racist. Tell it to optimize for applicants with a particular skin color and it does. But then your problem isn't with the algorithm, it's with the jackasses who asked for that.
What to optimize for is a much more general and difficult question. (Hint: Not paperclips.)
I don't get how you go from this statement to then explaining exactly how racism gets embedded in algorithms: by using the biased data we have in the real world...
To fix that you have to cause more black high school students to go to college and study computer science, and then wait two generations until their proportion of the installed base of qualified computer scientists reaches parity. There is no magic wand that makes it happen overnight.
But concentrating on the places where it can't be solved instead of the places where it can will make it take even longer.
Likewise, if the system is trained to duplicate human decision-making (like who gets loans), interesting things can happen: if the decision-makers unconsciously favored whites over blacks, the algorithm could wind up weighing skin color or stereotypically Black or Latino names negatively, meaning that the final model is explicitly racist, just because there is a correlation in the training data. That doesn't mean we shouldn't use deep learning, it means that it's not responsible to just fit the training data and ship without testing for such problems.
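One simple way to "test for such problems" before shipping is a counterfactual check: score the same applicant twice, flipping only the feature suspected of acting as a race proxy, and flag the model if the score moves. The model below is a made-up linear scorer with hypothetical weights, purely to show the shape of the test.

```python
# Hypothetical learned weights; "name_flag" stands in for a feature
# derived from stereotypically Black or Latino names. A nonzero weight
# on it is exactly the failure mode described above.
WEIGHTS = {"income": 0.6, "credit_history": 0.5, "name_flag": -0.3}

def score(applicant):
    """Linear score over the model's features."""
    return sum(WEIGHTS[k] * applicant[k] for k in WEIGHTS)

def counterfactual_gap(applicant, feature="name_flag"):
    """How much the score moves when only the suspect feature is flipped."""
    flipped = dict(applicant, **{feature: 1 - applicant[feature]})
    return abs(score(applicant) - score(flipped))

applicant = {"income": 1.0, "credit_history": 1.0, "name_flag": 1}
gap = counterfactual_gap(applicant)
# Any nonzero gap means the feature directly moves the decision,
# which is grounds to retrain without it or audit the training data.
```

The point isn't this particular scorer; it's that "fit the training data and ship" skips a cheap check that would have caught the problem.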
This isn't racism at all. It's just bad PR, because humans read labeling black people as monkeys the way they'd read a person doing it: as calling them stupid.
An algorithm doing that is just recognizing that humans and gorillas are both primates:
http://www.aquilaarts.com/bushmonkey.html
And then it's a bug, in the same way that recognizing a black balloon as a balloon but a white balloon as a light bulb is a bug. It has nothing to do with race at all. The algorithm isn't racist against white balloons. The solution is a general increase in the amount of training data, which is what you want in all cases regardless.
> if the decision-makers unconsciously favored whites over blacks, the algorithm could wind up weighing skin color or stereotypically Black or Latino names negatively, meaning that the final model is explicitly racist, just because there is a correlation in the training data.
Except that this is exactly the thing that a paperclip optimizer will smash to bits because it interferes with the goal of making more paperclips.