Consider an "AI" that rates the probability of recidivism for prisoners nearing their parole date. That score is then presented to the parole board and taken into consideration when deciding whether or not to grant parole. If this AI were incidentally, but accurately, determining the race of the prisoner, then the output score would take race into account as well. Black men have a significantly higher recidivism rate than other groups[1]. Setting aside the reasons for that - it's a complex topic, and outside the scope of this analogy - this is extremely undesirable behavior for a process that is intended to remove human biases.
You might then ask: how does this relate to medical imaging? Medical decisions are regularly made based on the expected lifespan of the individual. It makes little sense to aggressively treat leukemia in a patient who is currently undergoing unrelated multiple-organ failure. Similarly, it would likely make sense for a healthy 30-year-old to undergo a joint replacement and the associated physical therapy, because that person can reasonably be expected to live another 40 years, while the same treatment wouldn't make sense for a 70-year-old with long-term chronic issues. This concept is commonly represented as "QALY" - "quality-adjusted life years".
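As a rough illustration of how that trade-off gets quantified, here's a toy QALY calculation for the joint-replacement example. The utility weights and life-expectancy figures are invented for the sake of the sketch, not clinical data:

```python
# Toy QALY comparison for the joint-replacement example.
# All numbers below are invented for illustration, not clinical data.

def qaly_gain(years_remaining, quality_with_treatment, quality_without_treatment):
    """QALYs gained = remaining life years * improvement in quality weight (0..1)."""
    return years_remaining * (quality_with_treatment - quality_without_treatment)

# Healthy 30-year-old: ~40 expected years; assume an untreated joint drops
# their quality weight from 0.9 to 0.7.
young = qaly_gain(40, 0.9, 0.7)   # ~8 QALYs gained

# 70-year-old with chronic issues: ~8 expected years, same per-year improvement.
old = qaly_gain(8, 0.9, 0.7)      # ~1.6 QALYs gained

print(young, old)
```

The point is that expected remaining lifespan multiplies directly into the result, so anything that shifts the lifespan estimate - including race - shifts the recommendation.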
Life expectancy can vary significantly based on race[2].
An AI that evaluates medical imagery and factors QALY into its care recommendation may produce a positive indicator for a white Hispanic woman and a negative indicator for a Black non-Hispanic man, all else being equal, with race as the only differentiator.
In short - it's not necessarily a bad thing for a model to be able to predict the race of the input imagery. The problem is that we don't know why it can do so. Unless we know that, we can't trust that the output is actually measuring what we intend it to be measuring.
1: https://prisoninsight.com/recidivism-the-ultimate-guide/
2: https://www.cdc.gov/nchs/products/databriefs/db244.htm
If, in your hypothetical recidivism case, an AI "accurately" determined that a pattern of recidivism-related features was correlated with race, and "accurately" determined that that specific subset of features predicted race, why would it be wrong to make parole decisions using those recidivism-related features?
edit: imagine I were a teacher who systematically scored people with certain physical characteristics 10% lower than people who lacked them. Let's say, for example, that I was a stand-up comedy teacher who wasn't amused by women.
If I used an AI trained on that data to choose future admissions (assuming plentiful applicants), I would end up with an all-male class. If this happened throughout the industry (especially since my all-male enrollment would supply the teachers of the future), stand-up comedy would simply become something women were seen as lacking the aptitude for. Nobody would ever have explicitly meant to sabotage women, just to direct them into something they'd have a better chance of succeeding in.
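The mechanism is easy to demonstrate. Here's a minimal simulation of the biased-teacher scenario, with all numbers invented for illustration: true aptitude is drawn independently of gender, but the historical scores dock women a fixed penalty, and a selector "trained" on those scores reproduces the bias at admission time:

```python
import random

# Toy model of the biased-teacher example: true aptitude is independent
# of gender, but the recorded (historical) scores penalize women.
# All parameters here are invented for illustration.
random.seed(0)

applicants = [{"gender": random.choice(["M", "F"]),
               "aptitude": random.uniform(0, 100)}
              for _ in range(1000)]

def recorded_score(a):
    # The training data bakes in the teacher's bias.
    penalty = 10 if a["gender"] == "F" else 0
    return a["aptitude"] - penalty

# "Train" by ranking on the biased labels; admit only the top 2%.
admitted = sorted(applicants, key=recorded_score, reverse=True)[:20]
share_female = sum(a["gender"] == "F" for a in admitted) / len(admitted)
print(f"female share among admitted: {share_female:.0%}")
```

With a selective enough cutoff, the admitted class comes out effectively all-male even though aptitude was generated identically for both groups - no step in the pipeline ever "decides" to exclude women.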
> efforts to control [model race-prediction] when it is undesirable will be challenging and demand further study
I mean, sure, there are tons of ways for garbage data to sneak into ML models -- though these researchers tried pretty hard to control for that -- but if the model actually determined that "race" is a meaningful feature, that might be because it is, and science should be concerned with what is, not with what we wish were true.