[1] https://www.nytimes.com/2021/02/16/opinion/23andme-ancestry-...
No, it implies there is a signal in the dataset that could be something other than clinical. That means that until they can pinpoint the cause, i.e. the thing the AI is actually detecting, everything else it predicts is suspect.
For example, if the AI thinks the subject is West African, it might be more inclined to diagnose something related to sickle cell.
Or a northwestern European woman in her mid-60s vs a Japanese woman might get wildly different bone density readings for the same level of "blob" (most medical imaging is divining the meaning of blobs and smears).
Similarly, if the training/test sets used here, for X-ray-based diagnostics using machine learning, cover only specific races, then performance might be worse for other races, given that there's a new discriminatory variable in play.
The obvious solution here is to reduce bias by ensuring race is part of the dataset used for training and testing. Which, due to PII laws in play, may actually be quite challenging! Fascinating tradeoff imo.
Suppose AI #1 got a higher score on the training data and AI #2 gave more accurate diagnoses. Obviously you want #2, but if there is race-based bias in the training data and the AI has access to race, then eventually you overfit toward #1.
Also did they release their code and anonymized data? If not, it's impossible to tell if this is a bug.
If I got this result in my work, I would check it 10k times over because it defies belief. Even allowing subtle skeletal differences in different ethnic groups, the differences in this case are not in the bone and at least sometimes not visible to the human eye. Unless there is an undiscovered difference in radio-opacity across ethnicities, the result doesn't make sense.
Apparently this is a known and persistent effect across a variety of other medical images, tests, and scans, not just for a "race" but for ethnic groups in general, as well as biological sex. So this might actually just be an "AI hit piece" that otherwise confirms an unpalatable but persistent and strong effect in the literature. The causes seem to be badly understudied, in part because of the obvious need for delicacy and respect around such topics.
This result is tremendously implausible to me, but I am finding quite a few articles documenting similar phenomena across things like retina scans and brain MRIs.
I’m more surprised that the distinguishing features haven’t been obvious to trained radiographers for decades. It would be cool to see a followup to this paper that identifies salient distinguishing features. Perhaps a GAN-like model could work—given the trained classifier network, train 1) a second network to generate images that when fed to the classifier, maximize the classification for a given ethnicity, and 2) a third network to discriminate real from fake X-Ray images (to avoid generating noise that happens to minimize the classifier’s loss function). I wonder if the generator would yield images with exaggerated features specific to a given ethnicity, or whether it would yield realistic but uninterpretable images.
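A cheaper first step than the full GAN setup is plain activation maximization: gradient-ascend an input until the classifier's score for one class saturates, then inspect what got exaggerated. Below is a minimal numpy sketch of that idea on a toy linear-softmax stand-in for the trained classifier; the weights are random placeholders, not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a trained classifier: linear-softmax over flattened
# 8x8 "images". W is a hypothetical placeholder; a real study would use
# the trained deep network's weights.
n_classes, n_pixels = 3, 64
W = rng.normal(size=(n_classes, n_pixels))

def class_prob(x, k):
    """Softmax probability the classifier assigns class k to image x."""
    logits = W @ x
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return p[k]

def maximize_class(k, steps=200, lr=0.5):
    """Gradient ascent on log p(class k | x): 'activation maximization'."""
    x = rng.normal(scale=0.01, size=n_pixels)
    for _ in range(steps):
        logits = W @ x
        p = np.exp(logits - logits.max())
        p /= p.sum()
        grad = W[k] - p @ W          # d log p_k / dx for linear-softmax
        x += lr * grad
        x = np.clip(x, -3, 3)        # keep the "image" in a plausible range
    return x

x_star = maximize_class(k=1)
print(round(class_prob(x_star, 1), 3))  # should approach 1.0
```

With a deep network the gradient comes from backprop instead of a closed form, and the resulting images tend to be uninterpretable noise unless constrained by a prior, which is exactly the role the discriminator plays in the GAN variant suggested above.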
Garbage research.
Is race a genetically distinct marker though? I guess if you limit the sample enough it is, but I've always thought of race as more of a continuous quality than a distinct one.
From the guidelines (https://news.ycombinator.com/newsguidelines.html):
"Please use the original title, unless it is misleading or linkbait; don't editorialize."
https://www.boston.com/news/health/2022/05/18/scientists-cre...
https://nationalpost.com/health/health-and-wellness/ai-can-t...
https://www.sciencealert.com/ai-can-predict-people-s-race-fr...
https://www.iflscience.com/technology/ai-can-identify-race-f...
https://www.bostonglobe.com/2022/05/13/business/mit-harvard-...
Just a small collection
https://arxiv.org/pdf/2011.06496.pdf
Compare the performance under high pass and low pass filters in this paper on CIFAR-10. Is it really the case that differentiating cats from airplanes is so much more fragile than predicting race from chest x-rays?
What voodoo have they unearthed?
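For reference, the high-pass/low-pass manipulation in such papers amounts to masking the image's Fourier spectrum at some cutoff radius. A minimal numpy sketch of both filters (the cutoff value here is an arbitrary choice for illustration):

```python
import numpy as np

def radial_mask(shape, cutoff, low_pass=True):
    """Boolean mask over FFT frequencies within (or beyond) a cutoff radius."""
    h, w = shape
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    r = np.sqrt(fy**2 + fx**2)
    return r <= cutoff if low_pass else r > cutoff

def filter_image(img, cutoff, low_pass=True):
    """Keep only low (or high) spatial frequencies of a grayscale image."""
    spectrum = np.fft.fft2(img)
    spectrum = spectrum * radial_mask(img.shape, cutoff, low_pass)
    return np.fft.ifft2(spectrum).real

rng = np.random.default_rng(0)
img = rng.random((32, 32))

lo = filter_image(img, cutoff=0.1, low_pass=True)
hi = filter_image(img, cutoff=0.1, low_pass=False)

# The two bands partition the spectrum, so they sum back to the original.
print(np.allclose(lo + hi, img))
```

Running the trained classifier on `lo` vs `hi` versions of the test set is how one checks whether the signal lives in coarse structure or fine texture.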
That seems tautologically true.
Curious for the take of a neuro-ophthalmologist. If they too are stumped, this may be a path to a deeper understanding of our visual system.
Simple transformations obviously discernible to us can blind computer vision (CAPTCHAs). There may be analogues for human vision which don't present in the natural world. Evidence of such artefacts would partially validate our current path for artificial intelligence, as it suggests the aforementioned failures of our primitive AIs have analogues in our own.
It's a whole field of research, and it's pretty trivial to generate them for most classes of ML models. It's actually quite difficult to create robust models that DON'T have this problem...
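As a concrete illustration of how trivial generation can be, here is the classic fast gradient sign method (FGSM) applied to a toy logistic-regression "model" in numpy; the weights and step size are illustrative stand-ins, not from any real system:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "model": logistic regression on 100-dim inputs.
w = rng.normal(size=100)
b = 0.0

def predict(x):
    """Probability of the positive class under the toy model."""
    return 1 / (1 + np.exp(-(w @ x + b)))

x = rng.normal(size=100)
p_clean = predict(x)
label = 1 if p_clean > 0.5 else 0

# FGSM: step every input dimension by eps in the sign direction that
# increases the loss for the current label.
eps = 0.5
grad_sign = np.sign(w) * (1 if label == 0 else -1)
x_adv = x + eps * grad_sign

p_adv = predict(x_adv)
print(label, round(p_clean, 3), round(p_adv, 3))
```

For deep networks the per-pixel gradient comes from backprop, and a much smaller eps (imperceptible to a human) usually suffices; making models robust to this is the hard open problem the comment alludes to.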
There are lots of known ways in which people of different races are different physiologically. Probably even more unknown ways.
There could also be differences in imaging technology used in different communities, as others have suggested. I'd be a bit surprised if something like that could create such a strong signal but it's on the table.
Why would a model rely on its ability to detect racial identity to make decisions?
What kind of errors are race-specific?
If the AI is also implicitly learning to detect race from the images, it's going to learn an association that people of race X usually have tumors and people of race Y usually do not.
The problem here is that the people training the model and the clinical radiologists interpreting data from the model may not realize that race was a confounding factor in training, so they'll be unaware that the model may make racial inferences in the real world data.
If people of race X really do have a higher incidence rate for a specific type of cancer than race Y, maybe this is OK. But if the issue is that there was bias in the training/validation data set that was unknown to the people building the model, and in the real world people of race X and race Y have exactly the same incidence rate for this type of cancer, then this is going to be a problem because it's likely to introduce race-specific errors.
See e.g. https://www.ucsf.edu/news/2021/09/421466/new-kidney-function...
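This failure mode is easy to reproduce in a toy simulation: let a spurious group variable correlate with the label in the training sample only, and the model inherits group-specific errors at deployment even though true incidence is identical. A sketch with hand-rolled logistic regression (all the numbers here are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

def make_data(confounded):
    y = rng.integers(0, 2, n)                      # true condition, 50/50
    s = y + rng.normal(scale=2.0, size=n)          # weak genuine signal
    if confounded:
        # Sampling bias in training: group tracks the label 90% of the
        # time, even though group membership causes nothing.
        g = np.where(rng.random(n) < 0.9, y, 1 - y)
    else:
        g = rng.integers(0, 2, n)                  # deployment: independent
    X = np.column_stack([s, g, np.ones(n)])
    return X, y

def train_logreg(X, y, lr=0.1, steps=2000):
    """Plain gradient descent on logistic loss."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1 / (1 + np.exp(-(X @ w)))
        w -= lr * X.T @ (p - y) / len(y)
    return w

X_tr, y_tr = make_data(confounded=True)
X_te, y_te = make_data(confounded=False)
w = train_logreg(X_tr, y_tr)

pred = (X_te @ w > 0).astype(int)
g_te = X_te[:, 1].astype(int)

# False-positive rate by group: the model leaned on the spurious group
# feature, so healthy members of group 1 get over-diagnosed.
for grp in (0, 1):
    mask = (g_te == grp) & (y_te == 0)
    print(grp, round(pred[mask].mean(), 3))
```

The model is "accurate" on data resembling its training set and systematically wrong by group in the real world, which is the scenario the comment describes.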
And melanin blocks rays beyond UV too, deeper into the spectrum; it has an absorption spectrum, that cannot be denied. And if you're taking all the pixels in an image, there might be aggregate effects as I described. You get a few million pixels, let the AI use every part of the buffalo, i.e. all the information in the picture, and you can get skin color through X-rays.
The question is what this says about Africans with light skin strictly because of albinism, i.e. lack of pigmentation, but who are otherwise totally African.
Frankly, even a freshly arrived alien from Mars or Titan could easily tell Icelanders, Mongols and Xhosa apart, without knowing anything about our culture. The fact that there has been a lot of interbreeding/admixture since the Age of Sail began, does not mean that there aren't meaningful biological differences between the original groups, which still obviously exist.
An analogy: much like the existence of twilight does not render the concept of night and day a 'social construct' either. We attach certain social meanings to those natural phenomena, and a 'working day' can easily stretch into 'astronomical night' (all too often!), but that does not mean that 'night' and 'day' do not exist outside of our cultural reference framework.
There is a social concept of 'race' which corresponds to the 'working day' concept in this analogy, e.g. 'BIPOC', claiming Asians as 'white adjacent' or classifying North Africans or Jews as 'white', even though they may not necessarily look white. But this is almost certainly not what the AI identified. This social concept of race would confuse a Martian alien unless he started to study the social and racial history of the U.S., and possibly even afterwards. It definitely confuses me, a random observer from Central Europe.
The social definition is used because it's the most scientifically meaningful and useful definition, one that avoids many of the issues with biological race realism.
Nations are uncontroversially recognized as social constructs. However, I'm certain that an AI could also distinguish images taken outdoors in Mexico from those taken in Finland. Additionally, I, a US citizen, cannot simply declare that I am now a citizen of France and expect to get a French passport.
However, it also means that what a nation is, is not set in stone for eternity. It means that different people can debate the precise definition of what constitutes a particular nation. It means that Czechoslovakia can become the Czech Republic and Slovakia. It means that not everyone agrees on whether Transnistria is an independent nation. It means that the EU can decide that a German citizen can have the same passport as a French citizen.
As a more controversial example, this is also the case when people talk about gender being a "social construct". It doesn't mean that we can simply pretend the ideas "men" and "women" don't exist (as people both declare and fear). But it does mean there is some flexibility in these terms, and we as a society can choose how we want these ideas to evolve.
Society is a complex and powerful part of our reality, arguably more impactful on us from day to day than most of physics (after all, we survived for hundreds of thousands of years without even understanding the basics of physics). Therefore something being classified as a "social construct" doesn't mean it "isn't real". Even more important is that individuals cannot choose how social constructs evolve. I cannot, for example, declare that since taxes are a social construct, I'm not paying them anymore. We can, however, as a society, change what these constructs mean and how they are interpreted.
Race picks out specific and arbitrary differences. For example, Hispanic is a distinct race in US society, but so are black and white based on skin color, and Indians and East Asians are lumped into one "race".
Ethnicities are not social constructs, but race is. The AI finds ethnic differences and correlates them with self-perceived social/racial classification.
"Race", as the evil social construct it is, takes ethnic differences and interprets them to mean some ethnicities are different races of humans than others, as in not just having different ancestors but being differently created or evolved, despite all evidence and every major religion saying all humans are one species (Homo sapiens) with a common ancestor.
I thought all this was obvious but the social climate recently is very weird.
From national geographic: “Race” is usually associated with biology and linked with physical characteristics such as skin color or hair texture. “Ethnicity” is linked with cultural expression and identification.
https://www.healthit.gov/isa/taxonomy/term/741/uscdi-v2
https://www.healthit.gov/isa/taxonomy/term/746/uscdi-v2
(I'm not claiming that this is an optimal approach, just pointing out how it works in most software today.)
Let the social "culture war" rage on. The only war I see going on in the west (U.S. mostly) is a _lack_ of culture.
In fact it can be medically harmful to think this way.
They discourage using race as a source of any physiological signal. They do allow using genetics, but the relevant situations are the many many ones where genetic testing isn't possible or doesn't yet provide useful signal.
Unaccountable institutions get captured very easily, and the race cult that's swept through our educated class has been a very powerful one.
[1] https://www.ama-assn.org/press-center/press-releases/new-ama...
An interesting question in the U.S. is "who is considered white?" There was a Supreme Court case in which someone who was literally from the Caucasus was ruled not white. This is why it's sociological, not scientific.
https://www.sceneonradio.org/episode-40-citizen-thind-seeing...
To give a contrived example: if I say people with ring fingers over 3 inches long are Longfings and people with ring fingers 3 inches or less are Shortfings, and then our society treats people differently based on being Longfing or Shortfing, this is a social construct that causes problems for people based on a contrived criterion with no real meaning. The same is true of race.
Sure, it’s possible that bias due to the radiographer is the culprit, but this seems unlikely.
This is a very contrived way to say that people share characteristics with other people. The real question is why people don't say that I belong to the six-foot tall bad-knees race.
Subspecies are found across species: they arise from geographic dispersion and geographic isolation, which humans underwent for tens and hundreds of thousands of years.
Welcome to the sciences of anatomy, anthropology, and forensics.
other differences:
- slow twitch vs fast twitch muscle
- teeth shape
- shapes and colors of various parts
- genetic susceptibility to & advantages against specific diseases
Just like Darwin's finches of the Galápagos, humans faced geographic dispersion, resulting in genetic, dietary (e.g. hunter-gatherer vs farmer, malnutrition), and geographical (e.g. altitude) pressures which over the course of millennia produce anatomical differences. We can see this effect across all biota: bacteria, plants, animals, and yes, humans.
help keep politics out of science.
£10 says that it's not that. Anatomy is extraordinarily hard, and AI isn't that good yet. Sure, different races have different layouts, but often that's only really obvious post mortem (i.e. when you can yank out the bones and look at them; there are of course corner cases where high-res CAT/MRI scans can pull out decent skeletal imagery in 3D). There are other cases, but those should be easy to account for.
If I had to bet, and I knew where the data was coming from, I'd say it's probably picking up on the style of imaging rather than anything anatomical. Not all X-rays have bones in them, and not all bones differ reliably enough to detect race.
> keep politics out of science.
Yes, precisely, which is why the experiment needs to be reproduced and theories tested through experimentation. This is important because unless we work out where this trait is coming from, we cannot be sure the diagnosis is correct. For example, those with sickle cell disease have a higher risk of bone damage[1], which could mean they are X-rayed more often. This could warp the dataset, causing false positives for sickle-cell-style bone damage.
[1] https://www.hopkinsmedicine.org/health/conditions-and-diseas...
This was my guess as well. I've spent a lot of time around radiology and AI (I used to work at a company specializing in it), and we reviewed a lot of the failure cases. There was one example where the model picked up on the hospital: one hospital was for higher-risk patients, so the model learned to assign all patients from that hospital to the disease category simply because they were at that hospital.
There are a ton of cases like this out there, especially when using public datasets (which in the medical field tend to be very unbalanced due to the difficulty of building a HIPAA-compliant public dataset).
Certainly possible! They do control for hospital and machine …
>Race prediction performance was also robust across models trained on single equipment and single hospital location on the chest x-ray and mammogram datasets
… but it’s also possible that different chest x-rays were being used for different diagnostic purposes and thus have a different imaging style, which a) may correlate with ethnicity and b) does not appear to be explicitly controlled for.
>"We found that deep learning models effectively predicted patient race even when the bone density information was removed for both MXR (AUC value for Black patients: 0·960 [CI 0·958–0·963]) and CXP (AUC value for Black patients: 0·945 [CI 0·94–0·949]) datasets. The average pixel thresholds for different tissues did not produce any usable signal to detect race (AUC 0·5). These findings suggest that race information was not localised within the brightest pixels within the image (eg, in the bone)."
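For readers unfamiliar with the metric in that quote: AUC is the probability that a randomly chosen positive case outscores a randomly chosen negative one, so an uninformative feature lands at 0.5, which is what the authors report for the pixel-threshold baseline. A small numpy sketch with simulated scores (not the paper's data):

```python
import numpy as np

def auc(scores, labels):
    """Rank-based AUC: P(random positive outscores random negative),
    with ties counted as half a win."""
    scores, labels = np.asarray(scores), np.asarray(labels)
    pos, neg = scores[labels == 1], scores[labels == 0]
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

rng = np.random.default_rng(0)
labels = rng.integers(0, 2, 2000)

informative = labels + rng.normal(size=2000)   # carries real signal
uninformative = rng.normal(size=2000)          # e.g. an average-pixel value

print(round(auc(informative, labels), 2))      # well above 0.5
print(round(auc(uninformative, labels), 2))    # near 0.5: no usable signal
```

So an AUC of 0.96 from images with bone-density information removed means the race signal is strong and is not simply average brightness.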
Our tools are so precise you can tell which parent a set of cousins had with DNA tests; this doesn't make them a different species/subspecies or race from each other, even if one group has red hair and the other has black.
It's the pointless lumping together of people who are genetically distinct and drawing arbitrary, unscientific lines that's the issue.
Presumably the same experiments that can detect Asian vs Black vs White could also detect the entirely made-up 'races' of AsianOrBlack, AsianOrWhite and WhiteOrBlack, since those are logically equivalent.
So are the races I made up a moment ago real things? No. But a computer can predict which category I'd assign; doesn't that make them real and important racial classifications? No, it means my made-up classifications map to other real genetic concepts at a lower level, like red hair.
Which came as a surprise to the ophthalmologists, because they aren't aware of any significant differences between male and female retinas.
[0] https://www.researchgate.net/publication/351558516_Predictin...
Ordinary computer vision can also identify race fairly accurately, the high pass filter thing is merely pointing out that ML classifiers don't work like human retinas.
It's astonishing how many epicycles HN comments are trying to introduce into a finding that anyone would have predicted. Research which confirms predictable things is valuable of course, but no apple carts have been upset.
Bone density, micro-fractures, and deviations in shape. The Mongols famously had bowed legs from spending a majority of their waking lives on horseback.
I think this is the point a lot of people are missing; they think, "So what if 'black' correlates to unhealthy and the model notices? It's just seeing the truth!"
However, I'm still wondering how this incorrectness works; can anyone explain?
Edit: Clue: The AI is predicting self-reported race, and the authors indicated that self-reported race correlates poorly to actual genetic differences.
I read once that a radiologist can't always explain what they see in an image that leads them to one diagnosis or another; they say that after seeing many of them, they just know.
So I suspect the same could be done for race. This would be a super interesting thing to try with some college students - pay them to train for a few days on images and see how they do.
It doesn't seem like noise in the images is a factor
Maybe this needs to be updated from physicists: https://xkcd.com/793/