If the AI is also implicitly learning to detect race from the images, it may learn a spurious association: people of race X usually have tumors and people of race Y usually do not.
The problem here is that neither the people training the model nor the clinical radiologists interpreting its output may realize that race was a confounding factor in training, so they'll be unaware that the model may be making racial inferences on real-world data.
If people of race X really do have a higher incidence rate for this specific type of cancer than race Y, maybe this is OK. But suppose the training/validation data set was biased in a way unknown to the people building the model, and in the real world race X and race Y have exactly the same incidence rate for this cancer. Then the model is likely to produce race-specific errors: over-flagging patients of race X and under-flagging patients of race Y.
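Here's a toy simulation of that failure mode, sketched under invented numbers (the 80%/20% training incidence, the noise level, and the "race proxy" feature are all assumptions for illustration, not anything from a real imaging pipeline). A plain logistic regression trained on a confounded data set ends up flagging the two groups at very different rates even when the real-world incidence is identical:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n, p_tumor_x, p_tumor_y):
    # Hypothetical setup: the model sees a "race proxy" feature
    # (0 = race X, 1 = race Y) plus one noisy feature that weakly
    # tracks the actual tumor.
    race = rng.integers(0, 2, n)
    p = np.where(race == 0, p_tumor_x, p_tumor_y)
    tumor = (rng.random(n) < p).astype(float)
    signal = tumor + rng.normal(0.0, 2.0, n)  # weak genuine signal
    return np.column_stack([race, signal]), tumor

# Biased training set: race X has tumors far more often than race Y.
X_train, y_train = make_data(5000, 0.8, 0.2)
# Real-world set: identical incidence for both races.
X_real, y_real = make_data(5000, 0.5, 0.5)

def fit_logreg(X, y, steps=2000, lr=0.1):
    # Plain logistic regression via gradient descent.
    Xb = np.column_stack([np.ones(len(X)), X])
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def predict(X, w):
    Xb = np.column_stack([np.ones(len(X)), X])
    return (1.0 / (1.0 + np.exp(-Xb @ w)) > 0.5).astype(float)

w = fit_logreg(X_train, y_train)
pred = predict(X_real, w)

# Positive ("tumor") rate per group on data where true incidence is equal.
rate_x = pred[X_real[:, 0] == 0].mean()
rate_y = pred[X_real[:, 0] == 1].mean()
print(f"flagged as tumor: race X {rate_x:.2f}, race Y {rate_y:.2f}")
```

The model leans heavily on the race-proxy coefficient because, in the training data, race was the cheapest predictor of the label; on the balanced real-world data that shows up directly as a large gap in flag rates between the two groups, exactly the race-specific error pattern described above.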