My experience working with hospital textual data is that, for the most part, it's either useless or doesn't exist. The radiologist reading the image is expected to phone the specialist who requested the read in order to figure out what to do with the image.
Hospital systems are atrocious at providing useful information anyway. They are often full of unnecessary or unimportant fields that the requesting side either doesn't know how to fill in, or will fill with general nonsense just to get the request through the system.
It gets worse when it's DICOMs: the format itself is a mess. You never know where to look for the useful information. The information is often created accidentally, by some automated process that is completely broken but doesn't create any visible artifacts for whoever handles the DICOM. E.g. the clock on the machine taking the image might be completely wrong, and nothing on the image itself shows it. Then, say, a researcher needs to work out the patient's age... and is off by a few decades.
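To make the age example concrete, here is a rough sketch of the kind of sanity check a study pipeline could run on the date fields. It assumes the values have already been pulled out of the header as strings: PatientBirthDate and StudyDate in the DICOM DA format (YYYYMMDD), and PatientAge in the AS format with a year suffix (e.g. "045Y"). The function names, the 120-year cap and the one-year tolerance are my own made-up choices, not anything from a real system. Note that it only catches values that are implausible; a date that is wrong but plausible sails straight through, which is exactly the problem.

```python
from datetime import date, datetime
from typing import List, Optional


def parse_da(value: str) -> Optional[date]:
    """Parse a DICOM DA string (YYYYMMDD); return None if empty or malformed."""
    value = value.strip()
    if len(value) != 8 or not value.isdigit():
        return None
    try:
        return datetime.strptime(value, "%Y%m%d").date()
    except ValueError:
        return None


def age_in_years(birth: date, study: date) -> int:
    """Whole years between birth and study date (negative if study is earlier)."""
    years = study.year - birth.year
    if (study.month, study.day) < (birth.month, birth.day):
        years -= 1
    return years


def check_age(birth_da: str, study_da: str, patient_age: str = "",
              max_age: int = 120, tolerance: int = 1) -> List[str]:
    """Return a list of problems found; an empty list means nothing was flagged.

    max_age and tolerance are arbitrary illustrative thresholds.
    """
    birth = parse_da(birth_da)
    study = parse_da(study_da)
    if birth is None or study is None:
        return ["missing or malformed date"]

    problems = []
    derived = age_in_years(birth, study)
    if derived < 0:
        problems.append("study date precedes birth date")
    elif derived > max_age:
        problems.append("derived age exceeds max_age")

    # Cross-check against the recorded PatientAge, if it is given in years.
    recorded = patient_age.strip()
    if len(recorded) == 4 and recorded[:3].isdigit() and recorded[3] == "Y":
        if abs(int(recorded[:3]) - derived) > tolerance:
            problems.append("recorded age disagrees with derived age")
    return problems


# A scanner whose clock has been reset produces a study date before the birth date.
print(check_age("19800101", "19700101"))
```

Anything that gets flagged still needs a human who knows the local setup to decide which of the fields is the wrong one.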
Every attempt I've seen so far to run a study in a hospital has ended with about 50% of the collected information being discarded as completely worthless because of how it was acquired.
Radiologists have general knowledge about the system in which they operate. They can identify cases where information is bogus even though it looks plausible. But this is often so tied to the context of their work that there's no hope of a practical automated solution any time soon. (And I'm talking about hospitals in well-to-do EU countries.)
NB. It might sound like I'm trying to undermine your work, but what I'm actually trying to say is that the environment in which you want to automate things isn't ready to be automated. It's very similar to self-driving cars: if we had built road infrastructure differently, the task of automating driving could have been a lot easier, but because the roads are so random and so dependent on local context, it's just too hard to make useful automation.