The controls on this system as described by OpenSAFELY are far superior to the ones I am aware of in Canada, which some researchers leveraged the pandemic crisis to push through as a means to squeeze the data toothpaste out of the tube before technologies like FHE and LLMs matured enough to create a more complete screen over individual records. The Canadian scheme removed some hard controls that prevented abuse and replaced them with discretionary controls without clear liability for their failure, I thought.
I am very cautious about the benefits of this kind of data aggregation, because the idea that researchers have altruism or respect for individual privacy is a myth. There is also the risk that it moves governance of personal health information repositories out of government and into academia, where there are no Access-to-Information/FOIA requirements, no background checks on the people accessing the health data and systems, no legally binding mandates limiting use, and no clear definition of the responsibilities of custodians, agents, providers, and other formal roles.
That said, OpenSAFELY's public logs could be an unbelievably good control against this, and they show a level of stewardship and respect for public trust that mirrors the principles and ideals held by privacy professionals who have spent their careers on these problems.
I'm used to building for hostile environments and for huge userbases that are at least 3% criminal, and the only thing those people understand is consequences. So cautiously, congratulations. But if this data gets used for digital identity, social credit, domestic passports, or restrictions on movement, association, and other basic freedoms, that is on you.
As a general statement, this is incorrect: in the UK (where OpenSAFELY operates) Universities are considered "public authorities" under the Freedom of Information Act 2000, and people have a right to request access to information that they hold.
And nearly everything except the patient data is public anyway!
Also, the actual data is held by the GP system suppliers, who have their own FOIA obligations.
I'm interested in this point. While I get that an overreaching state could enable the other parts, is a digital identity really such a bad thing?
In the UK there is a Tax login, NHS login, Life event login, benefits login, passport login, driver's licence login, council login...
The organisations hold and share all the data anyway but the fear around a single password is confusing.
I am one of the engineering team working on OpenSAFELY at the Bennett Institute.
Feel free to ask me any questions about the design or implementation of the system. We take patient privacy very seriously - it's something of a crusade for us!
Some more information:
- all the researcher code is at https://github.com/opensafely
- all the platform code is at https://github.com/opensafely-core
- high level architecture overview talk by myself and my colleague Becky at PyCon UK '22 https://www.youtube.com/watch?v=L55mq5wi3Cc
A small question, as I'm a bit tight on schedule this week to watch your PyCon talk (although I'll watch it this weekend or next week):
How does the system import/receive patient data? Does it get HL7 messages, or does it "just" import raw databases?
Thanks!