01.06.12
The power of linked health data
Source: National Health Executive May/June 2012
Unprecedented progress in health research is on the agenda, thanks to the official launch of the new CPRD, the Clinical Practice Research Datalink, within the MHRA. NHE spoke to the director of the CPRD, John Parkinson.
The new Clinical Practice Research Datalink is a world-class e-health secure research service that will provide a huge boost to medical and pharmaceutical researchers, improve public health, help the UK life sciences industries, and play a part in wider Government plans for growth.
Its director, John Parkinson – who previously managed the GPRD, which had a similar function within primary care – told NHE that the really exciting thing about the CPRD, which launched at the end of March, was the ability for linkages between datasets on a large scale. He also explained more about the extensive anonymisation and privacy technologies being used to protect patient confidentiality, and its role in aiding clinical trials.
More research and better research
Parkinson told NHE: “The NHS is an absolutely fabulous resource in terms of research. We have all this good data, which is in the main of very good quality; there is the NHS number, which we use for the linkages – but we never see that number – which ensures we can link all this data together; we have the primary care system, where you can only be registered to one GP. There are so many advantages for researchers of working inside the UK and NHS framework.
“It is our task to maximise the use of this resource for research – research that’s properly protocolised, and where that protocol is approved by the Independent Scientific Advisory Committee (ISAC) of the MHRA.”
Any group or person who can provide an ISAC-approved protocol will have their research project enabled, as long as they are, or work for, an organisation that the CPRD can have a legal contract with. The computers on which the researchers want to work have a certificate put on them allowing the relevant data to reside there, and they have to work with an RSA key – a code that changes every 30 seconds.
Parkinson said: “That means we know the organisation, we know the computers, we know the researchers. They also need to prove they are able to undertake the work they have set out in the protocol.
“The NHS data, through us, is such a fantastic repository and resource that there aren’t enough researchers in the UK to do all this work, so it’s our task to maximise the global power of researchers. We will have contracts with pharmaceutical companies in America and Europe, academics in Europe and Canada and the US; but importantly, our largest group will be academics in the UK, and the pharma companies and devices companies in the UK.”
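The article does not say how the rotating code mentioned above is generated, and RSA's own token algorithm is proprietary. As a rough illustration only, the sketch below uses the open TOTP standard (RFC 6238) instead, which produces the same kind of code that changes every 30 seconds. The function names and the idea of a shared per-researcher secret are assumptions for illustration, not a description of the CPRD's system.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Return an RFC 6238 time-based one-time code (a stand-in for a token code)."""
    # Every moment inside the same 30-second window maps to the same counter,
    # so the code stays fixed for the window and then changes.
    counter = unix_time // step
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation (RFC 4226): the low 4 bits of the last byte pick
    # which 4 bytes of the digest become the number.
    offset = digest[-1] & 0x0F
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


def verify(secret: bytes, submitted: str, unix_time: int) -> bool:
    """Check a submitted code against the code for the current window."""
    expected = totp(secret, unix_time)
    # Constant-time comparison avoids leaking how many digits matched.
    return hmac.compare_digest(expected.encode(), submitted.encode())
```

A server holding the same secret as the researcher's token can call `verify` with the current time; a code copied down a minute earlier no longer matches.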
Fees and feasibility
A fees structure applies to help fund the work of the CPRD, which will employ nearly 60 people when Parkinson finishes the recruitment process and which will eventually have to be self-sustaining financially. It also helps ensure it complies with European law on government subsidies and competition, considering there is also a global market for health data.
Academics will get access at roughly half the price of commercial companies, he said.
He went on: “The payments are based on data volumes and the amount of linkage we have to do. We might give a small discount for a whole range of services.
“Where there’s linked data, that’s provided specifically by us against a specific project. We do make certain of the data available online. But this is only to known organisations, known people, known computers, and with very powerful keys, about the movement of information. But, that has an additional advantage, actually, because every query they make on the data, we log here and we can audit. We can see exactly the queries that are being run. They can’t do anything with the data without an ISAC approval.
“The advantages of that are because you need to be able to have a preliminary look at the data: are there enough patients for a statistically significant result? We’re looking at thousands of projects a year, in the medium term – they need some central access to the system to do that.”
Linking the data
He said: “Our remit at CPRD is to pull together and link, on an anonymised person level, as many datasets that can benefit research projects as possible.
“We have the discharge summary, coded inside the primary care record: we want to link that to the hospital episode statistics data. That provides much more detail at ward level of what’s gone on when the patient has been transferred from ward to ward, both in terms of ICD codes and procedural OPCS-4 codes.
“Naturally, because quite a lot of the work is involved with medicine safety, and outcomes, we are linking with the central mortality data, because although GPs record about 90% of the causes and dates of death, we want to completely fill that in: it’s very important.
“We are also going to be linking with as many of the audit datasets as possible.”
The data controller for those datasets, although they are run from the DH, is HQIP, the Healthcare Quality Improvement Partnership. That then contracts the running of each dataset to other organisations.
When NHE spoke to Parkinson in late May, he said he had “just come off the phone” with the ONS about gaining access to the full birth record too, with the aim of creating a ‘pregnancy data-mart’, linked with the congenital abnormalities register, and then the creation of a similar mart of data for everything in the birth to age 18 period.
He said: “These are linkages which are going to be of benefit to lots of potential researchers, so we will create almost semi-permanent links to help that.
“A lot of researchers get access to specific datasets: for cancer researchers, for example, there are the cancer registries and central cancer data. But with many cancers these days, you have treatment and you have a long life after that. You may have another cancer at a later date, but the real way to follow that up is by linking with the primary care record.
“In the same way, there are six or seven cardiovascular datasets, one called MINAP (the Myocardial Ischaemia National Audit Project). That is a very detailed, highly granular dataset about what happens if you’re picked up by ambulance and taken to hospital with a MI – but having got over your MI, you will be released from hospital on a series of drugs back into primary care, and the way to follow those patients up in the longer term is again back into the GP database. You’ll only go back into the MINAP database if there’s a secondary heart attack.
“We’re adding a dimension which really extends the way we can follow up patients in the longer term, and ensure that all the care that’s being given is leading to the outcomes we expect.”
Making it public
The CPRD will expect all the research that comes out of the linkages it makes available to be put into the public domain, preferably in peer-reviewed journals when possible.
He said: “That’s how the NHS gains. It doesn’t matter who funds the research: a commercial company, an academic somewhere else in the world, or a UK academic, the NHS gains benefit from all of this research by publication of this in an open way. It improves clinical guidelines for the future, and patients really benefit in both the short and medium term.”
Anonymity
Clearly privacy concerns are paramount, with the CPRD acknowledging that multiple linkages of personal data increase the chances of identity disclosure. But the technologies involved in keeping the data secure and secret are very sophisticated.
Parkinson noted that technically, under the Data Protection Act, the release of anonymised data does not require specific patient consent. The CPRD will continue the system developed by the GPRD where GP practices put up a poster, and a message on their website, explaining that the surgery provides anonymous data for research. Any patient who does not want their data used can let the GP know, a ‘flag’ will be put in their record, and the data will not be used.
Parkinson said that in seven years of using that system at the GPRD, the number of opt-outs has been “incredibly small” – a few per practice. He explained that the data leaves the GP practice with a keyed patient ID, encrypted over the N3 network, with the key held in the practice and unknown to the CPRD. The CPRD then applies a second key that changes the number again.
Checks and linkages are then done via the NHS number, which again never travels to the CPRD.
Parkinson said: “That travels to our trusted third party, the Health & Social Care Information Centre, the body that has the NHS number files for the whole NHS. They create the linkages for us to other data, and return the data to us without ID.
“The more you link, the more potentially disclosive it is: we don’t try to avoid saying that. But we have charters with people who provide us data to describe what we’re going to do with the data and how, which gives them confidence in releasing it to us. We have powerful privacy enhancing technologies at each stage.”
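The flow described above (a keyed ID created in the practice, a second key applied by the CPRD, and linkage on the NHS number done only by a trusted third party) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the CPRD's implementation: the use of HMAC-SHA256, the key values, the record layout and every function name here are hypothetical, and the article does not say exactly what the practice sends to the third party.

```python
import hashlib
import hmac

# Hypothetical keys: the first never leaves the practice,
# the second never leaves the research service.
PRACTICE_KEY = b"known-only-to-the-practice"
SERVICE_KEY = b"known-only-to-the-service"


def keyed_id(key: bytes, identifier: str) -> str:
    """Replace an identifier with a keyed hash that cannot be reversed without the key."""
    return hmac.new(key, identifier.encode(), hashlib.sha256).hexdigest()


def practice_export(records):
    """Practice side: clinical data goes out under a keyed ID, with no NHS number.

    Assumption: a separate link file (keyed ID -> NHS number) goes only to
    the trusted third party.
    """
    clinical, link_file = {}, {}
    for record in records:
        pid = keyed_id(PRACTICE_KEY, record["local_id"])
        clinical[pid] = record["codes"]
        link_file[pid] = record["nhs_number"]
    return clinical, link_file


def third_party_link(link_file, hospital_by_nhs):
    """Trusted third party: match on NHS number, return data under the keyed ID only."""
    return {
        pid: hospital_by_nhs[nhs]
        for pid, nhs in link_file.items()
        if nhs in hospital_by_nhs
    }


def service_ingest(data_by_pid):
    """Research service: apply the second key so the ID changes again."""
    return {keyed_id(SERVICE_KEY, pid): value for pid, value in data_by_pid.items()}


def build_research_view(records, hospital_by_nhs):
    """Join primary care and hospital data under the doubly keyed research ID."""
    clinical, link_file = practice_export(records)
    primary = service_ingest(clinical)
    hospital = service_ingest(third_party_link(link_file, hospital_by_nhs))
    return {
        rid: {"primary": codes, "hospital": hospital.get(rid, [])}
        for rid, codes in primary.items()
    }
```

In this sketch the research service only ever handles the doubly keyed ID: it cannot recompute the practice's ID without the practice key, and the NHS number is seen only by the third party.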
He said even he, as director, does not know where in London their data centre is – he just knows what he needs to know, which is that it is incredibly secure: the highest possible level of security, he said.
The CPRD also has a ‘right of audit’ over all the researchers it releases data to, and has multiple organisations looking over its shoulder, such as the Information Commissioner’s Office, the National Information Governance Board for Health and Social Care, and the Trent ethics committee.
The MHRA connection
The NHS reforms have changed the nature of the MHRA, which from April 2013 will have three parts: the medicines and device regulation function, the CPRD, and a biological standards and control function.
The MHRA has had a research connection for the last 12 years through its management of the GPRD, and the organisation also contributed to the original UK Clinical Research Collaboration (UKCRC) report and to the subsequent NIHR Research Capability Programme (RCP) pilot, which showed clearly how the availability of more population-based data and more linked data will have a large beneficial effect on both observational and interventional research.
Parkinson said: “It was decided between the MHRA and DH and NIHR that the location of the service inside the medicines and devices regulator is a very good place for it: so much of the use of the data is around medicines and devices, and there are other uses related to disease epidemiology and better understanding of rare diseases.”
The CPRD will be of particular benefit to doctors and researchers dealing with rare diseases, Parkinson noted. He gave an anecdote of a conversation with a doctor interested in the data available on a “really quite rare” disease. Parkinson said: “I said ‘let’s give it a go’ – using anonymised data, I ran it, and in a few seconds, we had 87 cases. He said: ‘The power of that!’ Traditionally, it’s very hard to do research on rare diseases, because you can’t get access to all the patients. But we have a mechanism, through anonymisation, where we know which practice they’re in, so we would write a letter to the GP saying there’s a potential project on a rare disease, we’re aware you have one or two patients with this, are you willing to be part of it? If they are, it’s up to them to make contact with the patient and ask. What we find with rare diseases is that people often do.”
Improving clinical trials
That is one example of the CPRD’s remit to improve the way clinical trials are conducted. Its powerful query tools and access to so many linked datasets mean it will be able to ensure trials have enough patients, and that the inclusion/exclusion criteria have been set correctly. Parkinson said that often, a “subtle change” of a few percentage points in those criteria can mean 10-15% more patients.
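The kind of feasibility check described above, counting how many patients meet a trial's criteria and seeing what a small change does to that count, can be sketched as below. It is an illustration only: the `Criteria` fields, the patient record layout and the function names are hypothetical, and the CPRD's actual query tools are not described in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criteria:
    """Hypothetical trial criteria: an age window plus excluded conditions."""
    min_age: int
    max_age: int
    excluded_conditions: frozenset


def eligible(patient: dict, criteria: Criteria) -> bool:
    """A patient qualifies if inside the age window and free of excluded conditions."""
    in_window = criteria.min_age <= patient["age"] <= criteria.max_age
    has_exclusion = bool(patient["conditions"] & criteria.excluded_conditions)
    return in_window and not has_exclusion


def count_eligible(patients, criteria: Criteria) -> int:
    """Count the anonymised records that satisfy the criteria."""
    return sum(eligible(p, criteria) for p in patients)


def relative_gain(patients, baseline: Criteria, relaxed: Criteria) -> float:
    """Fractional change in eligible patients when moving from baseline to relaxed criteria."""
    base = count_eligible(patients, baseline)
    if base == 0:
        raise ValueError("baseline criteria match no patients")
    return (count_eligible(patients, relaxed) - base) / base
```

Running `relative_gain` on the same records with, say, the upper age limit nudged up by a couple of years shows how much a small change in the criteria would add to recruitment before a trial protocol is fixed.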
He said: “The UK has lost a little bit of traction in the last four or five years in terms of the number of trials of new medicines. These are global trials – pharma is global – and in conjunction with the NIHR we’re part of the quest to ensure more medicines are tested in the UK, which gives patients who want to be part of it an opportunity, and ensures, if the data is right, that those drugs get licensed in the UK.”
The CPRD is also working on better linkages between clinical trial datasets and patients’ electronic health records. The benefits run both ways: it ensures that useful historical medical information and granularity is available to those running the trial, and ensures that any health events during the trial are registered in the EHR. Eventually, Parkinson said, he hoped the CPRD would spend time talking to organisations like NHS Evidence about how it can help improve clinical care guidelines.