A data flow process for confidential data and its application in a health research project

Cited by: 4
Authors
Crossfield, Samantha S. R. [1]
Zucker, Kieran [2]
Baxter, Paul [3]
Wright, Penny [2]
Fistein, Jon [1]
Markham, Alex F. [1, 2]
Birkin, Mark [1]
Glaser, Adam W. [1, 2]
Hall, Geoff [1, 2]
Affiliations
[1] Univ Leeds, Leeds Inst Data Analyt, Leeds, W Yorkshire, England
[2] Univ Leeds, Leeds Inst Med Res, St Jamess, Leeds, W Yorkshire, England
[3] Univ Leeds, Leeds Inst Cardiovascular & Metab Med, Leeds, W Yorkshire, England
Source
PLOS ONE | 2022 / Vol. 17 / Issue 01
Funding
UK Medical Research Council; UK Economic and Social Research Council
Keywords
SECURE;
DOI
10.1371/journal.pone.0262609
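Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Subject classification code
07; 0710; 09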
Abstract
Background: The use of linked healthcare data in research has the potential to make major contributions to knowledge generation and service improvement. However, using healthcare data for secondary purposes raises legal and ethical concerns relating to confidentiality, privacy and data protection rights. Using a linkage and anonymisation approach that processes data lawfully and in line with ethical best practice to create an anonymous (non-personal) dataset can address these concerns, yet there is no set approach for defining all of the steps involved in such data flow end-to-end. We aimed to define such an approach with clear steps for dataset creation, and to describe its utilisation in a case study linking healthcare data.
Methods: We developed a data flow protocol that generates pseudonymous datasets that can be reversibly linked, or irreversibly linked to form an anonymous research dataset. It was designed and implemented by the Comprehensive Patient Records (CPR) study in Leeds, UK.
Results: We defined a clear approach that received ethico-legal approval for use in creating an anonymous research dataset. Our approach used individual-level linkage through a mechanism that is not computer-intensive and was rendered irreversible to both data providers and processors. We successfully applied it in the CPR study to hospital and general practice and community electronic health record data from two providers, along with patient reported outcomes, for 365,193 patients. The resultant anonymous research dataset is available via DATA-CAN, the Health Data Research Hub for Cancer in the UK.
Conclusions: Through ethical, legal and academic review, we believe that we contribute a defined approach that represents a framework that exceeds current minimum standards for effective pseudonymisation and anonymisation. This paper describes our methods and provides supporting information to facilitate the use of this approach in research.
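The abstract does not specify the linkage mechanism, so the following is only a minimal sketch of one generic way to link records on a pseudonym and then drop it; it is not the CPR study's protocol. It assumes a keyed hash (HMAC-SHA256) as the pseudonymisation step, and the field names (`patient_id`, `diagnosis`, `prescription`), the helper functions and the toy records are all hypothetical.

```python
import hashlib
import hmac
import secrets


def pseudonymise(identifier: str, key: bytes) -> str:
    """Return a keyed-hash pseudonym (HMAC-SHA256) for an identifier.

    Assumption: a keyed hash stands in for whatever pseudonymisation
    step a real data flow would use.
    """
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def link(hospital: list, gp: list, key: bytes) -> list:
    """Join two record sets on the shared pseudonym.

    The output rows carry neither the original identifier nor the
    pseudonym, only the (hypothetical) research fields.
    """
    gp_by_pseudonym = {pseudonymise(r["patient_id"], key): r for r in gp}
    linked = []
    for record in hospital:
        match = gp_by_pseudonym.get(pseudonymise(record["patient_id"], key))
        if match is not None:
            linked.append(
                {
                    "diagnosis": record["diagnosis"],
                    "prescription": match["prescription"],
                }
            )
    return linked


# Toy, made-up records from two hypothetical providers.
hospital = [
    {"patient_id": "A001", "diagnosis": "dx-1"},
    {"patient_id": "A002", "diagnosis": "dx-2"},
]
gp = [
    {"patient_id": "A001", "prescription": "rx-1"},
    {"patient_id": "A003", "prescription": "rx-3"},
]

key = secrets.token_bytes(32)  # secret linkage key
research_dataset = link(hospital, gp, key)

# Discarding the key means the pseudonyms can no longer be regenerated
# from identifiers within this sketch.
del key

print(research_dataset)
```

While the key is retained, the same identifier always maps to the same pseudonym, so datasets can be re-linked; once the key is discarded and the pseudonym is dropped from the output, the join cannot be repeated from the identifiers. A real anonymisation process would also need to address indirect identifiers in the remaining fields, which this sketch does not attempt.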
Pages: 15