A data flow process for confidential data and its application in a health research project

Cited by: 4
Authors
Crossfield, Samantha S. R. [1]
Zucker, Kieran [2]
Baxter, Paul [3]
Wright, Penny [2]
Fistein, Jon [1]
Markham, Alex F. [1, 2]
Birkin, Mark [1]
Glaser, Adam W. [1, 2]
Hall, Geoff [1, 2]
Affiliations
[1] Univ Leeds, Leeds Inst Data Analyt, Leeds, W Yorkshire, England
[2] Univ Leeds, Leeds Inst Med Res, St Jamess, Leeds, W Yorkshire, England
[3] Univ Leeds, Leeds Inst Cardiovascular & Metab Med, Leeds, W Yorkshire, England
Source
PLOS ONE | 2022 / Vol. 17 / Issue 01
Funding
UK Medical Research Council; UK Economic and Social Research Council
Keywords
SECURE;
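DOI
10.1371/journal.pone.0262609
Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification code
07; 0710; 09
Abstract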
Background: The use of linked healthcare data in research has the potential to make major contributions to knowledge generation and service improvement. However, using healthcare data for secondary purposes raises legal and ethical concerns relating to confidentiality, privacy and data protection rights. Using a linkage and anonymisation approach that processes data lawfully and in line with ethical best practice to create an anonymous (non-personal) dataset can address these concerns, yet there is no set approach for defining all of the steps involved in such data flow end-to-end. We aimed to define such an approach with clear steps for dataset creation, and to describe its utilisation in a case study linking healthcare data.

Methods: We developed a data flow protocol that generates pseudonymous datasets that can be reversibly linked, or irreversibly linked to form an anonymous research dataset. It was designed and implemented by the Comprehensive Patient Records (CPR) study in Leeds, UK.

Results: We defined a clear approach that received ethico-legal approval for use in creating an anonymous research dataset. Our approach used individual-level linkage through a mechanism that is not computer-intensive and was rendered irreversible to both data providers and processors. We successfully applied it in the CPR study to hospital and general practice and community electronic health record data from two providers, along with patient reported outcomes, for 365,193 patients. The resultant anonymous research dataset is available via DATA-CAN, the Health Data Research Hub for Cancer in the UK.

Conclusions: Through ethical, legal and academic review, we believe that we contribute a defined approach that represents a framework that exceeds current minimum standards for effective pseudonymisation and anonymisation. This paper describes our methods and provides supporting information to facilitate the use of this approach in research.
Pages: 15