Distributed Analytics on Sensitive Medical Data: The Personal Health Train

Cited by: 61
Authors
Beyan, Oya [1, 2]
Choudhury, Ananya [3]
van Soest, Johan [3, 4]
Kohlbacher, Oliver [5, 6, 7, 8]
Zimmermann, Lukas [7]
Stenzhorn, Holger [7]
Karim, Md Rezaul [1, 2]
Dumontier, Michel [4]
Decker, Stefan [1, 2]
Santos, Luiz Olavo Bonino da Silva [9]
Dekker, Andre [3]
Affiliations
[1] Fraunhofer Inst Appl Informat Technol FIT, D-53754 St Augustin, Germany
[2] Rhein Westfal TH Aachen, D-52056 Aachen, Germany
[3] Maastricht Univ, GROW Sch Oncol & Dev Biol, Dept Radiat Oncol MAASTRO, Med Ctr, NL-6200 MD Maastricht, Netherlands
[4] Maastricht Univ, Inst Data Sci, Univ Singel 60, NL-6229 ER Maastricht, Netherlands
[5] Univ Tubingen, Dept Comp Sci, D-72076 Tubingen, Baden Wurttembe, Germany
[6] Univ Tubingen, Quantitat Biol Ctr, D-72076 Tubingen, Baden Wurttembe, Germany
[7] Univ Tubingen, Inst Translat Bioinformat, D-72076 Tubingen, Baden Wurttembe, Germany
[8] Univ Tubingen, Ctr Bioinformat, Tubingen, Germany
[9] Go FAIR Int Support & Coordinat Off GFISCO, Leiden, Netherlands
Keywords
Distributed analytics; Data reuse; FAIR; Health data; Ethics and privacy
DOI
10.1162/dint_a_00032
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In recent years, as newer technologies have evolved around the healthcare ecosystem, more and more data have been generated. Advanced analytics could harness the data collected from numerous sources, whether from healthcare institutions or generated by individuals themselves via apps and devices, and lead to innovations in the treatment and diagnosis of diseases, improve the care given to patients, and empower citizens to participate in decision-making regarding their own health and well-being. However, the sensitive nature of health data prevents healthcare organizations from sharing the data. The Personal Health Train (PHT) is a novel approach that aims to establish a distributed data analytics infrastructure enabling the (re)use of distributed healthcare data while data owners stay in control of their own data. The main principle of the PHT is that data remain in their original location, and analytical tasks visit the data sources and execute there. The PHT provides a distributed, flexible approach to using data in a network of participants, incorporating the FAIR principles. It facilitates the responsible use of sensitive and/or personal data by adopting international principles and regulations. This paper presents the concepts and main components of the PHT and demonstrates how it complies with the FAIR principles.
Pages: 96-107
Number of pages: 12
Related papers
50 in total
  • [31] Distributed Big Data Analytics in Service Computing
    Yu, Weider D.
    Gottumukkala, AvinashChander
    Senthailselvi, Deenash Arivazhagan
    Maniraj, Prabhu
    Khonde, Tushar
    [J]. 2017 IEEE 13TH INTERNATIONAL SYMPOSIUM ON AUTONOMOUS DECENTRALIZED SYSTEMS (ISADS 2017), 2017, : 55 - 60
  • [32] Data Analytics Algorithm Benchmark on Distributed Systems
    Hamid, Mohd Hakim Abdul
    Abu, Nur Azman
    Mohamad, Siti Nurul Mahfuzah
    Idris, Ariff
    Zakaria, Zahriladha
    Sulaiman, Zuraidah
    [J]. PROCEEDINGS OF THE 3RD INTERNATIONAL CONFERENCE ON APPLIED SCIENCE AND TECHNOLOGY (ICAST'18), 2018, 2016
  • [33] Distributed Data Analytics Framework for Smart Transportation
    Howard, Alexander J.
    Lee, Tim
    Mahar, Sara
    Intrevado, Paul
    Myung-kyung, Diane
    [J]. IEEE 20TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS / IEEE 16TH INTERNATIONAL CONFERENCE ON SMART CITY / IEEE 4TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND SYSTEMS (HPCC/SMARTCITY/DSS), 2018, : 1374 - 1380
  • [34] Visually Programming Dataflows for Distributed Data Analytics
    Thamsen, Lauritz
    Renner, Thomas
    Byfeld, Marvin
    Paeschke, Markus
    Schroeder, Daniel
    Boehm, Felix
    [J]. 2016 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2016, : 2285 - 2294
  • [35] Enabling Posthumous Medical Data Donation: An Appeal for the Ethical Utilisation of Personal Health Data
    Jenny Krutzinna
    Mariarosaria Taddeo
    Luciano Floridi
    [J]. Science and Engineering Ethics, 2019, 25 : 1357 - 1387
  • [36] Enabling Posthumous Medical Data Donation: An Appeal for the Ethical Utilisation of Personal Health Data
    Krutzinna, Jenny
    Taddeo, Mariarosaria
    Floridi, Luciano
    [J]. SCIENCE AND ENGINEERING ETHICS, 2019, 25 (05) : 1357 - 1387
  • [37] Pangea: Monolithic Distributed Storage for Data Analytics
    Zou, Jia
    Iyengar, Arun
    Jermaine, Chris
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2019, 12 (06): : 681 - 694
  • [38] Dynamic Behavioral Analytics in Weight-Loss Incentive Design Based on Personal Health Data
    Gong, Jianxia
    Zhao, Lindu
    [J]. KNOWLEDGE-BASED AND INTELLIGENT INFORMATION & ENGINEERING SYSTEMS (KSE 2021), 2021, 192 : 3822 - 3831
  • [39] Train Delay Prediction Systems: A Big Data Analytics Perspective
    Oneto, Luca
    Fumeo, Emanuele
    Clerico, Giorgio
    Canepa, Renzo
    Papa, Federico
    Dambra, Carlo
    Mazzino, Nadia
    Anguita, Davide
    [J]. BIG DATA RESEARCH, 2018, 11 : 54 - 64
  • [40] Speculative Distributed CSV Data Parsing for Big Data Analytics
    Ge, Chang
    Li, Yinan
    Eilebrecht, Eric
    Chandramouli, Badrish
    Kossmann, Donald
    [J]. SIGMOD '19: PROCEEDINGS OF THE 2019 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2019, : 883 - 899