Synthetic Data Digital Twins and Data Trusts Control for Privacy in Health Data Sharing

Cited by: 0
Authors
Lomotey, Richard K. [1]
Kumi, Sandra [2]
Ray, Madhurima [3]
Deters, Ralph [2]
Affiliations
[1] Penn State Univ, Informat Sci & Tech, Monaca, PA 15061 USA
[2] Univ Saskatchewan, Dept Comp Sci, Saskatoon, SK, Canada
[3] Penn State Univ, Dept Comp Sci, Monaca, PA USA
Keywords
Synthetic Health Data; Digital Twins; Data Trusts; Machine Learning; Artificial Intelligence; Privacy; Middleware; FRAMEWORK;
DOI
10.1145/3643650.3658605
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Health data sharing is highly valuable for medical research because it can improve diagnostics, policy, medication, and more. At the same time, health data must be shared without compromising the privacy of patients and stakeholders. However, recent advances in AI/ML and sophisticated analytics have been shown to introduce biases that make patients easy to identify from their healthcare data, which violates their privacy. In this work, we seek to address this issue by exploring two emerging topics that are gaining attention from industry, academia, and governments: digital twins and data trusts. First, we propose the use of digital twins (DTs) to generate synthetic records of patients' heart rate data. The DTs are virtual replicas of the actual data and were created using two synthetic data generative models: Gaussian Copula (GC) and Tabular Variational Autoencoder (TVAE). GC and TVAE achieved maximum data quality scores of 88% and 96%, respectively. Next, we posit that the DTs should be shared with a data trusts layer. Data trusts are fiduciary frameworks that govern multi-party data sharing. The data trusts enforce access controls on the synthetic health data (based on location, role, and policy) and report to the data subject. Preliminary evaluations show that combining the two techniques (i.e., synthetic data digital twins and data trusts) enforces better privacy for health data access. The synthetic data provides stronger anonymization, while the data trusts provide easy auditing, tracking, and efficient reporting to the patient or data subject. The paper also details the architectural design of the data trusts and evaluates the efficiency of the access control techniques.
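The abstract names the Gaussian Copula as one of the two generative models but does not show how such a model produces synthetic records. The following is a minimal sketch of the general idea only, not the authors' implementation: it uses only the Python standard library, and the second column (`steps`), the toy values, and all function names are hypothetical choices made for illustration. The sketch converts each column to normal scores by rank, estimates the correlation between those scores, samples correlated normal pairs, and maps them back through each column's empirical distribution.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and standard deviation 1


def normal_scores(values):
    """Replace each value with the standard-normal quantile of its rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        # rank / (n + 1) keeps the probability strictly inside (0, 1)
        scores[idx] = STD_NORMAL.inv_cdf(rank / (n + 1))
    return scores


def correlation(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def empirical_quantile(sorted_values, u):
    """Return the observed value at probability u (0 <= u <= 1)."""
    idx = min(int(u * len(sorted_values)), len(sorted_values) - 1)
    return sorted_values[idx]


def sample_gaussian_copula(col_a, col_b, n_samples, seed=0):
    """Draw synthetic (a, b) pairs that mimic the dependence between columns."""
    rng = random.Random(seed)
    rho = correlation(normal_scores(col_a), normal_scores(col_b))
    # Guard against tiny negative values caused by floating-point rounding.
    residual = max(0.0, 1.0 - rho * rho) ** 0.5
    sorted_a, sorted_b = sorted(col_a), sorted(col_b)
    rows = []
    for _ in range(n_samples):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + residual * rng.gauss(0.0, 1.0)  # correlated with z1
        u1, u2 = STD_NORMAL.cdf(z1), STD_NORMAL.cdf(z2)
        rows.append((empirical_quantile(sorted_a, u1),
                     empirical_quantile(sorted_b, u2)))
    return rows


# Toy, made-up readings (illustrative only, not data from the paper).
heart_rate = [62, 65, 70, 74, 81, 88, 95, 103, 110, 121]
steps = [0, 2, 5, 9, 20, 35, 48, 60, 75, 90]
synthetic = sample_gaussian_copula(heart_rate, steps, n_samples=200, seed=1)
```

Because this sketch maps samples back through the empirical distribution, every synthetic value is one of the observed values. A production synthesizer would typically fit smooth marginal distributions instead, and the quality scores reported in the abstract come from the authors' own models and evaluation, not from anything shown here.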
Pages: 1-10
Page count: 10
Related papers
50 in total
  • [1] SleepSynth: Evaluating the use of Synthetic Data in Health Digital Twins
    Kumi, Sandra
    Hilton, Maxwell
    Snow, Charles
    Lomotey, Richard K.
    Deters, Ralph
    [J]. 2023 IEEE INTERNATIONAL CONFERENCE ON DIGITAL HEALTH, ICDH, 2023, : 121 - 130
  • [2] Digital health fiduciaries: protecting user privacy when sharing health data
    Chirag Arora
    [J]. Ethics and Information Technology, 2019, 21 : 181 - 196
  • [3] Digital health fiduciaries: protecting user privacy when sharing health data
    Arora, Chirag
    [J]. ETHICS AND INFORMATION TECHNOLOGY, 2019, 21 (03) : 181 - 196
  • [4] Data Sharing Governance in Digital Twins and Smart Cities: The European Data Strategy
    Sapienza, Salvatore
    Palmirani, Monica
    Greco, Sara
    [J]. ELECTRONIC GOVERNMENT AND THE INFORMATION SYSTEMS PERSPECTIVE, EGOVIS 2024, 2024, 14913 : 184 - 197
  • [5] Health Data and Privacy in the Digital Era
    Gostin, Lawrence O.
    Halabi, Sam F.
    Wilson, Kumanan
    [J]. JAMA-JOURNAL OF THE AMERICAN MEDICAL ASSOCIATION, 2018, 320 (03) : 233 - 234
  • [6] Reliability of Supervised Machine Learning Using Synthetic Data in Health Care: Model to Preserve Privacy for Data Sharing
    Rankin, Debbie
    Black, Michaela
    Bond, Raymond
    Wallace, Jonathan
    Mulvenna, Maurice
    Epelde, Gorka
    [J]. JMIR MEDICAL INFORMATICS, 2020, 8 (07)
  • [7] Digital Twins for Stress Management Utilizing Synthetic Data
    Kumi, Sandra
    Ray, Madhurima
    Walia, Sanskriti
    Lomotey, Richard K.
    Deters, Ralph
    [J]. 2024 IEEE 5TH ANNUAL WORLD AI IOT CONGRESS, AIIOT 2024, 2024, : 0329 - 0335
  • [8] Data trusts will not be the final word on data sharing, but they might help
    Hardinges, Jack
    Wells, Peter
    [J]. PUBLIC MONEY & MANAGEMENT, 2019, 39 (05) : 320 - 321
  • [9] Synthetic data generation for digital twins: enabling production systems analysis in the absence of data
    Lopes, Paulo Victor
    Silveira, Leonardo
    Guimaraes Aquino, Roberto Douglas
    Ribeiro, Carlos Henrique
    Skoogh, Anders
    Verri, Filipe Alves Neto
    [J]. INTERNATIONAL JOURNAL OF COMPUTER INTEGRATED MANUFACTURING, 2024,
  • [10] Sharing health-related data: a privacy test?
    Dyke, Stephanie O. M.
    Dove, Edward S.
    Knoppers, Bartha M.
    [J]. NPJ GENOMIC MEDICINE, 2016, 1