Analysis of Missingness Scenarios for Observational Health Data

Authors
Zamanian, Alireza [1, 2]
von Kleist, Henrik [1, 3]
Ciora, Octavia-Andreea [2]
Piperno, Marta [2]
Lancho, Gino [2]
Ahmidi, Narges [2]
Affiliations
[1] Tech Univ Munich, TUM Sch Computat Informat & Technol, Dept Comp Sci, D-85748 Munich, Germany
[2] Fraunhofer Inst Cognit Syst IKS, D-80686 Munich, Germany
[3] Helmholtz Ctr Munich, Inst Computat Biol, D-80939 Munich, Germany
Source
JOURNAL OF PERSONALIZED MEDICINE | 2024 / Volume 14 / Issue 05
Keywords
missing data analysis; observational health data; missingness scenarios; missing data assumptions; missingness distribution shift; MULTIPLE IMPUTATION; RISK; PREDICTION; MONOTONE; MODEL; SCORE;
DOI
10.3390/jpm14050514
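Chinese Library Classification number
R19 [Health care organizations and services (health services administration)];
Subject classification number
Abstract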
Simple Summary: This paper argues for the importance of considering domain knowledge when dealing with missing data in healthcare. We identify fundamental missingness scenarios in healthcare facilities and show how they affect missing data analysis methods.
Abstract: Despite the extensive literature on missing data theory and cautionary articles emphasizing the importance of realistic analysis for healthcare data, a critical gap persists in incorporating domain knowledge into missing data methods. In this paper, we argue that the remedy is to identify the key scenarios that lead to data missingness and investigate their theoretical implications. Based on this proposal, we first introduce an analysis framework in which we investigate how different observation agents, such as physicians, influence data availability, and then scrutinize each scenario with respect to the steps in the missing data analysis. We apply this framework to the case study of observational data in healthcare facilities. We identify ten fundamental missingness scenarios and show how they influence the identification step for missing data graphical models, inverse probability weighting estimation, and exponential tilting sensitivity analysis. To emphasize how domain-informed analysis can improve method reliability, we conduct simulation studies under the influence of various missingness scenarios. We compare the results of three common methods in medical data analysis: complete-case analysis, MissForest imputation, and inverse probability weighting estimation. The experiments are conducted for two objectives: variable mean estimation and classification accuracy. We advocate for our analysis approach as a reference for observational health data analysis. Beyond that, we also posit that the proposed analysis framework is applicable to other medical domains.
Pages: 32
Related papers
50 in total
  • [1] A Conference (Missingness in Action) to Address Missingness in Data and AI in Health Care: Qualitative Thematic Analysis
    Rose, Christian
    Barber, Rachel
    Preiksaitis, Carl
    Kim, Ireh
    Mishra, Nikesh
    Kayser, Kristen
    Brown, Italo
    Gisondi, Michael
    [J]. JOURNAL OF MEDICAL INTERNET RESEARCH, 2023, 25
  • [2] Integrative Clustering Analysis for Omics Data with Missingness
    Zhao, Yinqi
    Darst, Burcu
    Conti, David V.
    [J]. GENETIC EPIDEMIOLOGY, 2021, 45 (07) : 806 - 806
  • [3] Nonparametric analysis of factorial designs with random missingness: Bivariate data
    Akritas, Michael G.
    Antoniou, Efi S.
    Kuha, Jouni
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2006, 101 (476) : 1513 - 1526
  • [4] Gender-related data missingness, imbalance and bias in global health surveys
    Weber, Ann M.
    Gupta, Ribhav
    Abdalla, Safa
    Cislaghi, Beniamino
    Meausoone, Valerie
    Darmstadt, Gary L.
    [J]. BMJ GLOBAL HEALTH, 2021, 6 (11)
  • [5] Analyzing longitudinal clinical trial data with nonignorable missingness and unknown missingness reasons
    Xie, Hui
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2012, 56 (05) : 1287 - 1300
  • [6] Multiple Imputation for Robust Cluster Analysis to Address Missingness in Medical Data
    Harder, Arnold A.
    Olbricht, Gayla R.
    Ekuma, Godwin
    Hier, Daniel B.
    Obafemi-Ajayi, Tayo
    [J]. IEEE ACCESS, 2024, 12 : 42974 - 42991
  • [7] Which patients have missing data? An analysis of missingness in a trauma registry
    O'Reilly, Gerard M.
    Cameron, Peter A.
    Jolley, Damien J.
    [J]. INJURY-INTERNATIONAL JOURNAL OF THE CARE OF THE INJURED, 2012, 43 (11) : 1917 - 1923
  • [8] THE ROLE OF MISSINGNESS IN DAILY DIARY DATA
    Yildiz, Mustafa
    Winstead, Vicki
    Pickering, Carolyn
    [J]. INNOVATION IN AGING, 2023, 7 : 604 - 604
  • [9] Bayesian causal inference for observational studies with missingness in covariates and outcomes
    Zang, Huaiyu
    Kim, Hang J.
    Huang, Bin
    Szczesniak, Rhonda
    [J]. BIOMETRICS, 2023, 79 (04) : 3624 - 3636
  • [10] Learning from data with structured missingness
    Mitra, Robin
    McGough, Sarah F.
    Chakraborti, Tapabrata
    Holmes, Chris
    Copping, Ryan
    Hagenbuch, Niels
    Biedermann, Stefanie
    Noonan, Jack
    Lehmann, Brieuc
    Shenvi, Aditi
    Doan, Xuan Vinh
    Leslie, David
    Bianconi, Ginestra
    Sanchez-Garcia, Ruben
    Davies, Alisha
    Mackintosh, Maxine
    Andrinopoulou, Eleni-Rosalina
    Basiri, Anahid
    Harbron, Chris
    MacArthur, Ben D.
    [J]. NATURE MACHINE INTELLIGENCE, 2023, 5 (01) : 13 - 23