A Multi-directional Approach for Missing Value Estimation in Multivariate Time Series Clinical Data

Cited by: 6
Authors
Xu, Xiao [1 ]
Liu, Xiaoshuang [1 ]
Kang, Yanni [1 ]
Xu, Xian [1 ]
Wang, Junmei [1 ]
Sun, Yuyao [1 ]
Chen, Quanhe [1 ]
Jia, Xiaoyu [1 ]
Ma, Xinyue [1 ]
Meng, Xiaoyan [1 ]
Li, Xiang [1 ]
Xie, Guotong [1 ]
Affiliations
[1] Ping Hlth Technol, Beijing, Peoples R China
Keywords
Multi-directional; Missing Value Estimation; Multivariate time series; Feature engineering; Gradient boosting tree; Imputation
DOI
10.1007/s41666-020-00076-2
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Missing values are common in clinical datasets and hinder clinical data analysis. Correctly estimating the missing parts is critical to applying such analysis approaches. However, only a limited number of works focus on missing value estimation for multivariate time series (MTS) clinical data, which is one of the most challenging data types in this area. We attempt to develop a methodology (MD-MTS) with high accuracy for missing value estimation in MTS clinical data. In MD-MTS, temporal and cross-variable information is constructed as multi-directional features for an efficient gradient boosting decision tree (LightGBM). For each patient, temporal information represents the sequential relations among the values of one variable at different time-stamps, and cross-variable information refers to the correlations among the values of different variables at a fixed time-stamp. We evaluated estimation performance by the gap between the true values and the estimated values on randomly masked parts. MD-MTS outperformed three baseline methods (3D-MICE, Amelia II and BRITS) on the ICHI challenge 2019 datasets, which contain 13 time series variables. The root-mean-square errors of MD-MTS, 3D-MICE, Amelia II and BRITS on the offline-test dataset are 0.1717, 0.2247, 0.1900, and 0.1862, respectively. On the online-test dataset, the errors of the former three methods are 0.1720, 0.2235, and 0.1927, respectively. Furthermore, MD-MTS ranked first in the ICHI challenge 2019 among dozens of competing models. MD-MTS provides an accurate and robust approach for estimating the missing values in MTS clinical data, and it can easily be used as a preprocessing step for downstream clinical data analysis.
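The abstract describes two feature directions (temporal and cross-variable) and a masked-value evaluation. The following is a minimal sketch of those ideas in plain Python, not the authors' implementation. The function names, the specific features (nearest observed neighbour and its distance), and the variable names `glucose` and `sodium` are all assumptions made for illustration. The resulting feature rows would be passed to a gradient boosting regressor such as LightGBM, which is not shown here.

```python
import math
import random


def temporal_features(values, t):
    """Nearest observed value before and after index t, with the gaps (assumed features)."""
    feats = {"prev_val": None, "prev_gap": None, "next_val": None, "next_gap": None}
    for i in range(t - 1, -1, -1):          # scan backwards in time
        if values[i] is not None:
            feats["prev_val"], feats["prev_gap"] = values[i], t - i
            break
    for i in range(t + 1, len(values)):     # scan forwards in time
        if values[i] is not None:
            feats["next_val"], feats["next_gap"] = values[i], i - t
            break
    return feats


def cross_variable_features(series, target, t):
    """Values of every other variable at the same time-stamp t."""
    return {f"{name}_now": vals[t] for name, vals in series.items() if name != target}


def build_row(series, target, t):
    """Combine both directions into one feature row for (target, t)."""
    row = temporal_features(series[target], t)
    row.update(cross_variable_features(series, target, t))
    return row


def mask_randomly(values, fraction, rng):
    """Hide a fraction of the observed values; return the masked copy and the hidden truth."""
    observed = [i for i, v in enumerate(values) if v is not None]
    k = max(1, int(len(observed) * fraction))
    masked, truth = list(values), {}
    for i in rng.sample(observed, k):
        truth[i] = masked[i]
        masked[i] = None
    return masked, truth


def masked_rmse(truth, estimates):
    """Root-mean-square error computed only on the masked positions."""
    squared = [(estimates[i] - v) ** 2 for i, v in truth.items()]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical patient with two variables; None marks a missing value.
patient = {
    "glucose": [5.0, None, 6.0, 7.0, None],
    "sodium": [140.0, 141.0, None, 139.0, 138.0],
}
row = build_row(patient, "glucose", 1)
masked, truth = mask_randomly(patient["sodium"], 0.5, random.Random(0))
```

Here `row` holds the features for the missing `glucose` value at index 1, and `masked_rmse(truth, estimates)` would score any estimator's predictions on the hidden `sodium` entries, which mirrors the evaluation on randomly masked parts described in the abstract.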
Pages: 365-382
Number of pages: 18
Related papers
50 in total
  • [41] RULE EVOLUTION APPROACH FOR MINING MULTIVARIATE TIME SERIES DATA
    Nguyen, Viet-An
    Gopalkrishnan, Vivekanand
    ICEIS 2008: PROCEEDINGS OF THE TENTH INTERNATIONAL CONFERENCE ON ENTERPRISE INFORMATION SYSTEMS, VOL AIDSS: ARTIFICIAL INTELLIGENCE AND DECISION SUPPORT SYSTEMS, 2008, : 19 - 26
  • [42] General value functions for fault detection in multivariate time series data
    Wong, Andy
    Jazi, Mehran Taghian
    Takeuchi, Tomoharu
    Gunther, Johannes
    Zaiane, Osmar
    FRONTIERS IN ROBOTICS AND AI, 2024, 11
  • [43] ALTERNATE APPROACH TO MISSING VALUE ESTIMATION
    JAECH, JL
    AMERICAN STATISTICIAN, 1966, 20 (05): : 27 - 29
  • [44] Shrinkage estimation for multivariate time series
    Yan Liu
    Yoshiyuki Tanida
    Masanobu Taniguchi
    Statistical Inference for Stochastic Processes, 2021, 24 : 733 - 751
  • [45] Shrinkage estimation for multivariate time series
    Liu, Yan
    Tanida, Yoshiyuki
    Taniguchi, Masanobu
    STATISTICAL INFERENCE FOR STOCHASTIC PROCESSES, 2021, 24 (03) : 733 - 751
  • [46] High capacity data hiding based on multi-directional pixel value differencing and decreased difference expansion
    Pratap Chandra Mandal
    Imon Mukherjee
    Multimedia Tools and Applications, 2022, 81 : 5325 - 5347
  • [47] Missing value imputation in multivariate time series with end-to-end generative adversarial networks
    Zhang, Ying
    Zhou, Baohang
    Cai, Xiangrui
    Guo, Wenya
    Ding, Xiaoke
    Yuan, Xiaojie
    Information Sciences, 2021, 551 : 67 - 82
  • [48] The Relationship of Time Span and Missing Data on the Noise Model Estimation of GNSS Time Series
    Sun, Xiwen
    Lu, Tieding
    Hu, Shunqiang
    Huang, Jiahui
    He, Xiaoxing
    Montillet, Jean-Philippe
    Ma, Xiaping
    Huang, Zhengkai
    REMOTE SENSING, 2023, 15 (14)
  • [49] High capacity data hiding based on multi-directional pixel value differencing and decreased difference expansion
    Mandal, Pratap Chandra
    Mukherjee, Imon
    MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (04) : 5325 - 5347
  • [50] Missing value imputation in multivariate time series with end-to-end generative adversarial networks
    Zhang, Ying
    Zhou, Baohang
    Cai, Xiangrui
    Guo, Wenya
    Ding, Xiaoke
    Yuan, Xiaojie
    INFORMATION SCIENCES, 2021, 551 : 67 - 82