Estimating Missing Data in Temporal Data Streams Using Multi-Directional Recurrent Neural Networks

Cited by: 154
Authors
Yoon, Jinsung [1]
Zame, William R. [2]
van der Schaar, Mihaela [3, 4]
Affiliations
[1] Univ Calif Los Angeles, Dept Elect & Comp Engn, Los Angeles, CA 90095 USA
[2] Univ Calif Los Angeles, Dept Econ & Math, Los Angeles, CA USA
[3] Univ Oxford, Dept Engn Sci, Oxford, England
[4] Alan Turing Inst, London, England
Funding
US National Science Foundation
Keywords
Missing data; temporal data streams; imputation; recurrent neural nets; MULTIPLE-IMPUTATION;
DOI
10.1109/TBME.2018.2874712
Chinese Library Classification number
R318 [Biomedical Engineering]
Discipline classification code
0831
Abstract
Missing data is a ubiquitous problem. It is especially challenging in medical settings because many streams of measurements are collected at different (and often irregular) times. Accurate estimation of the missing measurements is critical for many reasons, including diagnosis, prognosis, and treatment. Existing methods address this estimation problem either by interpolating within data streams or imputing across data streams (both of which ignore important information), or by ignoring the temporal aspect of the data and imposing strong assumptions about the nature of the data-generating process and/or the pattern of missing data (both of which are especially problematic for medical data). We propose a new approach, based on a novel deep learning architecture that we call a Multi-directional Recurrent Neural Network, which interpolates within data streams and imputes across data streams. We demonstrate the power of our approach by applying it to five real-world medical datasets. We show that it provides dramatically improved estimation of missing measurements in comparison to 11 state-of-the-art benchmarks (including Spline and Cubic Interpolations, MICE, MissForest, matrix completion, and several RNN methods); typical improvements in Root Mean Squared Error are between 35% and 50%. Additional experiments based on the same five datasets demonstrate that the improvements provided by our method are extremely robust.
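The two baseline strategies the abstract contrasts, and the Root Mean Squared Error used to compare them, can be illustrated with a minimal sketch. This is not the authors' Multi-directional Recurrent Neural Network; the function names, the linear-interpolation and one-variable regression choices, and the toy values are all assumptions made for illustration only.

```python
import math

def interpolate_within_stream(values):
    """Fill None gaps in ONE stream by linear interpolation over time."""
    known = [i for i, v in enumerate(values) if v is not None]
    out = list(values)
    for i, v in enumerate(values):
        if v is not None or not known:
            continue
        prev = max((k for k in known if k < i), default=None)
        nxt = min((k for k in known if k > i), default=None)
        if prev is None:          # gap at the start: copy the next value
            out[i] = values[nxt]
        elif nxt is None:         # gap at the end: copy the last value
            out[i] = values[prev]
        else:
            w = (i - prev) / (nxt - prev)
            out[i] = values[prev] + w * (values[nxt] - values[prev])
    return out

def impute_across_streams(target, other):
    """Fill None gaps in `target` from `other` at the SAME time step,
    using a least-squares line fitted where both are observed.
    Time ordering is ignored entirely."""
    pairs = [(x, y) for x, y in zip(other, target)
             if x is not None and y is not None]
    mx = sum(x for x, _ in pairs) / len(pairs)
    my = sum(y for _, y in pairs) / len(pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    slope = sum((x - mx) * (y - my) for x, y in pairs) / sxx if sxx else 0.0
    return [y if y is not None
            else (my + slope * (x - mx) if x is not None else None)
            for x, y in zip(other, target)]

def rmse(truth, estimate, missing_idx):
    """Root Mean Squared Error over the positions that were missing."""
    sq = [(truth[i] - estimate[i]) ** 2 for i in missing_idx]
    return math.sqrt(sum(sq) / len(sq))

# Hypothetical toy data: one stream with two gaps, one fully observed stream.
truth = [60.0, 63.0, 64.0, 66.0, 68.0]
observed = [60.0, None, 64.0, None, 68.0]
other_stream = [1.0, 2.0, 3.0, 4.0, 5.0]
missing = [i for i, v in enumerate(observed) if v is None]

within = interpolate_within_stream(observed)
across = impute_across_streams(observed, other_stream)
print(rmse(truth, within, missing), rmse(truth, across, missing))
```

Each function uses only one source of information: the first sees only the stream's own past and future values, the second sees only other streams at the same time step. The abstract's argument is that a method combining both sources can do better than either alone.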
Pages: 1477-1490
Page count: 14
Related papers
50 in total
  • [31] Medical examination data prediction with missing information imputation based on recurrent neural networks
    Kim, Han-Gyu
    Jang, Gil-Jin
    Choi, Ho-Jin
    Lim, Myungeun
    Choi, Jaehun
    INTERNATIONAL JOURNAL OF DATA MINING AND BIOINFORMATICS, 2018, 19 (03) : 202 - 220
  • [32] Recurrent neural networks for fuzzy data
    Freitag, Steffen
    Graf, Wolfgang
    Kaliske, Michael
    INTEGRATED COMPUTER-AIDED ENGINEERING, 2011, 18 (03) : 265 - 280
  • [33] Analysis of missing data with artificial neural networks
    Pastor, JBN
    Vidal, JML
    PSICOTHEMA, 2000, 12 (03) : 503 - 510
  • [34] MisConv: Convolutional Neural Networks for Missing Data
    Przewiezlikowski, Marcin
    Smieja, Marek
    Struski, Lukasz
    Tabor, Jacek
    2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022), 2022, : 2917 - 2926
  • [35] Imputation of missing data with neural networks for classification
    Choudhury, Suvra Jyoti
    Pal, Nikhil R.
    KNOWLEDGE-BASED SYSTEMS, 2019, 182
  • [36] DeepZip: Lossless Data Compression using Recurrent Neural Networks
    Goyal, Mohit
    Tatwawadi, Kedar
    Chandak, Shubham
    Ochoa, Idoia
    2019 DATA COMPRESSION CONFERENCE (DCC), 2019, : 575 - 575
  • [37] Multimedia Data Modelling Using Multidimensional Recurrent Neural Networks
    He, Zhen
    Gao, Shaobing
    Xiao, Liang
    Liu, Daxue
    He, Hangen
    SYMMETRY-BASEL, 2018, 10 (09):
  • [38] Reconstruction of Cross-Sectional Missing Data Using Neural Networks
    Gheyas, Iffat A.
    Smith, Leslie S.
    ENGINEERING APPLICATIONS OF NEURAL NETWORKS, PROCEEDINGS, 2009, 43 : 28 - 34
  • [39] Spatio-temporal graph neural networks for missing data completion in traffic prediction
    Chen, Jiahui
    Yang, Lina
    Yang, Yi
    Peng, Ling
    Ge, Xingtong
    INTERNATIONAL JOURNAL OF GEOGRAPHICAL INFORMATION SCIENCE, 2024,
  • [40] A dynamic programming approach to missing data estimation using neural networks
    Nelwamondo, Fulufhelo V.
    Golding, Dan
    Marwala, Tshilidzi
    INFORMATION SCIENCES, 2013, 237 : 49 - 58