Gaussian processes for missing value imputation

Cited by: 10
Authors
Jafrasteh, Bahram [1]
Hernandez-Lobato, Daniel [2]
Lubian-Lopez, Simon Pedro
Benavente-Fernandez, Isabel [1, 3, 4]
Affiliations
[1] Puerta Mar Univ, Biomed Res & Innovat Inst, Cadiz INiB Res Unit, Cadiz, Spain
[2] Univ Autonoma Madrid, Comp Sci Dept, Madrid, Spain
[3] Puerta Mar Univ Hosp, Dept Pediat, Div Neonatol, Cadiz, Spain
[4] Univ Cadiz, Med Sch, Dept Child & Mother Hlth & Radiol, Area Pediat, Cadiz, Spain
Keywords
Missing values; Gaussian process; Deep learning; Deep Gaussian processes; Variational inference
DOI
10.1016/j.knosys.2023.110603
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
A missing value indicates that a particular attribute of an instance of a learning problem is not recorded. Missing values are very common in real-life datasets, yet most machine learning methods cannot handle them, so they must be imputed before training. Gaussian processes (GPs) are non-parametric models with accurate uncertainty estimates that, combined with sparse approximations and stochastic variational inference, scale to large data sets. Sparse GPs (SGPs) can be used to obtain a predictive distribution for missing values. We present a hierarchical composition of sparse GPs that predicts the missing values in each dimension using the observed values from the other dimensions. Importantly, we consider that the input attributes of each sparse GP used for prediction may also have missing values. The missing values in those input attributes are replaced by the predictions of the previous sparse GPs in the hierarchy. We call our approach missing GP (MGP). MGP can impute all the missing values in the data. It outputs a predictive distribution for each missing value, which is then used in the imputation of other missing values. We evaluate MGP on one private clinical data set and on four UCI datasets with different percentages of missing values. Furthermore, we compare the performance of MGP with that of other state-of-the-art methods for imputing missing values, including variants based on sparse GPs and deep GPs. Our results show that the performance of MGP is significantly better. © 2023 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
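The chained imputation idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' MGP implementation: it swaps in exact GP regression with a fixed RBF kernel for the sparse variational GPs, fits each GP separately rather than training the hierarchy jointly, and passes only predictive means (not full distributions) forward. The function names (`rbf_kernel`, `gp_predict`, `chained_gp_impute`), the kernel hyperparameters, and the column ordering are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between the rows of a and b."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * sq_dist / lengthscale**2)

def gp_predict(x_train, y_train, x_test, noise=1e-2):
    """Exact GP regression (zero prior mean): predictive mean and variance."""
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_st = rbf_kernel(x_test, x_train)
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_st @ alpha
    v = np.linalg.solve(chol, k_st.T)
    var = rbf_kernel(x_test, x_test).diagonal() - (v**2).sum(0) + noise
    return mean, var

def chained_gp_impute(data):
    """Impute NaNs one column at a time, feeding earlier predictions forward."""
    missing = np.isnan(data)
    col_mean = np.nanmean(data, axis=0)
    col_std = np.nanstd(data, axis=0) + 1e-8
    z = (data - col_mean) / col_std        # standardise; NaNs stay NaN
    filled = np.where(missing, 0.0, z)     # initial guess: the column mean
    variances = np.zeros_like(filled)
    # Assumption of this sketch: visit columns from fewest to most missing.
    for d in np.argsort(missing.sum(axis=0)):
        rows = missing[:, d]
        if not rows.any():
            continue
        others = np.delete(np.arange(data.shape[1]), d)
        x = filled[:, others]              # inputs may hold earlier imputations
        mean, var = gp_predict(x[~rows], filled[~rows, d], x[rows])
        filled[rows, d] = mean
        variances[rows, d] = var
    # Map back to the original scale of each column.
    return filled * col_std + col_mean, variances * col_std**2
```

Calling `chained_gp_impute(data)` on a 2-D float array with `np.nan` marking missing entries returns the completed array and a same-shaped array of predictive variances (zero where a value was observed). Exact GP inference costs cubic time in the number of rows, which is the scaling problem that sparse approximations are meant to address.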
Pages: 12