Gaussian processes for missing value imputation

Cited by: 10

Authors:

Jafrasteh, Bahram [1]

Hernandez-Lobato, Daniel [2]

Lubian-Lopez, Simon Pedro

Benavente-Fernandez, Isabel [1,3,4]

Affiliations:

[1] Puerta Mar Univ, Biomed Res & Innovat Inst, Cadiz INiB Res Unit, Cadiz, Spain

[2] Univ Autonoma Madrid, Comp Sci Dept, Madrid, Spain

[3] Puerta Mar Univ Hosp, Dept Pediat, Div Neonatol, Cadiz, Spain

[4] Univ Cadiz, Med Sch, Dept Child & Mother Hlth & Radiol, Area Pediat, Cadiz, Spain

Source:

KNOWLEDGE-BASED SYSTEMS | 2023 / Volume 273

Keywords:

Missing values; Gaussian process; Deep learning; Deep Gaussian processes; Variational inference;

DOI:

10.1016/j.knosys.2023.110603

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

A missing value indicates that a particular attribute of an instance of a learning problem is not recorded. Missing values are very common in many real-life datasets. Despite this, most machine learning methods cannot handle missing values, so they should be imputed before training. Gaussian Processes (GPs) are non-parametric models with accurate uncertainty estimates that, combined with sparse approximations and stochastic variational inference, scale to large data sets. Sparse GPs (SGPs) can be used to get a predictive distribution for missing values. We present a hierarchical composition of sparse GPs that is used to predict the missing values at each dimension using the observed values from the other dimensions. Importantly, we consider that the input attributes to each sparse GP used for prediction may also have missing values. The missing values in those input attributes are replaced by the predictions of the previous sparse GPs in the hierarchy. We call our approach missing GP (MGP). MGP can impute all observed missing values. It outputs a predictive distribution for each missing value that is then used in the imputation of other missing values. We evaluate MGP on one private clinical data set and on four UCI datasets with different percentages of missing values. Furthermore, we compare the performance of MGP with other state-of-the-art methods for imputing missing values, including variants based on sparse GPs and deep GPs. Our results show that the performance of MGP is significantly better than that of these methods. © 2023 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).


Pages: 12

50 items in total

[41] Neighborhood-aware autoencoder for missing value imputation
Aidos, Helena
Tomas, Pedro
28TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2020), 2021, : 1542 - 1546
[42] A New Method to Missing Value Imputation for Immunosignature Data
Koshechkin, A. A.
Andryushchenko, V. S.
Zamyatin, A. V.
SOVREMENNYE TEHNOLOGII V MEDICINE, 2019, 11 (02) : 19 - 23
[43] A robust missing value imputation method for noisy data
Zhu, Bing
He, Changzheng
Liatsis, Panos
APPLIED INTELLIGENCE, 2012, 36 (01) : 61 - 74
[44] Missing Value Imputation Methods for Electronic Health Records
Psychogyios, Konstantinos
Ilias, Loukas
Ntanos, Christos
Askounis, Dimitris
IEEE ACCESS, 2023, 11 : 21562 - 21574
[45] Block Tensor Train Decomposition for Missing Value Imputation
Lee, Namgil
2018 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2018, : 1338 - 1343
[46] A class center based approach for missing value imputation
Tsai, Chih-Fong
Li, Miao-Ling
Lin, Wei-Chao
KNOWLEDGE-BASED SYSTEMS, 2018, 151 : 124 - 135
[47] Missing value imputation on missing completely at random data using multilayer perceptrons
Silva-Ramirez, Esther-Lydia
Pino-Mejias, Rafael
Lopez-Coello, Manuel
Cubiles-de-la-Vega, Maria-Dolores
NEURAL NETWORKS, 2011, 24 (01) : 121 - 129
[48] Distributed personalized imputation based on Gaussian mixture model for missing data
Chen S.
Liu Y.
Neural Computing and Applications, 2024, 36 (23) : 14237 - 14250
[49] Missing value imputation in a data matrix using the regularised singular value decomposition
Arciniegas-Alarcon, Sergio
Garcia-Pena, Marisol
Krzanowski, Wojtek J.
Rengifo, Camilo
METHODSX, 2023, 11
[50] Fuzzy rough assisted missing value imputation and feature selection
Jain, Pankhuri
Tiwari, Anoop
Som, Tanmoy
NEURAL COMPUTING & APPLICATIONS, 2023, 35 (03) : 2773 - 2793
