Missing Data Imputation Toolbox for MATLAB

Cited by: 60
Authors
Folch-Fortuny, Abel [1]
Arteaga, Francisco [2]
Ferrer, Alberto [1]
Affiliations
[1] Univ Politecn Valencia, Dept Estadist & Invest Operat Aplicadas & Calidad, Camino Vera S-N,Edificio 7A, E-46022 Valencia, Spain
[2] Univ Catolica Valencia San Vicente Martir, Dept Biostat & Invest, C Quevedo 2, Valencia 46001, Spain
Keywords
Missing data; Imputation; PCA model building; Regression
DOI
10.1016/j.chemolab.2016.03.019
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Here we introduce the Missing Data Imputation (MDI) Toolbox, a user-friendly graphical interface for dealing with missing values. This MATLAB toolbox imputes missing values that follow missing-completely-at-random patterns by exploiting the relationships among variables. To do so, principal component analysis (PCA) models are fitted iteratively to impute the missing data until convergence. The toolbox includes several methods that use PCA internally: trimmed scores regression (TSR), known data regression (KDR), KDR with principal component regression (KDR-PCR), KDR with partial least squares regression (KDR-PLS), projection to the model plane (PMP), iterative algorithm (IA), modified nonlinear iterative partial least squares regression algorithm (NIPALS) and data augmentation (DA). MDI Toolbox provides a general procedure to impute missing data, and thus can be used to infer PCA models with missing data, to estimate the covariance structure of incomplete data matrices, or to impute the missing values as a preprocessing step for other methodologies. © 2016 Elsevier B.V. All rights reserved.
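The idea of fitting a PCA model iteratively until the imputed values converge can be illustrated with a small sketch. This is not the MDI Toolbox code or its API, and it is written in plain Python rather than MATLAB. The function name `impute_pca_rank1`, its parameters, the one-component model, the column-mean starting guess and the power-iteration eigenvector solver are all assumptions chosen to keep the example short.

```python
import math


def impute_pca_rank1(X, tol=1e-8, max_iter=500):
    """Fill None entries of X (a list of rows) using a one-component PCA model.

    Hypothetical helper for illustration only; not part of MDI Toolbox.
    """
    n, m = len(X), len(X[0])
    missing = [(i, j) for i in range(n) for j in range(m) if X[i][j] is None]

    # Starting guess: mean of the observed values in each column.
    Z = [row[:] for row in X]
    for j in range(m):
        observed = [X[i][j] for i in range(n) if X[i][j] is not None]
        mean_j = sum(observed) / len(observed)
        for i in range(n):
            if Z[i][j] is None:
                Z[i][j] = mean_j

    for _ in range(max_iter):
        # Centre the currently completed matrix.
        mu = [sum(Z[i][j] for i in range(n)) / n for j in range(m)]
        C = [[Z[i][j] - mu[j] for j in range(m)] for i in range(n)]

        # Unnormalised covariance matrix S = C'C.
        S = [[sum(C[i][a] * C[i][b] for i in range(n)) for b in range(m)]
             for a in range(m)]

        # Leading loading vector p by power iteration (assumes the start
        # vector is not orthogonal to the leading eigenvector).
        p = [1.0 / (j + 1) for j in range(m)]
        for _ in range(200):
            q = [sum(S[a][b] * p[b] for b in range(m)) for a in range(m)]
            norm = math.sqrt(sum(v * v for v in q))
            if norm == 0.0:
                break
            p = [v / norm for v in q]

        # Replace each missing cell by its PCA reconstruction.
        change = 0.0
        for i, j in missing:
            score = sum(C[i][b] * p[b] for b in range(m))
            new_value = mu[j] + score * p[j]
            change = max(change, abs(new_value - Z[i][j]))
            Z[i][j] = new_value

        # Stop once the imputed values no longer move.
        if change < tol:
            break
    return Z


# Rows lie on the line (t, 2t + 1, 3 - t); one cell is marked missing.
X = [[-2.0, -3.0, 5.0],
     [-1.0, -1.0, 4.0],
     [0.0, 1.0, 3.0],
     [None, 3.0, 2.0],
     [2.0, 5.0, 1.0]]
Z = impute_pca_rank1(X)
print(Z[3][0])
```

Observed cells are never modified; only the cells marked `None` are updated on each pass. A practical implementation would choose the number of components from the data and would rely on a linear algebra library instead of hand-written power iteration.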
Pages: 93 - 100
Number of pages: 8
Related papers
50 in total
  • [31] Evaluating the Impact of Missing Data Imputation
    Pantanowitz, Adam
    Marwala, Tshilidzi
    [J]. ADVANCED DATA MINING AND APPLICATIONS, PROCEEDINGS, 2009, 5678 : 577 - 586
  • [32] Optimized parameters for missing data imputation
    Zhang, Shichao
    Qin, Yongsong
    Zhu, Xiaofeng
    Zhang, Jilian
    Zhang, Chengqi
    [J]. PRICAI 2006: TRENDS IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2006, 4099 : 1010 - 1016
  • [33] MISSING DATA, IMPUTATION AND REGRESSION TREES
    Loh, Wei-Yin
    Zhang, Qiong
    Zhang, Wenwen
    Zhou, Peigen
    [J]. STATISTICA SINICA, 2020, 30 (04) : 1697 - 1722
  • [34] Cooperative Clustering Missing Data Imputation
    Wan, Daoming
    Razavi-Far, Roozbeh
    Saif, Mehrdad
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2020 : 1039 - 1045
  • [35] Imputation of missing data in industrial databases
    Lakshminarayan, K
    Harp, SA
    Samad, T
    [J]. APPLIED INTELLIGENCE, 1999, 11 (03) : 259 - 275
  • [36] Multiple imputation for nonignorable missing data
    Jongho Im
    Soeun Kim
    [J]. Journal of the Korean Statistical Society, 2017, 46 : 583 - 592
  • [37] Missing phenotype data imputation in pedigree data analysis
    Fridley, B
    de Andrade, M
    [J]. GENETIC EPIDEMIOLOGY, 2005, 29 (03) : 249 - 249
  • [38] Missing phenotype data imputation in pedigree data analysis
    Fridley, Brooke L.
    de Andrade, Mariza
    [J]. GENETIC EPIDEMIOLOGY, 2008, 32 (01) : 52 - 60
  • [39] Missing Data Imputation with High-Dimensional Data
    Brini, Alberto
    van den Heuvel, Edwin R.
    [J]. AMERICAN STATISTICIAN, 2024, 78 (02) : 240 - 252
  • [40] Missing data imputation in multivariate data by evolutionary algorithms
    Figueroa Garcia, Juan C.
    Kalenatic, Dusko
    Lopez Bello, Cesar Amilcar
    [J]. COMPUTERS IN HUMAN BEHAVIOR, 2011, 27 (05) : 1468 - 1474