Collateral missing value imputation: a new robust missing value estimation algorithm for microarray data

Cited by: 76
Authors
Sehgal, MSB [1 ]
Gondal, I [1 ]
Dooley, LS [1 ]
Affiliation
[1] Monash Univ, Gippsland Sch Comp & Informat Technol, Clayton, Vic 3842, Australia
Keywords
DOI
10.1093/bioinformatics/bti345
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: Microarray data are used in a range of application areas in biology, although they often contain considerable numbers of missing values. These missing values can significantly affect subsequent statistical analysis and machine learning algorithms, so there is strong motivation to estimate them as accurately as possible before applying these algorithms. While many imputation algorithms have been proposed, more robust techniques are needed so that further analysis of biological data can be undertaken accurately. This paper presents an innovative missing value imputation algorithm called collateral missing value estimation (CMVE), which uses multiple covariance-based imputation matrices for the final prediction of missing values. The matrices are computed and optimized using least squares regression and linear programming methods.
Results: The new CMVE algorithm was compared with existing estimation techniques including Bayesian principal component analysis imputation (BPCA), least squares impute (LSImpute) and K-nearest neighbour (KNN). All methods were rigorously tested on estimating missing values in three separate non-time series (ovarian cancer based) datasets and one time series (yeast sporulation) dataset. Each method was quantitatively analyzed using the normalized root mean square (NRMS) error measure, covering a wide range of randomly introduced missing value probabilities from 0.01 to 0.2. Experiments were also undertaken on the yeast dataset, which comprised 1.7% actual missing values, to test the hypothesis that CMVE performs better not only for randomly occurring missing values but also for a real distribution of missing values. The results confirmed that CMVE consistently demonstrated superior and robust estimation of missing values compared with the other methods for both types of data (time series and non-time series), for the same order of computational complexity. A concise theoretical framework has also been formulated to validate the improved performance of the CMVE algorithm.
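The evaluation protocol described in the abstract (randomly removing entries at a given probability, imputing them, and scoring the estimates with an NRMS error) might be sketched as follows. This is a minimal illustration, not the authors' code: the abstract does not state the exact NRMS normalization, so the ratio of RMS error to RMS of the true values is an assumption, and `row_mean_impute` is a hypothetical baseline imputer standing in for CMVE, BPCA, LSImpute or KNN. All function names and the synthetic data are illustrative.

```python
import numpy as np


def introduce_missing(data, prob, rng):
    """Set each entry to NaN independently with probability `prob`.

    Returns the corrupted copy and the boolean mask of removed entries.
    """
    mask = rng.random(data.shape) < prob
    corrupted = data.copy()
    corrupted[mask] = np.nan
    return corrupted, mask


def row_mean_impute(corrupted):
    """Hypothetical baseline (NOT CMVE): fill gaps with the row (gene) mean."""
    filled = corrupted.copy()
    row_means = np.nanmean(corrupted, axis=1)
    # Fall back to the global mean for rows with no observed values.
    row_means = np.where(np.isnan(row_means), np.nanmean(corrupted), row_means)
    rows, cols = np.where(np.isnan(corrupted))
    filled[rows, cols] = row_means[rows]
    return filled


def nrms_error(true, estimated, mask):
    """Assumed NRMS definition: RMS of the error over RMS of the true values,
    both taken over the entries that were removed."""
    diff = estimated[mask] - true[mask]
    return np.sqrt(np.mean(diff ** 2)) / np.sqrt(np.mean(true[mask] ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for an expression matrix (genes x samples).
    data = rng.normal(size=(200, 10))
    for prob in (0.01, 0.05, 0.1, 0.2):
        corrupted, mask = introduce_missing(data, prob, rng)
        estimate = row_mean_impute(corrupted)
        print(f"p={prob:.2f}  NRMS={nrms_error(data, estimate, mask):.3f}")
```

Swapping `row_mean_impute` for another imputer while keeping the same masks would give a like-for-like comparison across methods, which is the kind of comparison the abstract reports.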
Pages: 2417-2423
Number of pages: 7
Related papers
50 in total
  • [1] Collateral missing value estimation: Robust missing value estimation for consequent microarray data processing
    Sehgal, MSB
    Gondal, I
    Dooley, L
    AI 2005: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2005, 3809 : 274 - 283
  • [2] Triple Imputation for Microarray Missing Value Estimation
    He, Chong
    Li, Hui-Hui
    Zhao, Changbo
    Li, Guo-Zheng
    Zhang, Wei
    PROCEEDINGS 2015 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE, 2015, : 208 - 213
  • [3] An accurate and robust missing value estimation for Microarray data: least absolute deviation imputation
    Cao, Yi
    Poh, Kim Leng
    ICMLA 2006: 5TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, PROCEEDINGS, 2006, : 157 - +
  • [4] An algorithm for missing value estimation for DNA microarray data
    Friedland, Shmuel
    Niknejad, Amir
    Kaveh, Mostafa
    Zare, Hossein
    2006 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-13, 2006, : 2340 - 2343
  • [5] A hybrid imputation approach for microarray missing value estimation
    Huihui Li
    Changbo Zhao
    Fengfeng Shao
    Guo-Zheng Li
    Xiao Wang
    BMC Genomics, 16
  • [6] A hybrid imputation approach for microarray missing value estimation
    Li, Huihui
    Zhao, Changbo
    Shao, Fengfeng
    Li, Guo-Zheng
    Wang, Xiao
    BMC GENOMICS, 2015, 16
  • [7] Semi-supervised Imputation for Microarray Missing Value Estimation
    Li, Hui-Hui
    Shao, Feng-Feng
    Li, Guo-Zheng
    2014 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2014,
  • [8] Integrative missing value estimation for microarray data
    Jianjun Hu
    Haifeng Li
    Michael S Waterman
    Xianghong Jasmine Zhou
    BMC Bioinformatics, 7
  • [9] A Review On Missing Value Estimation Using Imputation Algorithm
    Armina, Roslan
    Zain, Azlan Mohd
    Ali, Nor Azizah
    Sallehuddin, Roselina
    6TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND COMPUTATIONAL MATHEMATICS (ICCSCM 2017), 2017, 892
  • [10] Integrative missing value estimation for microarray data
    Hu, Jianjun
    Li, Haifeng
    Waterman, Michael S.
    Zhou, Xianghong Jasmine
    BMC BIOINFORMATICS, 2006, 7 (1)