Shrinkage regression-based methods for microarray missing value imputation

Cited by: 10
Authors
Wang, Hsiuying [1]
Chiu, Chia-Chun [2]
Wu, Yi-Ching [1]
Wu, Wei-Sheng [2]
Affiliations
[1] Natl Chiao Tung Univ, Inst Stat, Hsinchu 300, Taiwan
[2] Natl Cheng Kung Univ, Dept Elect Engn, Tainan 701, Taiwan
Source
BMC SYSTEMS BIOLOGY | 2013 / Vol. 7
Keywords
CELL-CYCLE TRANSCRIPTION; GENE-EXPRESSION PATTERNS; REGULATORY MODULES; IDENTIFICATION; LYMPHOMA;
DOI
10.1186/1752-0509-7-S6-S11
Chinese Library Classification
Q [Biological Sciences];
Subject classification codes
07 ; 0710 ; 09 ;
Abstract
Background: Missing values are common in microarray data, which usually contain more than 5% missing values, with up to 90% of genes affected. Inaccurate missing value estimation reduces the power of downstream microarray data analyses. Many types of methods have been developed to estimate missing values. Among them, regression-based methods are very popular and have been shown to outperform other types of methods on many microarray test datasets.
Results: To further improve the performance of regression-based methods, we propose shrinkage regression-based methods. Our methods take advantage of the correlation structure in microarray data and select genes similar to the target gene by Pearson correlation coefficient. In addition, our methods incorporate the least squares principle, use a shrinkage estimation approach to adjust the coefficients of the regression model, and then use the adjusted coefficients to estimate missing values. Simulation results show that the proposed methods provide more accurate missing value estimation on six microarray test datasets than existing regression-based methods.
Conclusions: Imputation of missing values is a very important part of microarray data analysis because most downstream analyses require a complete dataset. Exploring accurate and efficient methods for estimating missing values has therefore become essential. Because the proposed shrinkage regression-based methods provide accurate missing value estimation, they are competitive alternatives to existing regression-based methods.
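The pipeline summarized in the abstract (select similar genes by Pearson correlation, fit a least squares regression, shrink the coefficients, then predict the missing entry) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `impute_one`, the parameters `k` and `shrink`, and the use of a single user-supplied shrinkage factor are all hypothetical, since the abstract does not specify the shrinkage estimator. The sketch also assumes that every gene other than the target is fully observed.

```python
import numpy as np

def impute_one(X, gene, col, k=5, shrink=1.0):
    """Estimate the missing entry X[gene, col] of a genes-by-samples matrix.

    Assumptions (not taken from the paper): all other genes are complete,
    and `shrink` is a fixed factor in [0, 1] standing in for the paper's
    shrinkage estimator, which the abstract does not describe.
    """
    # Columns where the target gene is observed.
    obs = [c for c in range(X.shape[1]) if c != col]
    w = X[gene, obs]

    # Rank candidate genes by absolute Pearson correlation with the target.
    others = [g for g in range(X.shape[0]) if g != gene]
    r = np.array([np.corrcoef(w, X[g, obs])[0, 1] for g in others])
    top = [others[i] for i in np.argsort(-np.abs(r))[:k]]

    # Least squares fit of the target gene on the k selected genes.
    A = X[np.ix_(top, obs)]          # shape: k x (n_samples - 1)
    coef, *_ = np.linalg.lstsq(A.T, w, rcond=None)

    # Shrink the coefficients toward zero, then predict the missing value.
    b = X[top, col]
    return float(b @ (shrink * coef))

# Hypothetical usage on synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 8))
X[0] = 2 * X[1]                      # gene 0 is perfectly correlated with gene 1
X[0, 3] = np.nan                     # the entry to impute
estimate = impute_one(X, gene=0, col=3, k=2, shrink=0.9)
```

With `shrink=1.0` the sketch reduces to plain least squares imputation on the selected genes; values below 1 pull the prediction toward zero, which is the simplest possible stand-in for a data-driven shrinkage rule.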
Pages: 12
Related papers
50 in total
  • [31] Comparison of Imputation Methods Based on Missing Value Detection for Multidimensional Feature Data
    Qiao, Fei
    Zhai, Xiaodong
    Wang, Qiaoling
    [J]. Tongji Daxue Xuebao/Journal of Tongji University, 2023, 51 (12) : 1972 - 1982
  • [32] Missing value estimation for DNA microarray gene expression data by Support Vector Regression imputation and orthogonal coding scheme
    Xian Wang
    Ao Li
    Zhaohui Jiang
    Huanqing Feng
    [J]. BMC Bioinformatics, 7
  • [33] Missing value estimation for DNA microarray gene expression data by Support Vector Regression imputation and orthogonal coding scheme
    Wang, X
    Li, A
    Jiang, ZH
    Feng, HQ
    [J]. BMC BIOINFORMATICS, 2006, 7 (1)
  • [34] Genetic Programming-Based Selection of Imputation Methods in Symbolic Regression with Missing Values
    Al-Helali, Baligh
    Chen, Qi
    Xue, Bing
    Zhang, Mengjie
    [J]. AI 2020: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 12576 : 163 - 175
  • [35] Evaluating model-based imputation methods for missing covariates in regression models with interactions
    Kim, Soeun
    Sugar, Catherine A.
    Belin, Thomas R.
    [J]. STATISTICS IN MEDICINE, 2015, 34 (11) : 1876 - 1888
  • [36] Soft Clustering Based Missing Value Imputation
    Raja, P. S.
    Thangavel, K.
    [J]. DIGITAL CONNECTIVITY - SOCIAL IMPACT, 2016, 679 : 119 - 133
  • [37] Regression-Based Approach to Test Missing Data Mechanisms
    Rouzinov, Serguei
    Berchtold, Andre
    [J]. DATA, 2022, 7 (02)
  • [38] REGRESSION-BASED FORECAST COMBINATION METHODS
    Wei, Xiaoqiao
    [J]. ROMANIAN JOURNAL OF ECONOMIC FORECASTING, 2009, 12 (04) : 5 - 18
  • [39] Sequential local least squares imputation estimating missing value of microarray data
    Zhang, Xiaobai
    Song, Xiaofeng
    Wang, Huinan
    Zhan, Huanping
    [J]. COMPUTERS IN BIOLOGY AND MEDICINE, 2008, 38 (10) : 1112 - 1120
  • [40] An efficient ensemble method for missing value imputation in microarray gene expression data
    Xinshan Zhu
    Jiayu Wang
    Biao Sun
    Chao Ren
    Ting Yang
    Jie Ding
    [J]. BMC Bioinformatics, 22