A cluster-directed framework for neighbour based imputation of missing value in microarray data

Cited by: 17
Authors
Keerin, Phimmarin [1]
Kurutach, Werasak [1]
Boongoen, Tossapon [2]
Affiliations
[1] Mahanakorn Univ Technol, Fac Informat Sci & Technol, Bangkok, Thailand
[2] Navaminda Kasatriyadhiraj Royal Air Force Acad, Dept Math & Comp Sci, Bangkok, Thailand
Keywords
missing value; imputation; gene expression data; clustering; regression; nearest neighbour; GENE-EXPRESSION DATA; CANCER
DOI
10.1504/IJDMB.2016.076535
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
The DNA microarray has been the most widely used functional genomics approach in bioinformatics. However, microarray data suffer from frequent missing values for various experimental and data-handling reasons. Leaving this problem unresolved may degrade the reliability of any downstream analysis. As such, missing value imputation has been recognised as an important pre-processing step, which can improve the quality of the data and its interpretation. Several techniques found in the literature have successfully exploited the characteristics of, and relations among, a set of genes closest to the one under examination. However, the selection of these so-called nearest neighbours is based simply on proximity between gene pairs, without taking structural or grouping information into account. In response, this paper proposes a novel cluster-directed framework (CFNI: Cluster-directed Framework for Neighbour-based Imputation), in which data clustering is uniquely used to guide the identification of nearest neighbours. This allows a more accurate imputed value to be derived. Not only does CFNI perform better than several benchmark methods on published microarray data sets, but it is also general enough that any neighbour-based imputation technique can be coupled with the proposed model. This has been successfully demonstrated with both single-pass and iterative models.
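The general idea of letting a clustering guide neighbour selection can be illustrated with a minimal sketch. This is not the authors' CFNI algorithm, whose details are not given in the abstract. It assumes a plain k-means clustering on a row-mean pre-filled matrix, followed by inverse-distance-weighted nearest-neighbour imputation restricted to genes in the same cluster. The function names `kmeans_labels` and `cluster_guided_knn_impute`, and the parameters `n_clusters` and `k`, are hypothetical choices made for illustration.

```python
import numpy as np

def kmeans_labels(X, n_clusters, n_iter=20, seed=0):
    """Plain Lloyd's k-means on a complete matrix; returns one label per row."""
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), size=n_clusters, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        dists = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for c in range(n_clusters):
            members = X[labels == c]
            if len(members):  # keep the old centre if a cluster empties
                centres[c] = members.mean(axis=0)
    return labels

def cluster_guided_knn_impute(data, n_clusters=2, k=3):
    """Impute NaNs in a genes-by-samples matrix using same-cluster neighbours.

    Assumption: every gene (row) has at least one observed value.
    """
    data = np.asarray(data, dtype=float)
    missing = np.isnan(data)
    # Provisional row-mean fill so that clustering can run on a complete matrix.
    row_means = np.nanmean(data, axis=1)
    filled = np.where(missing, row_means[:, None], data)
    labels = kmeans_labels(filled, n_clusters)

    result = data.copy()
    for i in np.flatnonzero(missing.any(axis=1)):
        observed = ~missing[i]
        # Candidate neighbours: genes in the same cluster, excluding gene i.
        candidates = np.flatnonzero(labels == labels[i])
        candidates = candidates[candidates != i]
        for j in np.flatnonzero(missing[i]):
            # Only candidates that actually observed sample j can donate a value.
            usable = candidates[~missing[candidates, j]]
            if len(usable) == 0:
                result[i, j] = row_means[i]  # fall back to the row mean
                continue
            # Distance over the columns observed in gene i; neighbours' own
            # gaps in those columns use the provisional fill.
            diff = filled[usable][:, observed] - data[i, observed]
            dist = np.sqrt((diff ** 2).mean(axis=1))
            nearest = np.argsort(dist)[:k]
            weights = 1.0 / (dist[nearest] + 1e-12)
            result[i, j] = np.average(data[usable[nearest], j], weights=weights)
    return result
```

Because the neighbour search is a separate step after clustering, a different neighbour-based estimator (for example, a local regression) could in principle be substituted for the weighted average without changing the cluster-restriction step.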
Pages: 165-193
Number of pages: 29
Related papers
50 items in total
  • [21] An efficient ensemble method for missing value imputation in microarray gene expression data
    Zhu, Xinshan
    Wang, Jiayu
    Sun, Biao
    Ren, Chao
    Yang, Ting
    Ding, Jie
    [J]. BMC BIOINFORMATICS, 2021, 22 (01)
  • [22] Improving missing value imputation of microarray data by using spot quality weights
    Peter Johansson
    Jari Häkkinen
    [J]. BMC Bioinformatics, 7
  • [23] Missing value imputation improves clustering and interpretation of gene expression microarray data
    Johannes Tuikkala
    Laura L Elo
    Olli S Nevalainen
    Tero Aittokallio
    [J]. BMC Bioinformatics, 9
  • [24] Smoothing Blemished Gene Expression Microarray Data via Missing Value Imputation
    Cai, Zhipeng
    Shi, Yi
    Song, Meng
    Goebel, Randy
    Lin, Guohui
    [J]. 2008 30TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY, VOLS 1-8, 2008, : 5688 - 5691
  • [25] Improving missing value imputation of microarray data by using spot quality weights
    Johansson, Peter
    Hakkinen, Jari
    [J]. BMC BIOINFORMATICS, 2006, 7 (1)
  • [26] Missing value estimation for microarray data through cluster analysis
    Soumen Kumar Pati
    Asit Kumar Das
    [J]. Knowledge and Information Systems, 2017, 52 : 709 - 750
  • [27] MICROARRAY MISSING DATA IMPUTATION USING REGRESSION
    Bayrak, Tuncay
    Ogul, Hasan
    [J]. 2017 13TH IASTED INTERNATIONAL CONFERENCE ON BIOMEDICAL ENGINEERING (BIOMED), 2017, : 68 - 73
  • [28] Missing value estimation for microarray data through cluster analysis
    Pati, Soumen Kumar
    Das, Asit Kumar
    [J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2017, 52 (03) : 709 - 750
  • [29] A hybrid imputation approach for microarray missing value estimation
    Huihui Li
    Changbo Zhao
    Fengfeng Shao
    Guo-Zheng Li
    Xiao Wang
    [J]. BMC Genomics, 16
  • [30] A hybrid imputation approach for microarray missing value estimation
    Li, Huihui
    Zhao, Changbo
    Shao, Fengfeng
    Li, Guo-Zheng
    Wang, Xiao
    [J]. BMC GENOMICS, 2015, 16