A cluster-directed framework for neighbour based imputation of missing value in microarray data

Cited by: 17
Authors
Keerin, Phimmarin [1]
Kurutach, Werasak [1]
Boongoen, Tossapon [2]
Affiliations
[1] Mahanakorn Univ Technol, Fac Informat Sci & Technol, Bangkok, Thailand
[2] Navaminda Kasatriyadhiraj Royal Air Force Acad, Dept Math & Comp Sci, Bangkok, Thailand
Keywords
missing value; imputation; gene expression data; clustering; regression; nearest neighbour; GENE-EXPRESSION DATA; CANCER;
DOI
10.1504/IJDMB.2016.076535
Chinese Library Classification (CLC) number
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
The DNA microarray has been the most widely used functional genomics approach in bioinformatics. However, microarray data frequently contain missing values, for various experimental and data-handling reasons. Leaving this problem unaddressed may degrade the reliability of any downstream analysis. As such, missing value imputation has been recognised as an important pre-processing step, which can improve the quality of the data and its interpretation. Several techniques found in the literature have successfully exploited the characteristics of, and relations among, the set of genes closest to the one under examination. However, the selection of these so-called nearest neighbours is based simply on proximity between gene pairs, without taking structural or grouping information into account. In response, this paper proposes a novel cluster-directed framework (CFNI: Cluster-directed Framework for Neighbour-based Imputation), in which data clustering is used to guide the identification of nearest neighbours. This allows a more accurate imputed value to be derived. Not only does it perform better than several benchmark methods on published microarray data sets; it is also general enough that any neighbour-based imputation technique can be coupled with the proposed model. This has been successfully demonstrated with both single-pass and iterative models.
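The general idea described in the abstract, restricting the nearest-neighbour search to genes that fall in the same cluster as the gene being imputed, can be illustrated with a minimal sketch. This is not the CFNI procedure as published: the abstract does not specify the clustering algorithm, the initial fill, or the weighting scheme, so the plain k-means, the row-mean pre-fill, the inverse-distance weights, and all function and parameter names (`kmeans`, `cluster_directed_knn_impute`, `n_clusters`, `k`) are assumptions made purely for illustration. The sketch also assumes a genes-by-samples matrix with at least two genes, where every gene has at least one observed value.

```python
import numpy as np


def kmeans(X, n_clusters, n_iter=50, seed=0):
    """Plain Lloyd's k-means (an assumed choice); returns one label per row."""
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), size=n_clusters, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Squared Euclidean distance from every row to every centre.
        dists = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for c in range(n_clusters):
            members = X[labels == c]
            if len(members):  # keep the old centre if a cluster empties
                centres[c] = members.mean(axis=0)
    return labels


def cluster_directed_knn_impute(data, n_clusters=2, k=3, seed=0):
    """Fill NaNs in a genes-by-samples matrix from same-cluster neighbours."""
    data = np.asarray(data, dtype=float)
    missing = np.isnan(data)

    # Step 1 (assumption): crude row-mean fill so that clustering can run.
    row_means = np.nanmean(data, axis=1, keepdims=True)
    filled = np.where(missing, row_means, data)

    # Step 2: cluster the genes on the pre-filled matrix.
    labels = kmeans(filled, n_clusters, seed=seed)

    result = data.copy()
    for g in np.flatnonzero(missing.any(axis=1)):
        observed = ~missing[g]

        # Step 3: candidate neighbours are the other genes in g's cluster.
        cand = np.flatnonzero(labels == labels[g])
        cand = cand[cand != g]
        if len(cand) == 0:  # singleton cluster: fall back to all other genes
            cand = np.delete(np.arange(len(data)), g)

        # Step 4: rank candidates by RMS distance over g's observed samples.
        diff = filled[cand][:, observed] - data[g, observed]
        dist = np.sqrt((diff ** 2).mean(axis=1))
        order = np.argsort(dist)[:k]
        nearest = cand[order]
        weights = 1.0 / (dist[order] + 1e-9)  # inverse-distance weights

        # Step 5: weighted average of the neighbours' values per missing entry.
        for j in np.flatnonzero(missing[g]):
            result[g, j] = np.average(filled[nearest, j], weights=weights)
    return result
```

Because neighbour selection is isolated in steps 3 and 4, the final averaging step could in principle be swapped for another neighbour-based estimator (for example a local least-squares fit), which is the kind of coupling the abstract alludes to.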
Pages: 165-193
Number of pages: 29
Related papers
50 in total
  • [1] Cluster-based KNN Missing Value Imputation for DNA Microarray Data
    Keerin, Phimmarin
    Kurutach, Werasak
    Boongoen, Tossapon
    [J]. PROCEEDINGS 2012 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2012, : 445 - 450
  • [2] An Improvement of Missing Value Imputation in DNA Microarray Data Using Cluster-based LLS Method
    Keerin, Phimmarin
    Kurutach, Werasak
    Boongoen, Tossapon
    [J]. 2013 13TH INTERNATIONAL SYMPOSIUM ON COMMUNICATIONS AND INFORMATION TECHNOLOGIES (ISCIT): COMMUNICATION AND INFORMATION TECHNOLOGY FOR NEW LIFE STYLE BEYOND THE CLOUD, 2013, : 559 - 564
  • [3] Microarray missing data imputation based on a set theoretic framework and biological constraints
    Gan, Xiangchao
    Liew, Alan Wee-Chung
    Yan, Hong
    [J]. 18TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 3, PROCEEDINGS, 2006, : 842 - +
  • [4] Microarray missing data imputation based on a set theoretic framework and biological knowledge
    Gan, XC
    Liew, AWC
    Yan, H
    [J]. NUCLEIC ACIDS RESEARCH, 2006, 34 (05) : 1608 - 1619
  • [5] Efficient technique of microarray missing data imputation using clustering and weighted nearest neighbour
    Aditya Dubey
    Akhtar Rasool
    [J]. Scientific Reports, 11
  • [6] Efficient technique of microarray missing data imputation using clustering and weighted nearest neighbour
    Dubey, Aditya
    Rasool, Akhtar
    [J]. SCIENTIFIC REPORTS, 2021, 11 (01)
  • [7] Collateral missing value imputation: a new robust missing value estimation algorithm for microarray data
    Sehgal, MSB
    Gondal, I
    Dooley, LS
    [J]. BIOINFORMATICS, 2005, 21 (10) : 2417 - 2423
  • [8] KNN-DTW Based Missing Value Imputation for Microarray Time Series Data
    Hsu, Hui-Huang
    Yang, Andy C.
    Lu, Ming-Da
    [J]. JOURNAL OF COMPUTERS, 2011, 6 (03) : 418 - 425
  • [9] An Efficient Technique for Missing value Imputation in Microarray Gene Expression Data
    Valarmathie, P.
    Dinakaran, K.
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON COMPUTER COMMUNICATION AND SYSTEMS (ICCCS'14), 2014, : 73 - 80
  • [10] A Review on Missing Value Imputation Algorithms for Microarray Gene Expression Data
    Moorthy, Kohbalan
    Mohamad, Mohd Saberi
    Deris, Safaai
    [J]. CURRENT BIOINFORMATICS, 2014, 9 (01) : 18 - 22