Cluster-based KNN Missing Value Imputation for DNA Microarray Data

Cited by: 0
Authors
Keerin, Phimmarin [1]
Kurutach, Werasak [1]
Boongoen, Tossapon [2]
Affiliations
[1] Mahanakorn Univ Technol, Fac Informat Sci & Technol, Bangkok, Thailand
[2] Royal Thai Air Force Acad, Dept Math & Comp Sci, Bangkok, Thailand
Keywords
missing value; imputation; microarray data; clustering; expression data; classification; cancer
DOI
Not available
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Gene expression data measured with microarrays often contain missing values. Left unaddressed, these can critically degrade the reliability of downstream analyses and medical applications. Moreover, many analysis methods require a complete data set, so further study of the microarray data may be impossible. This paper introduces a new methodology for imputing missing values in microarray data. The proposed algorithm, CKNN impute, extends k nearest neighbor imputation by incorporating local data clustering to improve quality and efficiency. Gene expression data is typically represented as a matrix whose rows and columns correspond to genes and experiments, respectively. CKNN begins by obtaining a complete dataset through the removal of rows with missing value(s). Then, k clusters and their corresponding centroids are obtained by applying a clustering technique to the complete dataset. The genes similar to a target gene (one with missing values) are those belonging to the cluster whose centroid is closest to the target. The target gene is then imputed by applying the k nearest neighbor method to these similar genes. Empirical evaluation on published gene expression datasets suggests that the proposed technique performs better than the classical k nearest neighbor method and its extension found in the literature.
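
The steps in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract names neither the clustering technique, the distance measure, nor how neighbor values are combined, so the sketch assumes plain k-means, Euclidean distance over the columns observed in the target gene, and an unweighted mean of the neighbors. The names `kmeans`, `cknn_impute`, `n_clusters`, and `n_neighbors` are hypothetical, and the toy matrix is made up for demonstration. The abstract uses "k" for both the number of clusters and the number of neighbors; the sketch keeps them as separate parameters.

```python
import math
import random


def _dist(a, b):
    """Euclidean distance over columns that are observed (not None) in both rows."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)
                         if x is not None and y is not None))


def _nearest(row, centroids):
    """Index of the centroid closest to `row`."""
    return min(range(len(centroids)), key=lambda c: _dist(row, centroids[c]))


def kmeans(rows, n_clusters, n_iter=20, seed=0):
    """Plain Lloyd's k-means on complete rows (assumed clustering technique)."""
    rng = random.Random(seed)
    centroids = [list(r) for r in rng.sample(rows, n_clusters)]
    for _ in range(n_iter):
        labels = [_nearest(r, centroids) for r in rows]
        for c in range(n_clusters):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:  # an emptied cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    # Final assignment so labels agree with the returned centroids.
    return centroids, [_nearest(r, centroids) for r in rows]


def cknn_impute(matrix, n_clusters=2, n_neighbors=2, seed=0):
    """Impute None entries: rows are genes, columns are experiments."""
    # Step 1: complete dataset = rows without missing values.
    complete = [r for r in matrix if None not in r]
    # Step 2: clusters and centroids from the complete dataset.
    centroids, labels = kmeans(complete, n_clusters, seed=seed)
    imputed = []
    for row in matrix:
        if None not in row:
            imputed.append(list(row))
            continue
        # Step 3: similar genes = members of the cluster with the closest centroid
        # (fall back to all complete rows if that cluster happens to be empty).
        nearest = _nearest(row, centroids)
        similar = [r for r, lab in zip(complete, labels) if lab == nearest] or complete
        # Step 4: k nearest neighbors among the similar genes only.
        neighbors = sorted(similar, key=lambda r: _dist(row, r))[:n_neighbors]
        imputed.append([
            v if v is not None else sum(nb[j] for nb in neighbors) / len(neighbors)
            for j, v in enumerate(row)
        ])
    return imputed


# Toy expression matrix (made-up values): two well-separated groups of genes.
data = [
    [1.0, 1.1, 0.9],
    [1.2, 1.0, 1.1],
    [0.9, 1.0, 1.0],
    [5.0, 5.1, 4.9],
    [5.2, 5.0, 5.1],
    [4.9, 5.0, 5.0],
    [1.0, None, 1.0],  # target gene, low-expression group
    [5.1, 5.0, None],  # target gene, high-expression group
]
result = cknn_impute(data, n_clusters=2, n_neighbors=2)
for row in result:
    print(row)
```

Restricting the neighbor search to one cluster is what distinguishes this from classical k nearest neighbor imputation, which would rank every complete gene against the target; the restricted search compares the target with fewer candidates.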
Pages: 445-450
Number of pages: 6