Pipelining the ranking techniques for microarray data classification: A case study

Cited by: 16
Authors
Dash, Rasmita [1]
Misra, Bijan Bihari [2]
Affiliations
[1] Siksha O Anusandhan Univ, Inst Tech Educ & Res, Dept Comp Sc & Informat Technol, Bhubaneswar 751030, Odisha, India
[2] Silicon Inst Technol, Dept Comp Sc & Engn, Bhubaneswar 751024, Odisha, India
Keywords
Microarray data; Feature selection; Feature ranking technique; Classification; Statistical test; DIFFERENTIALLY EXPRESSED GENES; FEATURE-SELECTION; PREDICTION; CANCER; ROBUST; OPTIMIZATION; REGRESSION; PROFILES;
DOI
10.1016/j.asoc.2016.07.006
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Identifying relevant genes from microarray data is needed in many applications. Different ranking techniques, each with its own evaluation criterion, are used for this purpose, and they usually assign different ranks to the same gene. As a result, different techniques identify different gene subsets, none of which is guaranteed to be the set of significant genes. To overcome this problem, this study suggests pipelining the ranking techniques. At each stage of the pipeline, a few of the lower-ranked features are eliminated, so that a relatively good subset of features is preserved at the end. The order in which the ranking techniques are applied in the pipeline is important, however, to ensure that the significant genes are preserved in the final subset. For this experimental study, twenty-four unique pipeline models are generated from four gene ranking strategies. These pipelines are tested on seven different microarray databases to find the pipeline best suited to the task. The gene subset obtained is then tested with four classifiers, and four performance metrics are evaluated. No single pipeline dominates the others in performance; therefore, a grading system is applied to the results of these pipelines to find a consistent model. The grading system's finding that one pipeline model is significant is also confirmed by the Nemenyi post-hoc hypothesis test. The performance of this pipeline model is compared with that of the four ranking techniques: although it is not always superior, it yields better results most of the time and can be suggested as a consistent model. However, it requires more computational time than the single ranking techniques. © 2016 Elsevier B.V. All rights reserved.
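The pipelining idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the four ranking strategies or the elimination rate, so the four scoring functions below (variance, mean difference, signal-to-noise ratio, absolute correlation), the `keep_fraction` parameter, and every function name are assumptions chosen only for illustration. What the sketch does share with the abstract is the structure: four rankers give 4! = 24 orderings, and each stage discards the lower-ranked features that survived the previous stage.

```python
from itertools import permutations

import numpy as np


# Four stand-in ranking criteria (assumed, not taken from the paper).
# Each returns one score per feature; higher means more relevant.
def variance_score(X, y):
    return X.var(axis=0)


def mean_diff_score(X, y):
    return np.abs(X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0))


def snr_score(X, y):
    a, b = X[y == 1], X[y == 0]
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / (a.std(axis=0) + b.std(axis=0) + 1e-12)


def corr_score(X, y):
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum()) + 1e-12
    return np.abs(Xc.T @ yc) / denom


RANKERS = {
    "variance": variance_score,
    "mean_diff": mean_diff_score,
    "snr": snr_score,
    "corr": corr_score,
}


def all_pipelines():
    """Every ordering of the four rankers: 4! = 24 pipelines."""
    return list(permutations(RANKERS))


def run_pipeline(X, y, order, keep_fraction=0.5):
    """Apply the rankers in the given order, dropping low-ranked features at each stage.

    keep_fraction is a hypothetical parameter: the share of surviving
    features retained after each stage.
    """
    kept = np.arange(X.shape[1])  # column indices still in play
    for name in order:
        scores = RANKERS[name](X[:, kept], y)
        n_keep = max(1, int(len(kept) * keep_fraction))
        top = np.argsort(scores)[::-1][:n_keep]  # positions of the best scores
        kept = kept[np.sort(top)]  # map back to original column indices
    return kept
```

Calling `run_pipeline(X, y, order)` for each `order` in `all_pipelines()` yields one candidate gene subset per pipeline; in the study, such subsets are then evaluated with classifiers and performance metrics, which this sketch leaves out.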
Pages: 298-316
Number of pages: 19
Related papers
50 in total
  • [41] Recursive ECOC for microarray data classification
    Tapia, E
    Serra, E
    González, JC
    MULTIPLE CLASSIFIER SYSTEMS, 2005, 3541 : 108 - 117
  • [42] The classification of cancer stage microarray data
    Chen, Chi-Kan
    COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, 2012, 108 (03) : 1070 - 1077
  • [43] A Biologically Verified Classification of Microarray Data
    Mondal, Ritwik
    Mahata, Bholanath
    Dasgupta, Srirupa
    2014 6TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND COMMUNICATION NETWORKS, 2014, : 686 - 690
  • [44] Learning Curves in Classification With Microarray Data
    Hess, Kenneth R.
    Wei, Caimiao
    SEMINARS IN ONCOLOGY, 2010, 37 (01) : 65 - 68
  • [45] Expectation Propagation for microarray data classification
    Hernandez-Lobato, Daniel
    Hernandez-Lobato, Jose Miguel
    Suarez, Alberto
    PATTERN RECOGNITION LETTERS, 2010, 31 (12) : 1618 - 1626
  • [46] A wavelet approach for classification of microarray data
    Prabakaran, S.
    Sahu, R.
    Verma, S.
    INTERNATIONAL JOURNAL OF WAVELETS MULTIRESOLUTION AND INFORMATION PROCESSING, 2008, 6 (03) : 375 - 389
  • [47] Comparative Study on Dimension Reduction Techniques for Cluster Analysis of Microarray Data
    Araujo, Daniel
    Doria Neto, Adriao
    Martins, Allan
    Melo, Jorge
    2011 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2011, : 1835 - 1842
  • [48] Empirical evaluation of data transformations and ranking statistics for microarray analysis
    Qin, LX
    Kerr, KF
    NUCLEIC ACIDS RESEARCH, 2004, 32 (18) : 5471 - 5479
  • [49] On Gene Ranking Using Replicated Microarray Time Course Data
    Tai, Yu Chuan
    Speed, Terence P.
    BIOMETRICS, 2009, 65 (01) : 40 - 51
  • [50] Theoretical and empirical analysis of filter ranking methods: Experimental study on benchmark DNA microarray data
    Ghosh, Kushal Kanti
    Begum, Shemim
    Sardar, Aritra
    Adhikary, Sukdev
    Ghosh, Manosij
    Kumar, Munish
    Sarkar, Ram
    EXPERT SYSTEMS WITH APPLICATIONS, 2021, 169