Pipelining the ranking techniques for microarray data classification: A case study

Cited by: 16
Authors
Dash, Rasmita [1]
Misra, Bijan Bihari [2]
Affiliations
[1] Siksha O Anusandhan Univ, Inst Tech Educ & Res, Dept Comp Sc & Informat Technol, Bhubaneswar 751030, Odisha, India
[2] Silicon Inst Technol, Dept Comp Sc & Engn, Bhubaneswar 751024, Odisha, India
Keywords
Microarray data; Feature selection; Feature ranking technique; Classification; Statistical test; DIFFERENTIALLY EXPRESSED GENES; FEATURE-SELECTION; PREDICTION; CANCER; ROBUST; OPTIMIZATION; REGRESSION; PROFILES
DOI
10.1016/j.asoc.2016.07.006
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Identifying relevant genes from microarray data is needed in many applications. Different ranking techniques, each with its own evaluation criterion, are used for this purpose, and they usually assign different ranks to the same gene. As a result, different techniques identify different gene subsets, which may not be the set of significant genes. To overcome this problem, this study suggests pipelining the ranking techniques. At each stage of the pipeline, a few of the lower-ranked features are eliminated, and at the end a relatively good subset of features is preserved. The order in which the ranking techniques are applied in the pipeline matters, however, because it determines whether the significant genes are preserved in the final subset. For this experimental study, twenty-four unique pipeline models are generated from four gene ranking strategies. These pipelines are tested on seven different microarray databases to find a suitable pipeline for the task. The resulting gene subset is then tested with four classifiers, and four performance metrics are evaluated. No single pipeline dominates the others in performance; therefore, a grading system is applied to the results of these pipelines to find a consistent model. The grading system's finding that one pipeline model is significant is also confirmed by the Nemenyi post-hoc hypothesis test. The performance of this pipeline model is compared with that of the four ranking techniques. Although it is not always superior, it yields better results in the majority of cases and can be suggested as a consistent model. However, it requires more computational time than a single ranking technique. (C) 2016 Elsevier B.V. All rights reserved.
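The pipelining idea in the abstract can be illustrated with a minimal sketch. It rests on several assumptions: the abstract does not name the four ranking strategies, so the scoring functions below (`mean_gap`, `variance`, `value_range`, `signal_to_noise`) are hypothetical stand-ins, and the `drop_fraction` parameter and the `run_pipeline` helper are likewise illustrative rather than the authors' method. The sketch only shows the structure: each stage re-ranks the surviving genes and discards the lowest-ranked ones, and four rankers give 4! = 24 possible stage orders.

```python
from itertools import permutations
from statistics import mean, pvariance


def _split(values, labels):
    """Split one gene's expression values by binary class label (0 or 1)."""
    a = [v for v, y in zip(values, labels) if y == 0]
    b = [v for v, y in zip(values, labels) if y == 1]
    return a, b


# Hypothetical scoring criteria (higher score = more relevant gene).
# These are placeholders, not the ranking techniques used in the paper.
def mean_gap(values, labels):
    a, b = _split(values, labels)
    return abs(mean(a) - mean(b))


def variance(values, labels):
    return pvariance(values)


def value_range(values, labels):
    return max(values) - min(values)


def signal_to_noise(values, labels):
    a, b = _split(values, labels)
    spread = pvariance(a) ** 0.5 + pvariance(b) ** 0.5
    return abs(mean(a) - mean(b)) / (spread + 1e-12)


RANKERS = {
    "mean_gap": mean_gap,
    "variance": variance,
    "value_range": value_range,
    "signal_to_noise": signal_to_noise,
}


def run_pipeline(data, labels, order, drop_fraction=0.25):
    """Apply the rankers in the given order, dropping low-ranked genes each stage.

    data maps a gene name to its list of expression values (one per sample);
    drop_fraction is an assumed per-stage elimination rate.
    """
    kept = list(data)
    for name in order:
        score = RANKERS[name]
        ranked = sorted(kept, key=lambda g: score(data[g], labels), reverse=True)
        n_drop = int(len(ranked) * drop_fraction)
        kept = ranked[: len(ranked) - n_drop]
    return kept


# Every ordering of the four rankers is a distinct pipeline model.
ALL_ORDERS = list(permutations(RANKERS))
```

Calling `run_pipeline(data, labels, order)` for each `order` in `ALL_ORDERS` yields one candidate gene subset per pipeline model; in the study these subsets would then be passed to classifiers and compared, a step this sketch leaves out.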
Pages: 298-316
Number of pages: 19