Pipelining the ranking techniques for microarray data classification: A case study

Cited by: 16

Authors:

Dash, Rasmita [1]

Misra, Bijan Bihari [2]

Affiliations:

[1] Siksha O Anusandhan Univ, Inst Tech Educ & Res, Dept Comp Sc & Informat Technol, Bhubaneswar 751030, Odisha, India

[2] Silicon Inst Technol, Dept Comp Sc & Engn, Bhubaneswar 751024, Odisha, India

Source:

APPLIED SOFT COMPUTING | 2016 / Volume 48

Keywords:

Microarray data; Feature selection; Feature ranking technique; Classification; Statistical test; DIFFERENTIALLY EXPRESSED GENES; FEATURE-SELECTION; PREDICTION; CANCER; ROBUST; OPTIMIZATION; REGRESSION; PROFILES;

DOI:

10.1016/j.asoc.2016.07.006

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Identifying relevant genes from microarray data is needed in many applications. Different ranking techniques with different evaluation criteria are used for this purpose, and they usually assign different ranks to the same gene. As a result, different techniques identify different gene subsets, which may not be the set of significant genes. To overcome this problem, this study suggests pipelining the ranking techniques. At each stage of the pipeline, a few of the lower-ranked features are eliminated, and at the end a relatively good subset of features is preserved. However, the order in which the ranking techniques are used in the pipeline is important to ensure that the significant genes are preserved in the final subset. For this experimental study, twenty-four unique pipeline models are generated from four gene ranking strategies. These pipelines are tested on seven different microarray databases to find a suitable pipeline for the task. The gene subset obtained is then tested with four classifiers, and four performance metrics are evaluated. No single pipeline dominates the others in performance; therefore, a grading system is applied to the results to find a consistent model. The grading system's finding that one pipeline model is significant is also confirmed by the Nemenyi post-hoc hypothesis test. The performance of this pipeline model is compared with the four ranking techniques: although it is not always superior, it yields better results the majority of the time and can be suggested as a consistent model. However, it requires more computational time than the single ranking techniques. (C) 2016 Elsevier B.V. All rights reserved.
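The pipelining idea in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the abstract does not name the four ranking techniques or the fraction of features eliminated per stage, so the scoring functions (`mean_gap`, `variance`, `signal_to_noise`, `value_range`), the `run_pipeline` helper, and the `drop_fraction` parameter are hypothetical stand-ins, not the authors' method. The sketch only shows how four rankers yield 4! = 24 ordered pipelines and how each stage discards the lowest-ranked features.

```python
from itertools import permutations
from statistics import mean, pvariance

# Toy scoring functions (assumptions; the abstract does not name the four
# ranking techniques). Each maps one feature column plus binary class labels
# to a score, where a higher score means a more relevant feature.
def _split(values, labels):
    a = [v for v, y in zip(values, labels) if y == 0]
    b = [v for v, y in zip(values, labels) if y == 1]
    return a, b

def mean_gap(values, labels):
    a, b = _split(values, labels)
    return abs(mean(a) - mean(b))

def variance(values, labels):
    return pvariance(values)

def signal_to_noise(values, labels):
    a, b = _split(values, labels)
    spread = pvariance(a) ** 0.5 + pvariance(b) ** 0.5
    return abs(mean(a) - mean(b)) / (spread + 1e-12)

def value_range(values, labels):
    return max(values) - min(values)

RANKERS = {
    "mean_gap": mean_gap,
    "variance": variance,
    "signal_to_noise": signal_to_noise,
    "value_range": value_range,
}

def run_pipeline(X, y, order, drop_fraction=0.25):
    """Apply the rankers in `order`; each stage drops the lowest-ranked features.

    X is a list of samples (rows) and y a list of 0/1 labels. The
    drop_fraction value is a hypothetical choice, not taken from the paper.
    """
    kept = list(range(len(X[0])))
    for name in order:
        score = RANKERS[name]
        ranked = sorted(
            kept,
            key=lambda j: score([row[j] for row in X], y),
            reverse=True,
        )
        n_keep = max(1, len(ranked) - int(len(ranked) * drop_fraction))
        kept = ranked[:n_keep]
    return sorted(kept)

# Every ordering of the four rankers is a distinct pipeline: 4! = 24.
ALL_PIPELINES = list(permutations(RANKERS))
```

In a study of the kind described, each of the 24 orderings would be run on every dataset and the surviving feature subset passed to the classifiers; because later stages only see what earlier stages kept, the ordering can change which features survive.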


Pages: 298-316

Page count: 19
