GECC: Gene Expression Based Ensemble Classification of Colon Samples

Cited by: 25
Authors
Rathore, Saima [1 ,2 ]
Hussain, Mutawarra [1 ]
Khan, Asifullah [1 ]
Affiliations
[1] Pakistan Inst Engn & Appl Sci, Dept Comp & Informat Sci, Islamabad, Pakistan
[2] Univ Azad Jammu & Kashmir, Dept Comp Sci & Informat Technol, Muzaffarabad, Pakistan
Keywords
Colon cancer; ensemble classification; gene expressions; PCA; mRMR; F-Score; chi-square; FEATURE-SELECTION; CANCER; PREDICTION; MACHINE; PROFILES; TUMOR; SVM; INFORMATION; PATTERNS;
DOI
10.1109/TCBB.2014.2344655
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Gene expression deviates from its normal pattern when a patient has cancer, and this variation can be used as an effective tool for detecting cancer. In this study, we propose a novel gene expression based colon classification scheme (GECC) that exploits variations in gene expression to classify colon gene samples into normal and malignant classes. The novelty of GECC is twofold. First, to cope with the very high dimensionality of gene based data sets, various feature selection and extraction strategies, namely chi-square, F-Score, principal component analysis (PCA), and minimum redundancy and maximum relevancy (mRMR), are employed to select discriminative genes from the full gene set. Second, a majority voting based ensemble of support vector machines (SVMs) is proposed to classify the gene based samples. Individual SVM models have previously been used for colon classification, but their performance is limited. In the proposed SVM-ensemble approach, the individual SVM models are constructed with different kernels: linear, polynomial, radial basis function (RBF), and sigmoid. The predictions of the individual models are combined through majority voting, which makes the combined decision space more discriminative. The proposed technique has been tested on four colon data sets and several other binary-class gene expression data sets, and it achieves improved performance compared with previously reported gene based colon cancer detection techniques. The computational time required for training and testing on the 208 x 5,851 data set was 591.01 s and 0.019 s, respectively.
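The two stages described in the abstract, gene ranking followed by majority voting over per-kernel predictions, might be sketched as below. This is an illustrative sketch, not the authors' implementation: the abstract does not give the F-Score formula, so a commonly used two-class definition is assumed, and the function names `f_score`, `top_genes`, and `majority_vote`, the 0/1 label encoding, and the tie-breaking rule are all hypothetical. The SVM training itself is left out; the voting function only consumes label lists that the four kernel-specific models would produce.

```python
from collections import Counter
from statistics import mean, variance


def f_score(values, labels):
    """F-Score of one gene: between-class separation over within-class spread.

    Assumed definition (not stated in the abstract): squared distances of the
    two class means from the overall mean, divided by the sum of the sample
    variances of the two classes. Labels are 1 (malignant) and 0 (normal),
    and each class needs at least two samples.
    """
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    overall = mean(values)
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    within = variance(pos) + variance(neg)
    if within == 0:
        # Degenerate gene: constant within both classes.
        return float("inf") if between > 0 else 0.0
    return between / within


def top_genes(samples, labels, k):
    """Return the column indices of the k genes with the highest F-Score."""
    n_genes = len(samples[0])
    scores = [f_score([row[g] for row in samples], labels) for g in range(n_genes)]
    return sorted(range(n_genes), key=lambda g: scores[g], reverse=True)[:k]


def majority_vote(predictions):
    """Combine per-model label lists into one label per sample.

    With four voters a 2-2 tie is possible; this sketch resolves ties in
    favour of the malignant class (1), which is an assumption.
    """
    combined = []
    for votes in zip(*predictions):
        counts = Counter(votes)
        combined.append(1 if counts[1] >= counts[0] else 0)
    return combined


# Toy data: 4 samples x 3 genes; the first two samples are malignant.
samples = [
    [5.0, 1.0, 3.0],
    [6.0, 2.0, 1.0],
    [1.0, 1.5, 2.0],
    [2.0, 2.5, 4.0],
]
labels = [1, 1, 0, 0]
selected = top_genes(samples, labels, k=1)

# Hypothetical outputs of the linear, polynomial, RBF, and sigmoid models.
kernel_predictions = [
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
]
ensemble_labels = majority_vote(kernel_predictions)
```

In practice the four prediction lists would come from SVMs trained on the selected genes, for example with scikit-learn's `SVC` using `kernel="linear"`, `"poly"`, `"rbf"`, and `"sigmoid"`. Because the ensemble has an even number of members, how ties are resolved is a design choice that the abstract does not specify.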
Pages: 1131-1145
Number of pages: 15
Related papers
50 in total
  • [1] Cancer Classification Ensemble System Based on Gene Expression Profiles
    Tarek, Sara
    Elwahab, Reda Abd
    Shoman, Mahmoud
    [J]. 2016 5TH INTERNATIONAL CONFERENCE ON ELECTRONIC DEVICES, SYSTEMS AND APPLICATIONS (ICEDSA), 2016,
  • [2] Ensemble classification for gene expression data based on parallel clustering
    Meng, Jun
    Jiang, Dingling
    Zhang, Jing
    Luan, Yushi
    [J]. INTERNATIONAL JOURNAL OF DATA MINING AND BIOINFORMATICS, 2018, 20 (03) : 213 - 229
  • [3] An ensemble approach for phenotype classification based on fuzzy partitioning of gene expression data
    Dragomir, A.
    Maraziotis, I.
    Bezerianos, A.
    [J]. 2006 28TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY, VOLS 1-15, 2006, : 1930 - +
  • [4] Ensemble based fuzzy weighted extreme learning machine for gene expression classification
    Wang, Yang
    Wang, Anna
    Ai, Qing
    Sun, Haijing
    [J]. APPLIED INTELLIGENCE, 2019, 49 (03) : 1161 - 1171
  • [5] Dissimilarity based ensemble of extreme learning machine for gene expression data classification
    Lu, Hui-juan
    An, Chun-lin
    Zheng, En-hui
    Lu, Yi
    [J]. NEUROCOMPUTING, 2014, 128 : 22 - 30
  • [6] Ensemble based fuzzy weighted extreme learning machine for gene expression classification
    Yang Wang
    Anna Wang
    Qing Ai
    Haijing Sun
    [J]. Applied Intelligence, 2019, 49 : 1161 - 1171
  • [7] An ensemble filter-based heuristic approach for cancerous gene expression classification
    Uzma
    Halim, Zahid
    [J]. KNOWLEDGE-BASED SYSTEMS, 2021, 234
  • [8] An ensemble correlation-based gene selection algorithm for cancer classification with gene expression data
    Piao, Yongjun
    Piao, Minghao
    Park, Kiejung
    Ryu, Keun Ho
    [J]. BIOINFORMATICS, 2012, 28 (24) : 3306 - 3315
  • [9] Ensemble of dissimilarity based classifiers for cancerous samples classification
    Blanco, Angela
    Martin-Merino, Manuel
    de las Rivas, Javier
    [J]. PATTERN RECOGNITION IN BIOINFORMATICS, PROCEEDINGS, 2007, 4774 : 178 - 188
  • [10] Cancer Classification from Gene Expression Based Microarray Data Using SVM Ensemble
    Begum, Shemim
    Chakraborty, Debasis
    Sarkar, Ram
    [J]. 2015 International Conference on Condition Assessment Techniques in Electrical Systems (CATCON), 2015, : 13 - 16