An Integrated Feature Selection Algorithm for Cancer Classification using Gene Expression Data

Cited by: 23
Authors
Ahmed, Saeed [1]
Kabir, Muhammad [1]
Ali, Zakir [1]
Arif, Muhammad [1]
Ali, Farman [1]
Yu, Dong-Jun [1]
Affiliations
[1] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Cancer classification; gene expression data; correlation-based feature selection; multi-objective evolutionary algorithm; radial basis function neural network; MOLECULAR CLASSIFICATION; MICROARRAY DATA; PROTEIN TYPES; IDENTIFICATION; PREDICTION; TUMOR; TISSUES; MODEL;
DOI
10.2174/1386207322666181220124756
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Subject classification codes
071010 ; 081704 ;
Abstract
Aim and Objective: Cancer is a dangerous disease worldwide, caused by somatic mutations in the genome. Early-stage diagnosis of this deadly disease is a very recent clinical application of microarray data. In DNA microarray technology, gene expression data have high dimensionality and a small sample size. Efficient and robust feature selection methods that identify a small set of genes are therefore indispensable for achieving better classification performance. Materials and Methods: In this study, we developed a hybrid feature selection method that integrates correlation-based feature selection (CFS) and a multi-objective evolutionary algorithm (MOEA) to select highly informative genes. The hybrid model, paired with a radial basis function neural network (RBFNN) classifier, was evaluated on 11 benchmark gene expression datasets using 10-fold cross-validation. Results: The experimental results were compared with seven conventional feature selection methods and with other methods in the literature. The comparison shows that our approach has clear advantages in classification accuracy and in the number of genes selected. Conclusion: The proposed CFS-MOEA algorithm attained up to 100% classification accuracy on six of the eleven datasets with a minimal predictive gene subset.
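The abstract names correlation-based feature selection (CFS) but does not spell out how it scores a gene subset. As a rough illustration of the CFS stage only, the sketch below assumes the standard CFS merit heuristic, merit = k·r_cf / sqrt(k + k(k−1)·r_ff), where r_cf is the mean feature-class correlation and r_ff the mean feature-feature correlation. It uses absolute Pearson correlation as a stand-in for whatever correlation measure the authors used, and a simple greedy forward search in place of their search strategy. The function names `cfs_merit` and `greedy_cfs` are hypothetical, and the MOEA and RBFNN stages are not shown.

```python
import numpy as np


def cfs_merit(X, y, subset):
    """CFS merit of a feature subset (assumption: absolute Pearson correlation)."""
    k = len(subset)
    # Mean absolute correlation between each selected feature and the class label.
    r_cf = np.mean([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in subset])
    if k == 1:
        return r_cf
    # Mean absolute pairwise correlation among the selected features
    # (the k diagonal entries, all equal to 1, are excluded).
    corr = np.abs(np.corrcoef(X[:, subset], rowvar=False))
    r_ff = (corr.sum() - k) / (k * (k - 1))
    return k * r_cf / np.sqrt(k + k * (k - 1) * r_ff)


def greedy_cfs(X, y, max_features=10):
    """Greedy forward search: add the feature that most improves the merit."""
    selected, best = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining and len(selected) < max_features:
        merit, j = max((cfs_merit(X, y, selected + [j]), j) for j in remaining)
        if merit <= best:  # stop when no candidate improves the merit
            break
        selected.append(j)
        remaining.remove(j)
        best = merit
    return selected, best
```

The merit rewards genes that correlate with the class label and penalizes genes that are redundant with one another, which is why a filter of this kind can shrink a high-dimensional expression matrix before a more expensive evolutionary search. Constant-valued features would need to be dropped beforehand, since their correlation is undefined.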
Pages: 631-645
Page count: 15