An Integrated Feature Selection Algorithm for Cancer Classification using Gene Expression Data

Cited by: 23
Authors
Ahmed, Saeed [1]
Kabir, Muhammad [1]
Ali, Zakir [1]
Arif, Muhammad [1]
Ali, Farman [1]
Yu, Dong-Jun [1]
Affiliations
[1] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Cancer classification; gene expression data; correlation-based feature selection; multi-objective evolutionary algorithm; radial basis function neural network; MOLECULAR CLASSIFICATION; MICROARRAY DATA; PROTEIN TYPES; IDENTIFICATION; PREDICTION; TUMOR; TISSUES; MODEL;
DOI
10.2174/1386207322666181220124756
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Aim and Objective: Cancer is a dangerous disease worldwide, caused by somatic mutations in the genome. Early-stage diagnosis of this deadly disease is a relatively new clinical application of microarray data. Gene expression data produced by DNA microarray technology are high-dimensional but have small sample sizes. It is therefore indispensable to develop efficient and robust feature selection methods that identify a small set of genes and achieve better classification performance. Materials and Methods: In this study, we developed a hybrid feature selection method that integrates correlation-based feature selection (CFS) with a Multi-Objective Evolutionary Algorithm (MOEA) to select highly informative genes. The hybrid model, paired with a radial basis function neural network (RBFNN) classifier, was evaluated on 11 benchmark gene expression datasets using 10-fold cross-validation. Results: The experimental results were compared with seven conventional feature selection methods and with other methods in the literature. The comparison shows that our approach has clear advantages in classification accuracy and in the number of genes selected. Conclusion: Our proposed CFS-MOEA algorithm attained up to 100% classification accuracy on six of the eleven datasets with a minimal-sized predictive gene subset.
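The correlation-based feature selection (CFS) stage named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the standard CFS merit score (subset size times mean feature-class correlation, divided by the square root of subset size plus pairwise redundancy), uses absolute Pearson correlation as the correlation measure, and replaces the MOEA search with a simple greedy forward search. The function names `cfs_merit` and `greedy_cfs` and the `max_features` parameter are hypothetical, and the RBFNN classifier is not shown.

```python
import numpy as np

def cfs_merit(X, y, subset):
    """CFS merit of the feature columns in `subset` (assumes no constant columns)."""
    k = len(subset)
    # Mean absolute correlation between each chosen feature and the class label.
    r_cf = float(np.mean([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in subset]))
    if k == 1:
        return r_cf
    # Mean absolute correlation between distinct chosen features (redundancy).
    corr = np.abs(np.corrcoef(X[:, subset], rowvar=False))
    r_ff = (corr.sum() - k) / (k * (k - 1))
    return float(k * r_cf / np.sqrt(k + k * (k - 1) * r_ff))

def greedy_cfs(X, y, max_features=10):
    """Greedy forward search: add the feature that most improves the merit."""
    selected, best = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining and len(selected) < max_features:
        score, j = max((cfs_merit(X, y, selected + [j]), j) for j in remaining)
        if score <= best:
            break  # no candidate improves the merit, so stop
        selected.append(j)
        remaining.remove(j)
        best = score
    return selected, best
```

Here `X` would be a samples-by-genes expression matrix and `y` a numeric class label vector. The merit rewards genes that correlate with the class while penalizing genes that correlate with each other, which is why such a filter tends to return small gene subsets.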
Pages: 631-645
Page count: 15