Gene Selection for Microarray Cancer Classification based on Manta Rays Foraging Optimization and Support Vector Machines

Cited by: 10
Authors
Houssein, Essam H. [1]
Hassan, Hager N. [1]
Al-Sayed, Mustafa M. [1]
Nabil, Emad [2, 3]
Affiliations
[1] Minia Univ, Fac Comp & Informat, Al Minya, Egypt
[2] Cairo Univ, Fac Comp & Artificial Intelligence, Giza, Egypt
[3] Islamic Univ Madinah, Fac Comp Sci & Informat Syst, Madinah, Saudi Arabia
Keywords
Microarray; Gene expression; Gene selection; Cancer classification; Feature selection; Manta Ray Foraging Optimization algorithm; Support vector machines; Minimum Redundancy Maximum Relevance; PARTICLE SWARM OPTIMIZATION; EFFICIENT FEATURE-SELECTION; FEATURE SUBSET-SELECTION; RANDOM SUBSPACE METHOD; HIGH-DIMENSIONAL DATA; MOLECULAR CLASSIFICATION; MUTUAL INFORMATION; SVM-RFE; ALGORITHM; TUMOR;
DOI
10.1007/s13369-021-06102-8
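Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];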
Discipline classification code
07; 0710; 09
Abstract
In DNA microarray applications, many techniques have been proposed for cancer classification, either to distinguish normal from cancerous subjects or to classify different types of cancer. Gene selection is usually required as a preliminary step in a cancer classification problem. This step aims to select the most informative genes from a very large number of genes, which is itself an important problem. Although many studies have addressed this problem, they fall short of obtaining the fewest and most informative genes with the highest accuracy and little effort, owing to the high dimensionality of microarray datasets. The manta ray foraging optimization (MRFO) algorithm is a recent meta-heuristic that mimics the foraging behavior of manta rays. MRFO has achieved promising results in other fields, such as solar generating units. Owing to its high accuracy, the support vector machine (SVM) is the most commonly used classification algorithm in cancer studies, especially with microarray data. To exploit the strengths of both algorithms (i.e., MRFO and SVM), this paper proposes a hybrid algorithm that selects the most predictive and informative genes for cancer classification. Binary-class microarray datasets (colon and leukemia1) and multi-class microarray datasets (SRBCT, lymphoma, and leukemia2) are used to evaluate the accuracy of the proposed technique. Like other optimization techniques, MRFO suffers from problems related to the high dimensionality and complexity of microarray data. To address these problems and improve performance, the minimum redundancy maximum relevance (mRMR) method is used as a preprocessing stage. The proposed technique has been compared with the most common cancer classification algorithms. The experimental results show that it achieves the highest accuracy with the fewest informative genes and little effort.
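The filter-then-wrapper structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, a relevance-only mutual-information ranking stands in for mRMR, a plain random search stands in for MRFO, and the fitness weight `alpha`, the function names, and all parameter values are assumptions chosen for illustration. Only the overall shape (reduce the gene pool with a filter, then search binary gene masks scored by an SVM) follows the text.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC


def filter_top_genes(X, y, k, seed=0):
    """Keep the k genes with the highest mutual information with the label.
    Stand-in for mRMR: it ranks by relevance only and ignores redundancy."""
    scores = mutual_info_classif(X, y, random_state=seed)
    return np.argsort(scores)[::-1][:k]


def fitness(mask, X, y, alpha=0.9):
    """Higher is better: mixes cross-validated SVM accuracy with the fraction
    of genes left out. alpha is a hypothetical weight, not from the paper."""
    if not mask.any():
        return 0.0  # an empty gene subset cannot be classified
    acc = cross_val_score(SVC(kernel="linear"), X[:, mask], y, cv=3).mean()
    return alpha * acc + (1 - alpha) * (1 - mask.mean())


def random_search(X, y, iterations=20, seed=0):
    """Placeholder for the MRFO search: sample random binary gene masks
    and keep the one with the best fitness."""
    rng = np.random.default_rng(seed)
    best_mask, best_fit = None, -1.0
    for _ in range(iterations):
        mask = rng.random(X.shape[1]) < 0.5
        fit = fitness(mask, X, y)
        if fit > best_fit:
            best_mask, best_fit = mask, fit
    return best_mask, best_fit


# Synthetic stand-in for a microarray matrix (rows: samples, columns: genes).
X, y = make_classification(n_samples=60, n_features=200,
                           n_informative=10, random_state=0)
kept = filter_top_genes(X, y, k=30)                 # filter stage
best_mask, best_fit = random_search(X[:, kept], y)  # wrapper stage
selected_genes = kept[best_mask]                    # indices into the original matrix
```

Swapping `random_search` for a binary MRFO variant would leave the rest of the sketch unchanged, since the search only interacts with the data through `fitness`.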
Pages: 2555-2572
Page count: 18