Breast and Colon Cancer Classification from Gene Expression Profiles Using Data Mining Techniques

Cited by: 30
Authors
AbdElNabi, Mohamed Loey Ramadan [1]
Jasim, Mohammed Wajeeh [1]
EL-Bakry, Hazem M. [2]
Taha, Mohamed Hamed N. [3]
Khalifa, Nour Eldeen M. [3]
Affiliations
[1] Benha Univ, Fac Comp Artificial Intelligence, Dept Comp Sci, Banha 13511, Egypt
[2] Mansoura Univ, Fac Comp & Informat Sci, Dept Informat Syst, Mansoura 35511, Egypt
[3] Cairo Univ, Fac Comp & Artificial Intelligence, Dept Informat Technol, Cairo 12613, Egypt
Source
SYMMETRY-BASEL | 2020, Vol. 12, No. 3
Keywords
machine learning; cancer diagnosis; grey wolf optimization algorithm; support vector machine; information gain; feature selection; SELECTION; DIAGNOSIS; ALGORITHM; ENTROPY;
DOI
10.3390/sym12030408
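Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract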
Early detection of cancer increases the probability of recovery. This paper presents an intelligent decision support system (IDSS) for the early diagnosis of cancer based on gene expression profiles collected using DNA microarrays. Such datasets pose a challenge because of the small number of samples (no more than a few hundred) relative to the large number of genes (in the order of thousands). Therefore, a method of reducing the number of features (genes) that are not relevant to the disease of interest is necessary to avoid overfitting. The proposed methodology uses the information gain (IG) to select the most important features from the input patterns. Then, the selected features (genes) are reduced by applying the grey wolf optimization (GWO) algorithm. Finally, the methodology employs a support vector machine (SVM) classifier for cancer type classification. The proposed methodology was applied to two datasets (Breast and Colon) and was evaluated based on its classification accuracy, which is the most important performance measure in disease diagnosis. The experimental results indicate that the proposed methodology is able to enhance the stability of the classification accuracy as well as the feature selection.
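The first stage described in the abstract, ranking genes by information gain (IG), can be illustrated with a minimal sketch. This is not the authors' implementation: the median split used to discretize expression values, the function names, and the toy data are all assumptions made for illustration, and the later GWO and SVM stages are not shown.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels):
    """IG of one gene after splitting its expression values at the median.

    The median split is an assumption of this sketch; the abstract does not
    say how continuous expression values are discretized.
    """
    ordered = sorted(values)
    mid = len(ordered) // 2
    median = ordered[mid] if len(ordered) % 2 else (ordered[mid - 1] + ordered[mid]) / 2
    low = [y for x, y in zip(values, labels) if x <= median]
    high = [y for x, y in zip(values, labels) if x > median]
    # Weighted entropy of the two partitions (empty partitions are skipped).
    conditional = sum(len(part) / len(labels) * entropy(part)
                      for part in (low, high) if part)
    return entropy(labels) - conditional

def top_k_genes(samples, labels, k):
    """Return the indices of the k genes with the highest information gain."""
    n_genes = len(samples[0])
    gains = [information_gain([row[g] for row in samples], labels)
             for g in range(n_genes)]
    return sorted(range(n_genes), key=lambda g: gains[g], reverse=True)[:k]

# Toy data invented for illustration: 6 samples x 4 genes, 2 classes.
samples = [
    [0.1, 5.0, 2.0, 7.1],
    [0.2, 4.8, 9.0, 7.0],
    [0.3, 5.1, 2.1, 7.2],
    [0.9, 5.2, 9.1, 7.0],
    [1.0, 4.9, 2.2, 7.1],
    [1.1, 5.0, 9.2, 7.2],
]
labels = ["normal", "normal", "normal", "tumor", "tumor", "tumor"]
selected = top_k_genes(samples, labels, k=2)
```

On this toy data, gene 0 separates the two classes perfectly at its median (IG of 1 bit), so it is ranked first. In the pipeline the abstract outlines, a subset chosen this way would then be reduced further by GWO before the SVM is trained.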
Pages: 15