Ensemble Feature Selection for Breast Cancer Classification using Microarray Data

Cited by: 5
Authors
Hengpraprohm, Supoj [1]
Jungjit, Suwimol [2]
Affiliations
[1] Nakhon Pathom Rajabhat Univ, Fac Sci & Technol, Data Sci Program, Muang, Nakhon Pathom, Thailand
[2] Thaksin Univ, Fac Sci, Dept Comp & Informat Technol, Phatthalung, Thailand
Keywords
Ensemble approach; Feature selection; Microarray data; Genetic Algorithm; Cancer Classification; Identification
DOI
10.4114/intartif.vol23iss65pp100-114
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes an ensemble filter feature selection approach, EnSNR, for breast cancer data classification. The microarray dataset used in the experiments contains 50,739 features (genes) for each of 32 patients. The main idea of EnSNR is to combine informative features obtained using two different feature evaluation criteria: the EnSNR subset consists of the features that are present in both sets of evaluation results. Entropy and signal-to-noise ratio (SNR) evaluation functions are used to generate the EnSNR feature subset. Entropy measures the amount of uncertainty in the outcome of a random experiment, while SNR is an effective function for measuring the discriminative power of a feature. These two functions give the EnSNR approach several advantages. For example, the number of features in the EnSNR subset is not user-defined (the subset is generated automatically), and the operation of the EnSNR function is independent of the classification algorithm employed. In addition, only a small amount of processing time is required to generate the EnSNR feature subset. A Genetic Algorithm (GA) generates the breast cancer classification model using the EnSNR feature subset, and the efficiency of the model is validated using 10-fold cross-validation. With the EnSNR feature subset, the approach gives high prediction accuracy (the average prediction accuracy obtained in the experiments in this paper is 86.92 ± 5.47) and significantly reduces the number of irrelevant features (genes) to be analyzed for cancer classification.
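The intersection idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas or cutoffs, so everything specific here is an assumption rather than the authors' method: SNR is taken as |mean0 - mean1| / (sd0 + sd1), the entropy criterion is taken as the entropy reduction from splitting a feature at its median, and a feature counts as "informative" under a criterion when its score is above the average score. All function names and the toy data are hypothetical.

```python
import math
from statistics import mean, median, pstdev


def snr_score(values, labels):
    """Assumed SNR: |mean0 - mean1| / (sd0 + sd1) for binary labels 0/1."""
    a = [v for v, y in zip(values, labels) if y == 0]
    b = [v for v, y in zip(values, labels) if y == 1]
    # A tiny constant avoids division by zero for constant features.
    return abs(mean(a) - mean(b)) / (pstdev(a) + pstdev(b) + 1e-12)


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum(
        (labels.count(c) / n) * math.log2(labels.count(c) / n)
        for c in set(labels)
    )


def entropy_gain(values, labels):
    """Assumed entropy criterion: entropy reduction from a median split."""
    cut = median(values)
    low = [y for v, y in zip(values, labels) if v <= cut]
    high = [y for v, y in zip(values, labels) if v > cut]
    n = len(labels)
    conditional = sum(len(p) / n * entropy(p) for p in (low, high) if p)
    return entropy(labels) - conditional


def above_average(scores):
    """Indices whose score exceeds the mean score (assumed cutoff rule)."""
    cutoff = mean(scores)
    return {i for i, s in enumerate(scores) if s > cutoff}


def intersect_filters(X, y):
    """Keep only the features selected by BOTH criteria."""
    columns = list(zip(*X))  # one tuple of values per feature
    snr = [snr_score(c, y) for c in columns]
    gain = [entropy_gain(c, y) for c in columns]
    return sorted(above_average(snr) & above_average(gain))


# Toy data: 6 samples x 4 features; features 0 and 2 separate the classes.
X_toy = [
    [1.0, 2.0, 5.0, 0.5],
    [1.2, 2.1, 5.5, 3.0],
    [0.9, 1.9, 4.8, 1.5],
    [3.0, 2.0, 1.0, 1.0],
    [3.1, 2.1, 1.2, 2.5],
    [2.9, 1.9, 0.7, 2.0],
]
y_toy = [0, 0, 0, 1, 1, 1]

selected = intersect_filters(X_toy, y_toy)
print(selected)
```

Because the cutoff is derived from the scores themselves, the size of the resulting subset is not supplied by the user, and no classifier is involved in the selection, which matches the two properties the abstract highlights. A classifier (a GA-built model in the paper) would then be trained on the selected columns only.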
Pages: 100-114
Page count: 15