The Impact of Under-sampling on the Performance of Bootstrap-based Ensemble Feature Selection

Cited by: 0
Authors
Guney, Huseyin [1]
Oztoprak, Huseyin [1]
Affiliations
[1] Uluslararasi Kibris Univ, Bilgisayar Muhendisligi Bolumu, Lefkosa, Turkey
Keywords
Support Vector Machine (SVM); Ensemble Feature Selection; Bootstrapping; Bagging; Under-sampling; Support Vector Machine Recursive Feature Elimination (SVM-RFE)
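DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification code
0808; 0809
Abstract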
DNA microarrays are a promising tool for cancer diagnosis and prognosis. Because microarray data are high-dimensional, gene selection is a difficult task. Bootstrap-based ensemble feature selection (bagging) has recently become popular and has shown significant improvements in the field. This method uses bootstrap resampling to generate several slightly different sampled datasets from the training dataset. It then aggregates the ranked feature lists generated from the sampled datasets to obtain the final (ensemble) feature list. The performance of bagging is proportional to the diversity of the generated sampled datasets. Therefore, under-sampling the training set, rather than using the entire training set, is proposed for bootstrap resampling to improve classification performance and gene selection stability. The proposed method was evaluated on four microarray datasets using a support vector machine (SVM) as the classifier and SVM recursive feature elimination (SVM-RFE) as the feature selection technique. The results show that the 50% under-sampling approach achieves classification performance similar to that of the conventional approach and outperforms it in terms of gene selection stability. In addition, 50% under-sampling uses only half of the training samples at each run of the ensemble method, so it has a lower computational cost.
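The procedure outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions made only for illustration: the under-sampling step is read as drawing a subset without replacement before bootstrapping it, the ranked lists are aggregated by mean rank, and a simple class-mean-difference score stands in for SVM-RFE so that the example needs only the standard library. The names `rank_features`, `ensemble_ranking`, and `make_toy_data` are hypothetical.

```python
import random
from statistics import mean


def rank_features(X, y):
    """Stand-in ranker (NOT SVM-RFE): order features by the absolute
    difference between the two class means, largest first."""
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        # A resample may miss a class entirely; score the feature as 0 then.
        scores.append(abs(mean(pos) - mean(neg)) if pos and neg else 0.0)
    return sorted(range(n_features), key=lambda j: -scores[j])


def ensemble_ranking(X, y, n_bags=20, ratio=0.5, seed=0):
    """Aggregate feature rankings over bootstrap samples drawn from an
    under-sampled training set (ratio=1.0 uses the whole training set)."""
    rng = random.Random(seed)
    n = len(X)
    n_features = len(X[0])
    subset_size = max(2, int(ratio * n))
    rank_sums = [0] * n_features
    for _ in range(n_bags):
        # Assumed under-sampling step: pick a subset without replacement.
        subset = rng.sample(range(n), subset_size)
        # Bootstrap step: resample that subset with replacement.
        boot = [rng.choice(subset) for _ in range(subset_size)]
        ranking = rank_features([X[i] for i in boot], [y[i] for i in boot])
        for position, feature in enumerate(ranking):
            rank_sums[feature] += position
    # Assumed aggregation: a lower summed rank means a more important feature.
    return sorted(range(n_features), key=lambda j: rank_sums[j])


def make_toy_data(n=40, n_features=6, seed=1):
    """Synthetic two-class data in which only feature 0 is informative."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0, 1) for _ in range(n_features)]
        row[0] += 5.0 * label  # shift feature 0 for the positive class
        X.append(row)
        y.append(label)
    return X, y


X, y = make_toy_data()
final_ranking = ensemble_ranking(X, y, n_bags=20, ratio=0.5)
print(final_ranking)
```

With `ratio=0.5`, each run ranks features on only half as many samples as the conventional setting, which is where the lower computational cost mentioned in the abstract comes from. Swapping `rank_features` for an SVM-RFE ranker would bring the sketch closer to the evaluated setup.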
Pages: 4