A new feature selection approach based on ensemble methods in semi-supervised classification

Cited by: 0
Authors
Nesma Settouti
Mohamed Amine Chikh
Vincent Barra
Affiliations
[1] LIMOS, Biomedical Engineering Laboratory GBM
[2] CNRS
[3] UMR 6158
[4] LIMOS
[5] Clermont-Université Université Blaise Pascal
[6] Tlemcen University
Source
Keywords
Feature selection; Semi-supervised learning; Ensemble methods; Co-forest; Random Forest; Large datasets; Medical diagnosis;
DOI
Not available
Chinese Library Classification (CLC) number
Subject classification code
Abstract
In computer-aided medical systems, many practical classification applications are confronted with massive growth in the collection and storage of data. This is especially the case in areas such as the prediction of medical test efficiency, the classification of tumors, and the detection of cancers. Data with known class labels (labeled data) can be limited, but unlabeled data (with unknown class labels) are more readily available. Semi-supervised learning deals with methods for exploiting the unlabeled data in addition to the labeled data to improve performance on the classification task. In this paper, we consider the problem of using a large amount of unlabeled data to improve the efficiency of feature selection in high-dimensional datasets when only a small set of labeled examples is available. We propose a new semi-supervised feature evaluation method called Optimized co-Forest for Feature Selection (OFFS) that combines ideas from co-forest with the embedded feature selection principle of Random Forest, which is based on permutation of the out-of-bag set. We provide empirical results on several medical and biological benchmark datasets, indicating an overall significant improvement of OFFS over four other semi-supervised feature selection approaches of the filter, wrapper, and embedded types. The method demonstrates its ability to select features and measure their importance, improving the performance of the hypothesis learned from a small number of labeled samples by exploiting unlabeled samples.
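The out-of-bag permutation principle mentioned in the abstract can be illustrated with a minimal sketch. This is not the OFFS algorithm and does not include the co-forest step that uses unlabeled data; it only shows, on labeled toy data, how permuting one feature in the out-of-bag samples of each ensemble member yields an importance score. Decision stumps stand in for full trees, and every name here (`fit_stump`, `predict_stump`, `oob_permutation_importance`, `max_features`) as well as the synthetic dataset is an assumption made for illustration.

```python
import random


def fit_stump(X, y, features):
    """Pick the single-feature threshold split with the fewest training errors."""
    best = None
    for f in features:
        for t in sorted({row[f] for row in X}):
            for left, right in ((0, 1), (1, 0)):
                errors = sum(
                    (left if row[f] <= t else right) != label
                    for row, label in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, t, left, right)
    return best[1:]  # (feature, threshold, left label, right label)


def predict_stump(stump, row):
    f, t, left, right = stump
    return left if row[f] <= t else right


def oob_permutation_importance(X, y, n_trees=50, max_features=None, seed=0):
    """Mean drop in out-of-bag accuracy when each feature is permuted."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    k = d if max_features is None else max_features
    importance = [0.0] * d
    for _ in range(n_trees):
        boot = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        in_bag = set(boot)
        oob = [i for i in range(n) if i not in in_bag]  # out-of-bag indices
        if not oob:
            continue
        feats = rng.sample(range(d), k)  # random feature subset for this member
        stump = fit_stump([X[i] for i in boot], [y[i] for i in boot], feats)
        base = sum(predict_stump(stump, X[i]) == y[i] for i in oob) / len(oob)
        for f in range(d):
            values = [X[i][f] for i in oob]
            rng.shuffle(values)  # break the link between feature f and the label
            correct = 0
            for i, v in zip(oob, values):
                row = list(X[i])
                row[f] = v
                correct += predict_stump(stump, row) == y[i]
            importance[f] += (base - correct / len(oob)) / n_trees
    return importance


# Synthetic data (hypothetical): only feature 0 determines the label.
data_rng = random.Random(42)
X = [[data_rng.random() for _ in range(3)] for _ in range(60)]
y = [int(row[0] > 0.5) for row in X]

importance = oob_permutation_importance(X, y, n_trees=50, max_features=2)
ranking = sorted(range(len(importance)), key=lambda f: -importance[f])
print(importance, ranking)
```

Under these assumptions, a feature the ensemble never relies on receives a score near zero, while permuting an informative feature lowers out-of-bag accuracy and therefore raises its score. A semi-supervised variant along the lines of the abstract would presumably enlarge each member's training set with confidently labeled unlabeled samples before scoring, but that step is not shown here.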
Pages: 673-686
Page count: 13
Related papers
50 in total
  • [31] Semi-supervised feature selection for audio classification based on constraint compensated Laplacian score
    Yang, Xu-Kui
    He, Liang
    Qu, Dan
    Zhang, Wei-Qiang
    Johnson, Michael T.
    [J]. EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2016, : 1 - 10
  • [32] Ensemble constrained Laplacian score for efficient and robust semi-supervised feature selection
    Benabdeslem, Khalid
    Elghazel, Haytham
    Hindawi, Mohammed
    [J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2016, 49 (03) : 1161 - 1185
  • [33] Semi-Supervised Clustering Ensemble Based on Cluster Consensus Selection
    Liu, Yanxi
    Al-Khafaji, Ali Hussein Demin
    [J]. CYBERNETICS AND SYSTEMS, 2022,
  • [34] An Ensemble of Transfer, Semi-supervised and Supervised Learning Methods for Pathological Heart Sound Classification
    Humayun, Ahmed Imtiaz
    Khan, Md. Tauhiduzzaman
    Ghaffarzadegan, Shabnam
    Feng, Zhe
    Hasan, Taufiq
    [J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 127 - 131
  • [35] Ensemble constrained Laplacian score for efficient and robust semi-supervised feature selection
    Khalid Benabdeslem
    Haytham Elghazel
    Mohammed Hindawi
    [J]. Knowledge and Information Systems, 2016, 49 : 1161 - 1185
  • [36] Research on Semi-supervised Classification with an Ensemble Strategy
    Han, Zhanhao
    Yin, Shiqun
    [J]. PROCEEDINGS OF THE 2016 4TH INTERNATIONAL CONFERENCE ON SENSORS, MECHATRONICS AND AUTOMATION (ICSMA 2016), 2016, 136 : 681 - 684
  • [37] Review of ensemble classification over data streams based on supervised and semi-supervised
    Han, Meng
    Li, Xiaojuan
    Wang, Le
    Zhang, Ni
    Cheng, Haodong
    [J]. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2022, 43 (03) : 3859 - 3878
  • [38] Ensemble Projection for Semi-supervised Image Classification
    Dai, Dengxin
    Van Gool, Luc
    [J]. 2013 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2013, : 2072 - 2079
  • [39] A new approach for semi-supervised online news classification
    Ko, HM
    Lam, W
    [J]. WEB AND COMMUNICATION TECHNOLOGIES AND INTERNET -RELATED SOCIAL ISSUES - HSI 2005, 2005, 3597 : 238 - 247
  • [40] Manifold Based Fisher Method for Semi-Supervised Feature Selection
    Lv, Sunzhong
    Jiang, Hongxing
    Zhao, Li
    Wang, Di
    Fan, Mingyu
    [J]. 2013 10TH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY (FSKD), 2013, : 664 - 668