An Ensemble Classification Method for High-Dimensional Data Using Neighborhood Rough Set

Cited by: 1
Authors
Zhang, Jing [1]
Lu, Guang [1]
Li, Jiaquan [1]
Li, Chuanwen [1]
Affiliation
[1] Northeastern Univ, Sch Comp Sci & Engn, Shenyang 110004, Peoples R China
Funding
National Natural Science Foundation of China
DOI
10.1155/2021/8358921
Chinese Library Classification number
O1 [Mathematics]
Discipline classification codes
0701; 070101
Abstract
Mining useful knowledge from high-dimensional data is an active research topic. Efficient and effective sample classification and feature selection are challenging because microarray data combine high dimensionality with small sample sizes. Feature selection is necessary during model construction to reduce time and space consumption. Therefore, a feature selection model based on prior knowledge and rough sets is proposed. Pathway knowledge is used to select feature subsets, and a rough set based on intersection neighborhoods is then used to select the important features in each subset, since it can select features without redundancy and deal with numerical features directly. To improve both the diversity among base classifiers and the efficiency of classification, only a subset of the base classifiers is selected. Classifiers are grouped into several clusters by k-means clustering using the proposed combination distance of Kappa-based diversity and accuracy. The base classifier with the best classification performance in each cluster is selected to generate the final ensemble model. Experimental results on three Arabidopsis thaliana stress response datasets showed that the proposed method achieved better classification performance than existing ensemble models.
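The classifier selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula for the combination distance, so `combined_distance` and its weight `w` below are hypothetical stand-ins, not the authors' definition. Because only pairwise distances are available here, a simple k-medoids loop is swapped in for the k-means clustering named in the abstract. All function names are illustrative.

```python
from collections import Counter


def accuracy(pred, truth):
    """Fraction of samples a classifier labels correctly."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


def cohen_kappa(a, b):
    """Cohen's kappa between two classifiers' predictions."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    ca, cb = Counter(a), Counter(b)
    expected = sum(ca[c] * cb[c] for c in ca) / (n * n)
    if expected == 1.0:  # both classifiers predict the same single class
        return 1.0
    return (observed - expected) / (1.0 - expected)


def combined_distance(pa, pb, truth, w=0.5):
    """ASSUMPTION: hypothetical mix of Kappa-based diversity and accuracy gap."""
    diversity = 1.0 - cohen_kappa(pa, pb)
    acc_gap = abs(accuracy(pa, truth) - accuracy(pb, truth))
    return w * diversity + (1.0 - w) * acc_gap


def k_medoids(dist, k, max_iter=100):
    """Simple k-medoids on a distance matrix (stand-in for k-means)."""
    n = len(dist)
    medoids = list(range(k))  # deterministic start: first k classifiers
    labels = [0] * n
    for _ in range(max_iter):
        # Assign each classifier to its nearest medoid.
        labels = [min(range(k), key=lambda c: dist[i][medoids[c]])
                  for i in range(n)]
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:  # keep the old medoid for an empty cluster
                new_medoids.append(medoids[c])
                continue
            # New medoid: the member closest to all others in its cluster.
            new_medoids.append(
                min(members, key=lambda m: sum(dist[m][j] for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return labels


def select_ensemble(preds, truth, k, w=0.5):
    """Cluster classifiers, then keep the most accurate one per cluster."""
    dist = [[combined_distance(a, b, truth, w) for b in preds] for a in preds]
    labels = k_medoids(dist, k)
    chosen = []
    for c in range(k):
        members = [i for i in range(len(preds)) if labels[i] == c]
        if members:
            chosen.append(max(members, key=lambda i: accuracy(preds[i], truth)))
    return sorted(chosen)
```

Here `preds` is a list of per-classifier prediction lists on a validation set and `truth` holds the corresponding labels; `select_ensemble(preds, truth, k)` returns the indices of the retained base classifiers, which would then be combined (for example by voting) into the final ensemble.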
Pages: 12