An Ensemble Classification Method for High-Dimensional Data Using Neighborhood Rough Set

Cited by: 1
Authors
Zhang, Jing [1]
Lu, Guang [1]
Li, Jiaquan [1]
Li, Chuanwen [1]
Affiliations
[1] Northeastern Univ, Sch Comp Sci & Engn, Shenyang 110004, Peoples R China
Funding
National Natural Science Foundation of China
DOI
10.1155/2021/8358921
Chinese Library Classification (CLC) number
O1 [Mathematics]
Discipline classification code
0701; 070101
Abstract
Mining useful knowledge from high-dimensional data is an active research topic. Efficient and effective sample classification and feature selection are challenging because microarray data combine high dimensionality with small sample sizes. Feature selection is needed during model construction to reduce time and space consumption. A feature selection model based on prior knowledge and rough sets is therefore proposed. Pathway knowledge is used to select feature subsets, and a rough set based on the intersection neighborhood is then used to select the important features in each subset, since it can select features without redundancy and handles numerical features directly. To improve both the diversity among base classifiers and the efficiency of classification, only a subset of the base classifiers is retained. Classifiers are grouped into several clusters by k-means clustering using a proposed combination distance that merges Kappa-based diversity and accuracy. The base classifier with the best classification performance in each cluster is selected to build the final ensemble model. Experimental results on three Arabidopsis thaliana stress response datasets showed that the proposed method achieved better classification performance than existing ensemble models.
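The classifier-selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula for the combination distance, so everything specific here is an assumption: the distance is taken as a weighted sum of Kappa-based disagreement, `(1 - kappa) / 2`, and the absolute accuracy gap, with a hypothetical weight `alpha`. Because k-means needs coordinates rather than pairwise distances, a simple k-medoids loop is swapped in for it. The function names are illustrative and are not taken from the paper.

```python
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa between two equal-length lists of predicted labels."""
    n = len(a)
    labels = set(a) | set(b)
    p_o = sum(x == y for x, y in zip(a, b)) / n          # observed agreement
    p_e = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)  # chance agreement
    if p_e == 1.0:  # both classifiers always predict the same single label
        return 1.0
    return (p_o - p_e) / (1 - p_e)


def accuracy(pred, truth):
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


def combination_distance(preds, truth, alpha=0.5):
    """Pairwise distance matrix (ASSUMED form, not the paper's formula)."""
    m = len(preds)
    acc = [accuracy(p, truth) for p in preds]
    dist = [[0.0] * m for _ in range(m)]
    for i, j in combinations(range(m), 2):
        disagreement = (1 - cohen_kappa(preds[i], preds[j])) / 2  # in [0, 1]
        d = alpha * disagreement + (1 - alpha) * abs(acc[i] - acc[j])
        dist[i][j] = dist[j][i] = d
    return dist, acc


def k_medoids(dist, k, n_iter=100):
    """Simple k-medoids on a distance matrix (stand-in for k-means)."""
    medoids = list(range(k))  # deterministic start: the first k classifiers
    for _ in range(n_iter):
        clusters = {c: [] for c in medoids}
        for i in range(len(dist)):
            # A medoid always stays in its own cluster, so no cluster is empty.
            nearest = i if i in clusters else min(medoids, key=lambda c: dist[i][c])
            clusters[nearest].append(i)
        new_medoids = [
            min(members, key=lambda c: sum(dist[c][j] for j in members))
            for members in clusters.values()
        ]
        if sorted(new_medoids) == sorted(medoids):
            break
        medoids = new_medoids
    return list(clusters.values())


def select_ensemble(preds, truth, k, alpha=0.5):
    """Keep the most accurate base classifier from each cluster."""
    dist, acc = combination_distance(preds, truth, alpha)
    clusters = k_medoids(dist, k)
    return sorted(max(members, key=lambda i: acc[i]) for members in clusters)


# Toy validation-set predictions from four hypothetical base classifiers.
truth = [0, 1, 0, 1, 0, 1, 0, 1]
preds = [
    [0, 1, 0, 1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0, 1, 1, 0],
    [1, 0, 0, 1, 0, 1, 0, 1],
    [1, 0, 1, 1, 0, 1, 0, 1],
]
selected = select_ensemble(preds, truth, k=2)
print(selected)  # indices of the classifiers kept for the final ensemble
```

The selected classifiers would then be combined, for example by majority vote. Setting `alpha` closer to 1 makes the grouping depend more on prediction agreement, and closer to 0 more on accuracy.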
Pages: 12
Related papers
50 in total
  • [1] Ensemble Method for Classification of High-Dimensional Data
    Piao, Yongjun
    Park, Hyun Woo
    Jin, Cheng Hao
    Ryu, Keun Ho
    [J]. 2014 INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING (BIGCOMP), 2014, : 245 - +
  • [2] Detecting Overlapping Areas in Unbalanced High-Dimensional Data Using Neighborhood Rough Set and Genetic Programming
    Pei, Wenbin
    Xue, Bing
    Shang, Lin
    Zhang, Mengjie
    [J]. IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, 2023, 27 (04) : 1130 - 1144
  • [3] A novel ensemble method for high-dimensional genomic data classification
    Espichan, Alexandra
    Villanueva, Edwin
    [J]. PROCEEDINGS 2018 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2018, : 2229 - 2236
  • [4] An Ensemble Method for High-Dimensional Multilabel Data
    Liu, Huawen
    Zheng, Zhonglong
    Zhao, Jianmin
    Ye, Ronghua
    [J]. MATHEMATICAL PROBLEMS IN ENGINEERING, 2013, 2013
  • [5] A New Ensemble Method with Feature Space Partitioning for High-Dimensional Data Classification
    Piao, Yongjun
    Piao, Minghao
    Jin, Cheng Hao
    Shon, Ho Sun
    Chung, Ji-Moon
    Hwang, Buhyun
    Ryu, Keun Ho
    [J]. MATHEMATICAL PROBLEMS IN ENGINEERING, 2015, 2015
  • [6] Adaptive Subspace Optimization Ensemble Method for High-Dimensional Imbalanced Data Classification
    Xu, Yuhong
    Yu, Zhiwen
    Chen, C. L. Philip
    Liu, Zhulin
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 34 (05) : 2284 - 2297
  • [7] Online Streaming Feature Selection for High-Dimensional and Class-Imbalanced Data Based on Neighborhood Rough Set
    Chen, Xiangyan
    Lin, Yaojin
    Wang, Chenxi
    [J]. Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence, 2019, 32 (08) : 726 - 735
  • [8] Adaptive Classifier Ensemble Method Based on Spatial Perception for High-Dimensional Data Classification
    Xu, Yuhong
    Yu, Zhiwen
    Cao, Wenming
    Chen, C. L. Philip
    You, Jane
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2021, 33 (07) : 2847 - 2862
  • [9] A Novel Classifier Ensemble Method Based on Subspace Enhancement for High-Dimensional Data Classification
    Xu, Yuhong
    Yu, Zhiwen
    Cao, Wenming
    Chen, C. L. Philip
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2023, 35 (01) : 16 - 30
  • [10] Ensemble of penalized logistic models for classification of high-dimensional data
    Ijaz, Musarrat
    Asghar, Zahid
    Gul, Asma
    [J]. COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2021, 50 (07) : 2072 - 2088