Incremental feature selection approach to multi-dimensional variation based on matrix dominance conditional entropy for ordered data set

Cited by: 1
Authors
Xu, Weihua [1]
Yang, Yifei [1]
Ding, Yi [1]
Chen, Xiyang [2]
Lv, Xiaofang [3]
Affiliations
[1] Southwest Univ, Coll Artificial Intelligence, Chongqing 400715, Peoples R China
[2] Xian Univ Sci & Technol, Coll Comp Sci & Technol, Xian 710600, Peoples R China
[3] Southwest Univ, Coll Life Sci, Chongqing 400715, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Conditional entropy; Dominance matrix; Feature selection; Ordered data set; Rough set; ATTRIBUTE REDUCTION; DYNAMIC DATA; LEARNING ALGORITHM; ROUGH SETS;
DOI
10.1007/s10489-024-05411-3
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Rough set theory is a mathematical tool widely used across many fields to handle uncertainty. Feature selection, an essential and independent research area within rough set theory, aims to identify a small subset of important features by eliminating irrelevant, redundant, or noisy ones. In practice, data characteristics change constantly over time and with other factors, producing ordered data sets whose features vary. Existing feature extraction methods are poorly suited to such data sets: they ignore previous reduction results when features change, so the reduct must be recomputed from scratch at significant time cost. Incremental attribute reduction algorithms address this issue by reusing prior reduction results, which effectively reduces computation time. Motivated by this approach, this paper investigates incremental feature selection algorithms for ordered data sets with changing features. First, we discuss the dominant matrix and the dominance conditional entropy, and introduce update principles for the new dominant matrix and dominance diagonal matrix when features change. We then propose two incremental feature selection algorithms for adding (IFS-A) or deleting (IFS-D) features in ordered data sets. Nine UCI datasets are used to evaluate the performance of the proposed algorithms. The experimental results show that the average classification accuracy of IFS-A and IFS-D under four classifiers on twelve datasets is 82.05% and 80.75%, an increase of 5.48% and 3.68%, respectively, compared with the original data.
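The abstract does not state its formulas, so the following is only a minimal sketch of the general idea, not the authors' IFS-A or IFS-D algorithm. It assumes that larger feature values are preferred, that the dominance relation matrix of a feature set is the element-wise AND of the per-feature matrices, and that the conditional entropy takes one commonly used form, -(1/|U|) Σ log2(|[x]_B ∩ [x]_D| / |[x]_B|) over dominating sets; the paper's own definitions may differ. The function names and the toy data are hypothetical.

```python
import math

def dominance_matrix(column):
    """Row i marks every object j whose value is >= that of object i.
    Assumption: larger values are preferred on every feature."""
    return [[1 if vj >= vi else 0 for vj in column] for vi in column]

def combine(m1, m2):
    """Element-wise AND of two dominance matrices, i.e. dominance under
    the union of the two underlying feature sets."""
    return [[a & b for a, b in zip(r1, r2)] for r1, r2 in zip(m1, m2)]

def dominance_conditional_entropy(m_b, m_d):
    """Assumed form: -(1/n) * sum_i log2(|B_i & D_i| / |B_i|), where B_i and
    D_i are the dominating sets of object i under the condition features
    and the decision. Both sizes are >= 1 because each object dominates itself."""
    n = len(m_b)
    total = 0.0
    for row_b, row_d in zip(m_b, m_d):
        size_b = sum(row_b)
        size_bd = sum(b & d for b, d in zip(row_b, row_d))
        total -= math.log2(size_bd / size_b)
    return total / n

# Hypothetical toy data: four objects, two condition features, one decision.
a1 = [1, 2, 2, 3]
a2 = [1, 2, 1, 2]
d = [1, 2, 1, 2]

m_a1 = dominance_matrix(a1)
m_a2 = dominance_matrix(a2)
m_d = dominance_matrix(d)

entropy_before = dominance_conditional_entropy(m_a1, m_d)

# Adding feature a2: reuse the stored matrix for {a1} and AND in one new
# matrix instead of rebuilding the relation from all features.
m_a1_a2 = combine(m_a1, m_a2)
entropy_after = dominance_conditional_entropy(m_a1_a2, m_d)
```

In this sketch, adding a feature costs one matrix construction and one element-wise AND. Element-wise AND cannot be undone, so handling feature deletion this way would require keeping the per-feature matrices and recombining the remaining ones; that is a design assumption of the sketch, not a description of the paper's update principles.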
Pages: 4890-4910
Page count: 21
Related papers
50 in total
  • [1] Matrix representation of the conditional entropy for incremental feature selection on multi-source data
    Huang, Yanyong
    Guo, Kejun
    Yi, Xiuwen
    Li, Zhong
    Li, Tianrui
    INFORMATION SCIENCES, 2022, 591 : 263 - 286
  • [2] Matrix-based feature selection approach using conditional entropy for ordered data set with time-evolving features
    Xu, Weihua
    Yang, Yifei
    KNOWLEDGE-BASED SYSTEMS, 2023, 279
  • [3] Incremental Feature Selection Using a Conditional Entropy Based on Fuzzy Dominance Neighborhood Rough Sets
    Sang, Binbin
    Chen, Hongmei
    Yang, Lei
    Li, Tianrui
    Xu, Weihua
    IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2022, 30 (06) : 1683 - 1697
  • [4] Incremental neighborhood entropy-based feature selection for mixed-type data under the variation of feature set
    Shu, Wenhao
    Qian, Wenbin
    Xie, Yonghong
    APPLIED INTELLIGENCE, 2022, 52 (05) : 4792 - 4806
  • [5] Incremental neighborhood entropy-based feature selection for mixed-type data under the variation of feature set
    Wenhao Shu
    Wenbin Qian
    Yonghong Xie
    Applied Intelligence, 2022, 52 : 4792 - 4806
  • [6] Feature selection of dominance-based neighborhood rough set approach for processing hybrid ordered data
    Chen, Jiayue
    Zhu, Ping
    INTERNATIONAL JOURNAL OF APPROXIMATE REASONING, 2024, 167
  • [7] An incremental feature selection approach based on information entropy for incomplete data
    Luo, Chuan
    Li, Tianrui
    Yi, Zhang
    IEEE 17TH INT CONF ON DEPENDABLE, AUTONOM AND SECURE COMP / IEEE 17TH INT CONF ON PERVAS INTELLIGENCE AND COMP / IEEE 5TH INT CONF ON CLOUD AND BIG DATA COMP / IEEE 4TH CYBER SCIENCE AND TECHNOLOGY CONGRESS (DASC/PICOM/CBDCOM/CYBERSCITECH), 2019, : 483 - 488
  • [8] A Novel Online Multi-label Feature Selection Approach for Multi-dimensional Streaming Data
    Zhang, Zhanyun
    Luo, Chuan
    Li, Tianrui
    Chen, Hongmei
    Liu, Dun
    ARTIFICIAL INTELLIGENCE, CICAI 2023, PT II, 2024, 14474 : 159 - 171
  • [9] Efficient updating rough approximations with multi-dimensional variation of ordered data
    Wang, Shu
    Li, Tianrui
    Luo, Chuan
    Fujita, Hamido
    INFORMATION SCIENCES, 2016, 372 : 690 - 708
  • [10] Feature selection for dynamic interval-valued ordered data based on fuzzy dominance neighborhood rough set
    Sang, Binbin
    Chen, Hongmei
    Yang, Lei
    Li, Tianrui
    Xu, Weihua
    Luo, Chuan
    KNOWLEDGE-BASED SYSTEMS, 2021, 227