Information-theoretic partially labeled heterogeneous feature selection based on neighborhood rough sets

Cited by: 14
Authors
Zhang, Hongying [1]
Sun, Qianqian [1]
Dong, Kezhen [1]
Affiliation
[1] Xi'an Jiaotong University, School of Mathematics and Statistics, Xi'an 710049, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Feature selection; Monotonic entropy; Partially labeled heterogeneous data; Attribute reduction
DOI
10.1016/j.ijar.2022.12.010
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With the rapid growth of large-scale, real-world datasets, it has become critical to address the problem of partially labeled heterogeneous feature selection (i.e., feature selection on data whose samples have both numerical and categorical features and some of whose samples have no labels). Existing solutions typically rely on linear correlations between features. In this paper, three different monotonic uncertainty measures are defined on equivalence classes and neighborhood classes to study partially labeled heterogeneous feature selection by exploring nonlinear correlations. First, consistent entropy and monotonic neighborhood entropy, based on classical rough set theory and neighborhood rough set theory respectively, are proposed to construct a uniform measure for feature selection in heterogeneous datasets. Furthermore, a maximal neighborhood entropy strategy is developed by considering the inconsistency of neighborhood classes described by the features and partial labels. Finally, two feature selection algorithms are presented based on the three novel monotonic uncertainty measures. The comparative experiments demonstrate the effectiveness and superiority of the newly proposed feature selection measures. © 2022 Elsevier Inc. All rights reserved.
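The abstract does not give the definitions of its three measures, so the following is only a rough sketch of the general idea of a neighborhood class on heterogeneous data and an entropy computed from it. It assumes a common textbook-style formulation (numerical features match within a radius `delta`, categorical features match on equality, and entropy is the mean of `-log2(|neighborhood| / n)`); the function names, the `delta` parameter, and the toy data are hypothetical and are not the authors' consistent entropy, monotonic neighborhood entropy, or maximal neighborhood entropy.

```python
import math


def neighborhood(data, i, features, numeric, delta):
    """Indices of samples that fall in the neighborhood of sample i.

    Assumption (not from the paper): a sample j is a neighbor of i when,
    on every selected feature, numeric values differ by at most `delta`
    and categorical values are equal.
    """
    members = []
    for j, row in enumerate(data):
        close = True
        for f in features:
            if f in numeric:
                if abs(row[f] - data[i][f]) > delta:
                    close = False
                    break
            elif row[f] != data[i][f]:
                close = False
                break
        if close:
            members.append(j)
    return members


def neighborhood_entropy(data, features, numeric, delta):
    """Mean of -log2(|neighborhood(x)| / n) over all samples (illustrative)."""
    n = len(data)
    total = 0.0
    for i in range(n):
        size = len(neighborhood(data, i, features, numeric, delta))
        total += math.log2(size / n)
    return -total / n


# Hypothetical toy data: column 0 is numerical, column 1 is categorical.
data = [(1.0, "a"), (1.5, "a"), (9.0, "b"), (2.0, "b")]
numeric = {0}

h_one = neighborhood_entropy(data, [0], numeric, delta=1.0)
h_two = neighborhood_entropy(data, [0, 1], numeric, delta=1.0)
print(h_one, h_two)
```

Under these assumptions, adding a feature can only shrink each neighborhood, so the entropy never decreases as the feature subset grows. That is the kind of monotonicity a greedy forward-selection loop can exploit, although the paper's own measures may achieve it differently.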
Pages: 200-217
Page count: 18