Hierarchical Feature Selection Based on Label Distribution Learning

Cited by: 34
Authors
Lin, Yaojin [1]
Liu, Haoyang [1]
Zhao, Hong [1]
Hu, Qinghua [2]
Zhu, Xingquan [3]
Wu, Xindong [4]
Affiliations
[1] Minnan Normal Univ, Sch Comp Sci, Key Lab Data Sci & Intelligence Applicat, Zhangzhou 363000, Fujian, Peoples R China
[2] Tianjin Univ, Sch Comp Sci, Tianjin 300354, Peoples R China
[3] Florida Atlantic Univ, Dept Elect Engn & Comp Sci, Boca Raton, FL 33431 USA
[4] Hefei Univ Technol, Key Lab Knowledge Engn Big Data, Minist Educ, Hefei 230009, Anhui, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Feature extraction; Task analysis; Correlation; Electronic mail; Training; Dinosaurs; Computer science; Common and label-specific features; feature selection; hierarchical classification; label distribution learning; label enhancement; CLASSIFICATION;
DOI
10.1109/TKDE.2022.3177246
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Hierarchical classification learning, which organizes data categories into a hierarchical structure, is an effective approach for large-scale classification tasks. The high dimensionality of the feature space for data organized in hierarchical class structures is one of the main research challenges. In addition, the class hierarchy often introduces imbalanced class distributions and causes overfitting. In this paper, we propose a feature selection method based on label distribution learning to address these challenges. The crux is to alleviate the class imbalance problem and learn a discriminative feature subset for the hierarchical classification process. Because categories in the hierarchical tree structure are correlated, sibling categories can provide additional supervisory information for each learning subtask, which in turn alleviates the under-sampling of minority categories. Therefore, we transform hierarchical labels into a hierarchical label distribution to represent this correlation. A discriminative feature subset is then selected recursively, under common-feature and label-specific-feature constraints, to ensure that downstream classification tasks can achieve the best performance. Experiments on six real data sets with different degrees of imbalance, with comparisons against seven well-established feature selection algorithms, demonstrate the superiority of the proposed method.
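The idea of turning a single hierarchical label into a distribution over sibling categories can be illustrated with a minimal Python sketch. It is not the label enhancement formula from the paper: the function name `sibling_label_distribution`, the `sibling_mass` parameter, the uniform split among siblings, and the toy tree are all assumptions made for illustration only.

```python
from typing import Dict, List


def sibling_label_distribution(
    tree: Dict[str, List[str]],
    true_label: str,
    sibling_mass: float = 0.2,  # hypothetical share of mass given to siblings
) -> Dict[str, float]:
    """Spread part of the label mass from `true_label` onto its siblings.

    `tree` maps each parent node to the list of its children. The true
    label keeps 1 - sibling_mass; the rest is split uniformly among the
    nodes that share its parent (an assumption, not the paper's rule).
    """
    if not 0.0 <= sibling_mass < 1.0:
        raise ValueError("sibling_mass must be in [0, 1)")

    # Locate the parent of the true label.
    parent = next((p for p, kids in tree.items() if true_label in kids), None)
    if parent is None:
        raise ValueError(f"{true_label!r} is not a child of any node")

    siblings = [c for c in tree[parent] if c != true_label]
    if not siblings:
        # No siblings: the label stays a one-hot (degenerate) distribution.
        return {true_label: 1.0}

    dist = {true_label: 1.0 - sibling_mass}
    share = sibling_mass / len(siblings)
    for s in siblings:
        dist[s] = share
    return dist


# Toy hierarchy (hypothetical): two internal nodes under a root.
tree = {
    "root": ["animal", "plant"],
    "animal": ["cat", "dog", "bird"],
    "plant": ["tree"],
}
dist = sibling_label_distribution(tree, "cat")
only_child = sibling_label_distribution(tree, "tree")
```

In this sketch a sample labeled `cat` also contributes a small amount of supervision to `dog` and `bird`, which is the intuition the abstract gives for easing the under-sampling of minority categories.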
Pages: 5964-5976
Number of pages: 13