Hierarchical Feature Selection Based on Label Distribution Learning

Cited by: 34
Authors
Lin, Yaojin [1]
Liu, Haoyang [1]
Zhao, Hong [1]
Hu, Qinghua [2]
Zhu, Xingquan [3]
Wu, Xindong [4]
Affiliations
[1] Minnan Normal Univ, Sch Comp Sci, Key Lab Data Sci & Intelligence Applicat, Zhangzhou 363000, Fujian, Peoples R China
[2] Tianjin Univ, Sch Comp Sci, Tianjin 300354, Peoples R China
[3] Florida Atlantic Univ, Dept Elect Engn & Comp Sci, Boca Raton, FL 33431 USA
[4] Hefei Univ Technol, Key Lab Knowledge Engn Big Data, Minist Educ, Hefei 230009, Anhui, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Feature extraction; Task analysis; Correlation; Electronic mail; Training; Dinosaurs; Computer science; Common and label-specific features; feature selection; hierarchical classification; label distribution learning; label enhancement; CLASSIFICATION;
DOI
10.1109/TKDE.2022.3177246
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Hierarchical classification learning, which organizes data categories into a hierarchical structure, is an effective approach for large-scale classification tasks. The high dimensionality of the data feature space, represented in hierarchical class structures, is one of the main research challenges. In addition, the class hierarchy often introduces imbalanced class distributions and causes overfitting. In this paper, we propose a feature selection method based on label distribution learning to address these challenges. The crux is to alleviate the class imbalance problem and learn a discriminative feature subset for the hierarchical classification process. Because categories in the hierarchical tree structure are correlated, sibling categories can provide additional supervisory information for each learning subtask, which in turn alleviates the under-sampling of minority categories. Therefore, we transform hierarchical labels into a hierarchical label distribution to represent this correlation. A discriminative feature subset is then selected recursively, under common-feature and label-specific-feature constraints, so that downstream classification tasks can achieve the best performance. Experiments comparing the proposed method with seven well-established feature selection algorithms on six real data sets with different degrees of imbalance demonstrate its superiority.
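The idea of turning a hard hierarchical label into a distribution over sibling categories can be illustrated with a minimal sketch. This is not the paper's label enhancement procedure, which the abstract does not specify. The function name `sibling_label_distribution`, the `tree` dictionary layout, and the smoothing weight `alpha` are all hypothetical, and the uniform split of the remaining mass among siblings is an assumption made only for illustration.

```python
from typing import Dict, List


def sibling_label_distribution(
    tree: Dict[str, List[str]], label: str, alpha: float = 0.8
) -> Dict[str, float]:
    """Spread a hard label over its sibling group (illustrative assumption).

    tree  : maps each parent node to the list of its child categories.
    label : the ground-truth category of one sample.
    alpha : mass kept on the true category; the rest is shared uniformly
            among its siblings (a stand-in for a learned label enhancement).
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")

    # Locate the sibling group that contains the true label.
    for children in tree.values():
        if label in children:
            siblings = children
            break
    else:
        raise KeyError(f"{label!r} is not a child of any node in the tree")

    # A category without siblings keeps all of the probability mass.
    if len(siblings) == 1:
        return {label: 1.0}

    rest = (1.0 - alpha) / (len(siblings) - 1)
    return {c: (alpha if c == label else rest) for c in siblings}


if __name__ == "__main__":
    toy_tree = {
        "root": ["animal", "plant"],
        "animal": ["cat", "dog", "bird"],
    }
    print(sibling_label_distribution(toy_tree, "cat"))
```

In this sketch a sample labeled `cat` also contributes a small amount of supervision to `dog` and `bird`, which is one simple way sibling categories could supply extra information to an under-sampled subtask. A feature selection step would then be applied per sibling group, recursing down the tree.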
Pages: 5964-5976
Page count: 13