Label-correlation-based Common and Specific Feature Selection for Hierarchical Classification

Authors
Lin Y.-J. [1, 2]
Bai S.-X. [1, 2]
Zhao H. [1, 2]
Li S.-Z. [3]
Hu Q.-H. [4]
Affiliations
[1] School of Computer Science, Minnan Normal University, Zhangzhou
[2] Key Laboratory of Data Science and Intelligent Application, Minnan Normal University, Zhangzhou
[3] Department of Artificial Intelligence, Xiamen University, Xiamen
[4] College of Intelligence and Computing, Tianjin University, Tianjin
Source
Ruan Jian Xue Bao/Journal of Software | 2022 / Vol. 33 / No. 07
Keywords
common features; feature selection; hierarchical classification; recursive regularization; specific features
DOI
10.13328/j.cnki.jos.006335
Abstract
In the era of big data, data sets have grown dramatically in the number of samples, features, and classes, and the classes usually exhibit a hierarchical structure. Selecting features for such hierarchical data is therefore of great significance. Relevant feature selection algorithms have been proposed in recent years; however, existing algorithms do not take full advantage of the hierarchical structure of the classes and ignore the common and specific features of different class nodes. This study proposes a label-correlation-based feature selection algorithm for hierarchical classification with common and specific features. The algorithm uses recursive regularization to select specific features for each internal node of the hierarchy, analyzes label correlation by making full use of the hierarchical structure, and then applies a regularization penalty to select the common features of each subtree. The proposed model can handle not only tree-structured hierarchical data but also, directly, the more complex DAG-structured hierarchical data. Experimental results on six hierarchical tree data sets and four hierarchical DAG data sets demonstrate the effectiveness of the proposed algorithm. © 2022 Chinese Academy of Sciences. All rights reserved.
Pages: 2667-2682
Number of pages: 15
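The abstract states that specific features are selected for each internal node through recursive regularization, but this record gives no formulas. The following is only a minimal sketch of that general idea, not the authors' algorithm. It assumes a squared loss, an L2,1-norm row-sparsity penalty, and a Frobenius-norm term tying a node's weights to its parent's, solved by iteratively reweighted least squares. The function name `select_node_features` and the parameters `lam`, `alpha`, and `W_parent` are hypothetical.

```python
import numpy as np

def select_node_features(X, Y, lam=1.0, alpha=0.0, W_parent=None,
                         n_iter=30, eps=1e-8):
    """Rank features for one node of a class hierarchy (illustrative only).

    Assumed objective (not taken from the paper):
        ||X W - Y||_F^2 + lam * ||W||_{2,1} + alpha * ||W - W_parent||_F^2
    X is (n_samples, n_features); Y is (n_samples, n_outputs).
    """
    d = X.shape[1]
    if W_parent is None:
        # The root has no parent, so the coupling term pulls toward zero.
        W_parent = np.zeros((d, Y.shape[1]))
    XtX = X.T @ X
    XtY = X.T @ Y
    eye = np.eye(d)
    # Ridge-style initialisation before the reweighting loop.
    W = np.linalg.solve(XtX + (lam + alpha) * eye, XtY + alpha * W_parent)
    for _ in range(n_iter):
        # Reweighting matrix for the L2,1 norm: D_ii = 1 / (2 * ||w_i||).
        row_norms = np.linalg.norm(W, axis=1)
        D = np.diag(1.0 / (2.0 * row_norms + eps))
        # Closed-form update obtained by setting the gradient to zero.
        W = np.linalg.solve(XtX + lam * D + alpha * eye,
                            XtY + alpha * W_parent)
    # Features with larger row norms are treated as more relevant.
    scores = np.linalg.norm(W, axis=1)
    return W, np.argsort(-scores)
```

Under these assumptions, the function would be called once for the root and then for each child with `W_parent` set to the parent's weight matrix, so that sibling and descendant nodes stay close to their ancestors while keeping their own sparse feature subsets. Selecting the common features of a subtree, as the abstract describes, would need an additional shared penalty that is not sketched here.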