Feature Selection Algorithm for Hierarchical Text Classification Using Kullback-Leibler Divergence

Cited by: 0
Authors
Yao Lifang [1 ]
Qin Sijun [2 ]
Zhu Huan [2 ]
Affiliations
[1] CUEB, Sch Stat, Beijing, Peoples R China
[2] CUC, New Media Inst, Beijing, Peoples R China
Keywords
hierarchical text classification; KL divergence; text classification; hierarchical feature selection; category correlation;
DOI
not available
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Text classification, a simple and effective method, is considered a key technology for processing and organizing large amounts of text data. Flat text classification can no longer keep pace with growing user demands, so hierarchical text classification has received extensive attention and has broad application prospects. Hierarchical feature selection is the core of automatic hierarchical text classification, yet most existing methods select features independently for each class in the hierarchy and ignore the correlation between parent and child classes. This paper proposes a feature selection method based on KL divergence: it measures the correlation between a class and its subclasses by KL divergence, computes the correlation between each feature and each subclass by mutual information, and weighs the importance of subclass features by term-frequency probability, in order to select a more discriminative feature set for the parent class node. We applied this hierarchical feature selection method with SVM classifiers to the hierarchical text categorization task on two corpora. Experiments showed that the proposed algorithm was effective compared with the χ² statistic (CHI), information gain (IG), and mutual information (MI) applied directly to hierarchical feature selection.
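The abstract outlines a three-part scoring scheme (KL divergence between parent and subclass, mutual information per feature, term-frequency weighting). A minimal sketch of the KL-divergence component is shown below; the function names, the pooled-parent distribution, and the particular way the KL weight is combined with term probabilities are illustrative assumptions, not the authors' exact formulation.

```python
# Illustrative sketch (assumptions, not the paper's exact method): score
# candidate features at a parent node by weighting each subclass's term
# probabilities with that subclass's KL divergence from the pooled parent
# term distribution.
import math
from collections import Counter

def term_distribution(docs):
    """Term-frequency probability distribution over tokenized documents."""
    counts = Counter(tok for doc in docs for tok in doc)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}

def kl_divergence(p, q, eps=1e-12):
    """D_KL(P || Q); eps smoothing avoids log(0) for terms missing from Q."""
    return sum(pi * math.log((pi + eps) / (q.get(t, 0.0) + eps))
               for t, pi in p.items())

def parent_feature_scores(subclass_docs):
    """Hypothetical combination: each subclass contributes its term
    probabilities, weighted by how far it diverges from the parent."""
    parent = term_distribution(
        [d for docs in subclass_docs.values() for d in docs])
    scores = {}
    for sub, docs in subclass_docs.items():
        dist = term_distribution(docs)
        weight = kl_divergence(dist, parent)
        for t, pt in dist.items():
            scores[t] = scores.get(t, 0.0) + weight * pt
    return scores
```

In the paper's setup, the top-scoring features at each parent node would then feed an SVM classifier for that level of the hierarchy.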
Pages: 421-424
Page count: 4