Experiments with hierarchical text classification

Cited by: 0
Authors
Granitzer, M [1]
Auer, P [1]
Affiliation
[1] Know Ctr, Div Knowledge Discovery, A-8010 Graz, Austria
Keywords
machine learning; supervised learning; hierarchical text classification; boosting; ranking performance
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper applies Boosting to hierarchical text classification, where the hierarchical structure is given as a directed acyclic graph, and compares the results to Support Vector Machines. Hierarchical classification is performed top-down, and in each node a flat classifier decides whether a document should be propagated further. BoosTexter, CentroidBooster and Support Vector Machines are used as flat classifiers, where CentroidBooster is an AdaBoost.MH-based alternative similar to BoosTexter. Experiments on the Reuters Corpus Volume 1 and the OHSUMED data set show that the F1-measure increases if the hierarchical structure of a data set is taken into account. Regarding time complexity, we show that learning and classification time can be reduced, depending on the structure of the hierarchy. Besides these hard classification approaches, we also investigate the ranking performance of hierarchical classifiers. Ranking, which can be achieved by providing a meaningful score for each classification decision, is important in most practical settings. We investigate an approach that uses a sigmoid function to calculate a meaningful score, where parameter estimation is based on error bounds from computational learning theory.
Pages: 177-182
Page count: 6
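
The top-down scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy hierarchy, the keyword-based node classifiers, the function names, and the fixed sigmoid parameters `a` and `b` are all assumptions made for illustration. In particular, the paper estimates the sigmoid parameters from error bounds, whereas the sketch simply uses placeholder values.

```python
import math
from typing import Callable, Dict, List, Set

# Hypothetical hierarchy: each node maps to its child categories.
# A child may appear under several parents (directed acyclic graph).
Hierarchy = Dict[str, List[str]]
# Hypothetical flat classifier per node: returns a real-valued margin.
NodeClassifier = Callable[[str], float]


def sigmoid_score(margin: float, a: float = 1.0, b: float = 0.0) -> float:
    """Map a raw margin to (0, 1). a and b are placeholder parameters."""
    return 1.0 / (1.0 + math.exp(-(a * margin + b)))


def classify_top_down(
    doc: str,
    hierarchy: Hierarchy,
    classifiers: Dict[str, NodeClassifier],
    root: str = "root",
    threshold: float = 0.5,
) -> Dict[str, float]:
    """Propagate doc from the root; descend only below accepting nodes."""
    accepted: Dict[str, float] = {}
    visited: Set[str] = set()
    frontier = [root]
    while frontier:
        node = frontier.pop()
        for child in hierarchy.get(node, []):
            if child in visited:  # a DAG node can be reached via several parents
                continue
            visited.add(child)
            score = sigmoid_score(classifiers[child](doc))
            if score >= threshold:
                accepted[child] = score
                frontier.append(child)  # propagate the document further down
    return accepted


def keyword_classifier(word: str) -> NodeClassifier:
    """Toy stand-in for a trained flat classifier (assumption)."""
    return lambda doc: 2.0 if word in doc else -2.0


hierarchy: Hierarchy = {
    "root": ["sports", "finance"],
    "sports": ["football"],
    "finance": ["markets"],
}
classifiers = {
    "sports": keyword_classifier("match"),
    "football": keyword_classifier("goal"),
    "finance": keyword_classifier("stock"),
    "markets": keyword_classifier("goal"),
}

result = classify_top_down("late goal wins the football match", hierarchy, classifiers)
# Categories sorted by score give a simple ranking of the accepted nodes.
ranking = sorted(result.items(), key=lambda item: item[1], reverse=True)
```

In this toy run the "markets" classifier would fire on the document, but it is never evaluated because its parent "finance" rejects the document. Pruning whole subtrees in this way is one reason hierarchy structure can affect classification time.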