External methods to address limitations of using global information on the narrow-down approach for hierarchical text classification

Cited by: 0
Authors
Oh, Heung-Seon [1]
Jung, Yuchul [1]
Affiliations
[1] Korea Inst Sci & Technol Informat, Taejon 305806, South Korea
Keywords
Hierarchical text classification; language models; web taxonomy
DOI
10.1177/0165551514544626
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Classifying documents into a large-scale web taxonomy is a challenging research problem because of the large number of categories and associated documents in the taxonomy. The state-of-the-art solution, known as the narrow-down approach, uses a search engine to reduce the entire category hierarchy to the most relevant categories and then selects the best one among them using a classifier. In a recent language modelling approach, top-level category information (or global information) was used in judging the appropriateness of a local category, which led to performance improvements. However, we observe that global information has only a limited influence on the final category selection under some conditions. First, global information may be inaccurate even though it is generated by a top-level category classifier using the entire hierarchy. Second, it has little influence when two competing categories share the same top-level category or when the local category information has too strong an influence on the final category selection. To resolve these limitations, we propose two external methods: (1) a meta-classifier with novel dependency features among top-level categories, based on an ensemble learning framework; and (2) a query modification model, based on a statistical feedback method, that improves the query document representation instead of just juggling information in the hierarchy. Our methods were evaluated using the Open Directory Project test collection.
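The second limitation in the abstract can be illustrated with a small sketch. It assumes, purely for illustration, that the final score of a candidate category is a linear interpolation of a local log-score and the log-score of its top-level category; the record does not give the authors' actual formula, and the names `combined_score`, `select_category`, `lam`, the category paths, and all numbers are hypothetical.

```python
import math

def top_level(path):
    """Return the top-level category of a path such as 'Science/Biology'."""
    return path.split("/")[0]

def combined_score(local_logp, global_logp, lam=0.5):
    """Interpolate local and global log-scores (assumed form, not the paper's)."""
    return (1 - lam) * local_logp + lam * global_logp

def select_category(candidates, global_scores, lam=0.5):
    """Pick the best candidate from a narrowed-down set of category paths.

    candidates: dict mapping category path -> local log-score
    global_scores: dict mapping top-level category -> global log-score
    """
    return max(
        candidates,
        key=lambda c: combined_score(candidates[c], global_scores[top_level(c)], lam),
    )

def margin(a, b, candidates, global_scores, lam=0.5):
    """Score difference between two candidates under the interpolation."""
    sa = combined_score(candidates[a], global_scores[top_level(a)], lam)
    sb = combined_score(candidates[b], global_scores[top_level(b)], lam)
    return sa - sb

# Hypothetical candidates returned by the search step, with local log-scores.
candidates = {"Science/Biology": -10.0, "Science/Chemistry": -10.5, "Arts/Music": -9.0}
# Hypothetical top-level classifier output.
global_scores = {"Science": -1.0, "Arts": -5.0}

# Global information changes the decision across top-level categories.
best_with_global = select_category(candidates, global_scores, lam=0.5)
best_local_only = select_category(candidates, global_scores, lam=0.0)

# Between siblings under the same top-level category the global term cancels,
# so the margin depends only on the local scores.
m1 = margin("Science/Biology", "Science/Chemistry", candidates, global_scores)
m2 = margin("Science/Biology", "Science/Chemistry", candidates,
            {"Science": -50.0, "Arts": -5.0})
```

Under this assumed scoring form, the global term is identical for any two candidates that share a top-level category, so it cannot affect which of them wins; that is one way to read the motivation for methods that act outside the hierarchy.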
Pages: 688-708
Page count: 21
Related papers
11 items in total
  • [1] Novel top-down methods for Hierarchical Text Classification
    Cao Ying
    Duan Run-ying
    [J]. INTERNATIONAL CONFERENCE ON ADVANCES IN ENGINEERING 2011, 2011, 24 : 329 - 334
  • [2] Utilizing global and path information with language modelling for hierarchical text classification
    Oh, Heung-Seon
    Myaeng, Sung-Hyon
    [J]. JOURNAL OF INFORMATION SCIENCE, 2014, 40 (02) : 127 - 145
  • [3] HTCInfoMax: A Global Model for Hierarchical Text Classification via Information Maximization
    Deng, Zhongfen
    Peng, Hao
    He, Dongxiao
    Li, Jianxin
    Yu, Philip S.
    [J]. 2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021, : 3259 - 3265
  • [4] Affective - Hierarchical Classification of Text - An Approach Using NLP Toolkit
    Aathithyan, S. Seshathri
    Sriram, M. V.
    Prasanna, S.
    Venkatesan, R.
    [J]. PROCEEDINGS OF IEEE INTERNATIONAL CONFERENCE ON CIRCUIT, POWER AND COMPUTING TECHNOLOGIES (ICCPCT 2016), 2016,
  • [5] HScodeNet: Combining Hierarchical Sequential and Global Spatial Information of Text for Commodity HS Code Classification
    Du, Shaohua
    Wu, Zhihao
    Wan, Huaiyu
    Lin, YouFang
    [J]. ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2021, PT II, 2021, 12713 : 676 - 689
  • [6] Classification Methods of Text Documents Using Ontology Based Approach
    Lytvyn, Vasyl
    Vysotska, Victoria
    Veres, Oleh
    Rishnyak, Ihor
    Rishnyak, Halya
    [J]. ADVANCES IN INTELLIGENT SYSTEMS AND COMPUTING, CSIT 2016, 2017, 512 : 229 - 240
  • [7] Information extraction and classification from free text using a neural approach
    Gallo, Ignazio
    Binaghi, Elisabetta
    [J]. PROGRESS IN PATTERN RECOGNITION, IMAGE ANALYSIS AND APPLICATIONS, PROCEEDINGS, 2007, 4756 : 921 - 929
  • [8] New Top-Down Methods Using SVMs for Hierarchical Multilabel Classification Problems
    Cerri, Ricardo
    de Carvalho, Andre Carlos P. L. F.
    [J]. 2010 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS IJCNN 2010, 2010,
  • [9] An Improvement of Flat Approach on Hierarchical Text Classification Using Top-Level Pruning Classifiers
    Phachongkitphiphat, Natchanon
    Vateekul, Peerapon
    [J]. 2014 11TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER SCIENCE AND SOFTWARE ENGINEERING (JCSSE), 2014, : 86 - 90
  • [10] A Medical Case-Based Reasoning Approach Using Image Classification and Text Information for Recommendation
    Nasiri, Sara
    Zenkert, Johannes
    Fathi, Madjid
    [J]. ADVANCES IN COMPUTATIONAL INTELLIGENCE, PT II, 2015, 9095 : 43 - 55