Exploring Hierarchical Multi-Label Text Classification Models using Attention-Based Approaches for Vietnamese language

Authors
Lam, Van [1, 2]
Quach, Khoi [1, 2]
Nguyen, Long [1, 2]
Dinh, Dien [1, 2]
Affiliations
[1] Univ Sci Ho Chi Minh City, Fac Informat Technol, Ho Chi Minh City, Vietnam
[2] Vietnam Natl Univ, Ho Chi Minh City, Vietnam
Keywords
Hierarchical Attention-based Recurrent Neural Network; Word Embedding; Vietnamese articles
DOI
10.1145/3639233.3639244
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The Hierarchical Attention-based Recurrent Neural Network (HARNN) is a system designed to categorize documents efficiently, taking into account both the content of the texts and their hierarchical category structure. The system comprises three primary components: the Document Representation Layer (DRL), which performs semantic encoding; the Hierarchical Attention-based Recurrent Layer (HARL), which models dependencies between different hierarchical levels; and the Hybrid Predicting Layer (HPL), which is responsible for accurate category predictions. In this research, we evaluate HARNN on a dataset of Vietnamese articles from VnExpress and compare the performance of four word embeddings (Word2Vec, FastText, PhoBERT, and BERT multilingual). Additionally, we introduce a domain-based approach for the HARNN model and compare its accuracy with that of the original model. Experimental findings indicate that HARNN performs effectively on Vietnamese text and that our domain-based approach can be advantageous for hierarchical multi-label text classification (HMTC) tasks in specific domains.
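As a rough illustration of the final stage described in the abstract, the sketch below blends per-level (local) category scores with a single whole-hierarchy (global) score vector. The weighted-average rule, the function name `hybrid_predict`, and the `alpha` parameter are assumptions made for illustration only; the abstract does not specify how the HPL combines its inputs.

```python
from typing import List


def hybrid_predict(
    local_scores: List[List[float]],
    global_scores: List[float],
    alpha: float = 0.5,
) -> List[float]:
    """Blend local (per-level) and global category scores.

    Hypothetical sketch: `local_scores` holds one list of scores per
    hierarchy level, and `global_scores` holds one score per category
    across all levels, in the same order. The weighted average below
    is an assumed combination rule, not one confirmed by the abstract.
    """
    # Flatten the per-level scores so they align with the global vector.
    flat_local = [score for level in local_scores for score in level]
    if len(flat_local) != len(global_scores):
        raise ValueError("local and global scores must cover the same categories")
    # alpha = 0 trusts only the local scores; alpha = 1 only the global ones.
    return [
        (1 - alpha) * local + alpha * glob
        for local, glob in zip(flat_local, global_scores)
    ]


# Two categories at level 1 and one category at level 2.
blended = hybrid_predict([[0.8, 0.2], [0.6]], [0.4, 0.2, 1.0])
```

With `alpha = 0.5`, each blended score is the plain mean of the local and global score for that category.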
Pages: 38-43
Number of pages: 6