Exploring Hierarchical Multi-Label Text Classification Models using Attention-Based Approaches for Vietnamese language

Authors
Lam, Van [1 ,2 ]
Quach, Khoi [1 ,2 ]
Nguyen, Long [1 ,2 ]
Dinh, Dien [1 ,2 ]
Affiliations
[1] Univ Sci Ho Chi Minh City, Fac Informat Technol, Ho Chi Minh City, Vietnam
[2] Vietnam Natl Univ, Ho Chi Minh City, Vietnam
Keywords
Hierarchical Attention-based Recurrent Neural Network; Word Embedding; Vietnamese articles;
DOI
10.1145/3639233.3639244
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The Hierarchical Attention-based Recurrent Neural Network (HARNN) is a system designed to categorize documents efficiently by taking into account both the content of the texts and their hierarchical category structure. It comprises three primary components: the Document Representation Layer (DRL), which performs semantic encoding; the Hierarchical Attention-based Recurrent Layer (HARL), which models dependencies between different hierarchical levels; and the Hybrid Predicting Layer (HPL), which is responsible for accurate category predictions. In this research, we evaluate HARNN on a dataset of Vietnamese articles from VnExpress and compare the performance of four word embeddings (Word2Vec, FastText, PhoBERT, and multilingual BERT). Additionally, we introduce a domain-based approach for the HARNN model and compare its accuracy with that of the original approach. Experimental findings indicate that HARNN performs effectively on Vietnamese text and that our domain-based approach can be advantageous for hierarchical multi-label text classification (HMTC) tasks in specific domains.
Pages: 38-43
Page count: 6