A Study of Vietnamese Sentiment Classification with Ensemble Pre-trained Language Models

Cited: 0
Authors
Thin, Dang Van [1 ]
Hao, Duong Ngoc [1 ]
Nguyen, Ngan Luu-Thuy [1 ]
Affiliations
[1] Univ Informat Technol, Vietnam Natl Univ Ho Chi Minh City, Quarter 6, Ho Chi Minh City, Vietnam
Keywords
Sentiment analysis; aspect-based sentiment analysis; contextual language models; ensemble methods; Vietnamese language;
DOI
10.1142/S2196888823500173
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Sentiment Analysis (SA) has attracted increasing research attention in recent years. Most existing works tackle the SA task by fine-tuning a single pre-trained language model combined with task-specific layers. Despite their effectiveness, these studies overlook the feature representations offered by different contextual language models. Ensemble learning techniques have garnered increasing attention within the field of Natural Language Processing (NLP), yet there is still room for improvement in ensemble models for the SA task, particularly at the aspect level. Furthermore, heterogeneous ensembles, which combine various pre-trained transformer-based language models, may enhance overall performance by incorporating diverse linguistic representations. This paper introduces two ensemble models that leverage soft voting and feature fusion techniques by combining individual pre-trained transformer-based language models for the SA task. The latest transformer-based models, including PhoBERT, XLM, XLM-Align, InfoXLM, and viBERT_FPT, are employed to integrate knowledge and representations through feature fusion and a soft voting strategy. We conducted extensive experiments on various Vietnamese benchmark datasets covering sentence-level, document-level, and aspect-level SA. The experimental results demonstrate that our approaches outperform most existing methods, achieving new state-of-the-art results with weighted F1 scores of 94.03%, 95.65%, 75.36%, and 76.23% on the UIT_VSFC, Aivivn, UIT_ABSA (restaurant domain), and UIT_ViSFD datasets, respectively.
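For illustration, the two ensemble strategies named in the abstract, soft voting over class probabilities and feature fusion of encoder representations, can be sketched as follows. This is a minimal sketch, not the authors' released implementation: the Hugging Face checkpoint names stand in for fine-tuned weights, and the three-label scheme and the names soft_vote and FusionClassifier are illustrative assumptions; the classification heads would still need fine-tuning on the target datasets.

import torch
from transformers import AutoModel, AutoModelForSequenceClassification, AutoTokenizer

# Placeholder public checkpoints; the paper's fine-tuned weights are assumed, not shown.
# Note: PhoBERT expects word-segmented Vietnamese input (e.g. via VnCoreNLP),
# which is omitted here for brevity.
CHECKPOINTS = [
    "vinai/phobert-base",      # PhoBERT
    "xlm-roberta-base",        # XLM-R
    "microsoft/infoxlm-base",  # InfoXLM
]
NUM_LABELS = 3  # e.g. negative / neutral / positive (assumed label scheme)

def soft_vote(text: str) -> int:
    """Soft voting: average the softmax distributions of all backbones."""
    avg_probs = torch.zeros(NUM_LABELS)
    for name in CHECKPOINTS:
        tokenizer = AutoTokenizer.from_pretrained(name)
        model = AutoModelForSequenceClassification.from_pretrained(
            name, num_labels=NUM_LABELS
        )
        model.eval()
        inputs = tokenizer(text, return_tensors="pt", truncation=True)
        with torch.no_grad():
            logits = model(**inputs).logits.squeeze(0)
        avg_probs += torch.softmax(logits, dim=-1)
    avg_probs /= len(CHECKPOINTS)
    return int(avg_probs.argmax())  # index of the predicted sentiment label

class FusionClassifier(torch.nn.Module):
    """Feature fusion: concatenate each backbone's [CLS] vector, then classify."""

    def __init__(self, checkpoint_names):
        super().__init__()
        self.encoders = torch.nn.ModuleList(
            [AutoModel.from_pretrained(name) for name in checkpoint_names]
        )
        fused_dim = sum(enc.config.hidden_size for enc in self.encoders)
        self.head = torch.nn.Linear(fused_dim, NUM_LABELS)

    def forward(self, batches):
        # One tokenized batch per encoder, since each backbone has its own tokenizer.
        cls_vectors = [
            enc(**batch).last_hidden_state[:, 0]
            for enc, batch in zip(self.encoders, batches)
        ]
        return self.head(torch.cat(cls_vectors, dim=-1))

In practice one would tokenize once per backbone and batch the inputs; loading every model inside soft_vote is kept only to make the averaging step explicit.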
Pages: 137-165
Number of pages: 29
Related Papers
50 records in total (first 10 shown)
  • [1] PhoBERT: Pre-trained language models for Vietnamese
    Dat Quoc Nguyen
    Anh Tuan Nguyen
    [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020, : 1037 - 1042
  • [2] Pre-Trained Language Model-Based Deep Learning for Sentiment Classification of Vietnamese Feedback
    Loc, Cu Vinh
    Viet, Truong Xuan
    Viet, Tran Hoang
    Thao, Le Hoang
    Viet, Nguyen Hoang
    [J]. INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE AND APPLICATIONS, 2023, 22 (03)
  • [3] Neural Transfer Learning For Vietnamese Sentiment Analysis Using Pre-trained Contextual Language Models
    An Pha Le
    Tran Vu Pham
    Thanh-Van Le
Huynh, Duy V.
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLIED NETWORK TECHNOLOGIES (ICMLANT II), 2021, : 84 - 88
  • [4] ViHealthBERT: Pre-trained Language Models for Vietnamese in Health Text Mining
    Minh Phuc Nguyen
    Vu Hoang Tran
    Vu Hoang
    Ta Duc Huy
    Bui, Trung H.
    Truong, Steven Q. H.
[J]. LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 328 - 337
  • [5] Enhancing Turkish Sentiment Analysis Using Pre-Trained Language Models
    Koksal, Omer
    [J]. 29TH IEEE CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS (SIU 2021), 2021,
  • [6] A Comparative Study of Using Pre-trained Language Models for Toxic Comment Classification
    Zhao, Zhixue
    Zhang, Ziqi
    Hopfgartner, Frank
    [J]. WEB CONFERENCE 2021: COMPANION OF THE WORLD WIDE WEB CONFERENCE (WWW 2021), 2021, : 500 - 507
  • [7] Pre-trained Language Models with Limited Data for Intent Classification
    Kasthuriarachchy, Buddhika
    Chetty, Madhu
    Karmakar, Gour
    Walls, Darren
    [J]. 2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020,
  • [8] Focused Contrastive Loss for Classification With Pre-Trained Language Models
    He, Jiayuan
    Li, Yuan
    Zhai, Zenan
    Fang, Biaoyan
    Thorne, Camilo
    Druckenbrodt, Christian
    Akhondi, Saber
    Verspoor, Karin
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2024, 36 (07) : 3047 - 3061
  • [9] Issue Report Classification Using Pre-trained Language Models
    Colavito, Giuseppe
    Lanubile, Filippo
    Novielli, Nicole
    [J]. 2022 IEEE/ACM 1ST INTERNATIONAL WORKSHOP ON NATURAL LANGUAGE-BASED SOFTWARE ENGINEERING (NLBSE 2022), 2022, : 29 - 32
  • [10] Error Investigation of Pre-trained BERTology Models on Vietnamese Natural Language Inference
    Tin Van Huynh
    Huy Quoc To
    Kiet Van Nguyen
    Ngan Luu-Thuy Nguyen
    [J]. RECENT CHALLENGES IN INTELLIGENT INFORMATION AND DATABASE SYSTEMS, ACIIDS 2022, 2022, 1716 : 176 - 188