Pre-Trained Language Model-Based Deep Learning for Sentiment Classification of Vietnamese Feedback

Cited: 0
Authors
Loc, Cu Vinh [1 ]
Viet, Truong Xuan [1 ]
Viet, Tran Hoang [1 ]
Thao, Le Hoang [1 ]
Viet, Nguyen Hoang [1 ]
Affiliations
[1] Can Tho Univ, Software Ctr, Can Tho city, Vietnam
Keywords
Sentiment analysis; PhoBERT; deep learning; text classification; Vietnamese feedback;
DOI
10.1142/S1469026823500165
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In recent years, with the rapid development of the Internet, the need to consult the feedback of previous customers when shopping online has been increasing. Websites therefore allow users to share experiences, reviews, comments, and feedback about the services and products of businesses and organizations. Organizations also collect user feedback about their products and services to set better directions. However, given the large volume of user feedback about certain services and products, it is difficult for users, businesses, and organizations to read it all. Thus, an automatic system is needed to analyze the sentiment of customer feedback. Recently, the well-known pre-trained language model for Vietnamese, PhoBERT, achieved high performance compared with other approaches. However, it may not focus on local information in the text, such as phrases or fragments. In this paper, we propose a Convolutional Neural Network (CNN) model based on PhoBERT for sentiment classification. The contextualized embeddings from PhoBERT's last four layers are fed into the CNN, enabling the network to capture more local information from the sentiment text. In addition, the PhoBERT output is passed through transformer encoder layers to employ the self-attention technique, which makes the model focus more on the important parts of the sentiment segments. The experimental results demonstrate that the proposed approach achieves competitive performance compared with existing studies on three public datasets of Vietnamese opinions.
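The CNN branch of the described architecture — contextual embeddings from PhoBERT's last four layers, concatenated and passed through a 1D convolution with global max-pooling and a softmax classifier — can be sketched in a minimal, framework-free form. All shapes, filter counts, and layer sizes below are illustrative assumptions (the record contains no code), and random arrays stand in for PhoBERT's actual embeddings; the self-attention encoder branch is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

seq_len, hidden = 16, 768  # 768 is PhoBERT-base's hidden size; seq_len is arbitrary here
# Stand-ins for the contextualized embeddings of PhoBERT's last four layers
last_four = [rng.standard_normal((seq_len, hidden)) for _ in range(4)]
x = np.concatenate(last_four, axis=-1)   # (seq_len, 4*hidden)

def conv1d_relu(x, kernel, width=3):
    """Valid 1D convolution over the token axis, followed by ReLU."""
    out = np.stack([x[i:i + width].reshape(-1) @ kernel
                    for i in range(x.shape[0] - width + 1)])
    return np.maximum(out, 0.0)

n_filters = 128                           # assumed filter count
kernel = rng.standard_normal((3 * x.shape[1], n_filters)) * 0.01
feat = conv1d_relu(x, kernel)             # (seq_len - 2, n_filters)
pooled = feat.max(axis=0)                 # global max-pooling over tokens

n_classes = 3                             # e.g. negative / neutral / positive
w_out = rng.standard_normal((n_filters, n_classes)) * 0.01
logits = pooled @ w_out
probs = np.exp(logits - logits.max())
probs /= probs.sum()                      # class probabilities for the feedback
```

The max-pooling step is what lets the fixed-size classifier pick up the strongest local (phrase-level) responses regardless of where they occur in the feedback, which is the motivation the abstract gives for adding a CNN on top of PhoBERT.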
Pages: 14