TwitterBERT: Framework for Twitter Sentiment Analysis Based on Pre-trained Language Model Representations

Cited by: 22
Authors
Azzouza, Noureddine [1 ]
Akli-Astouati, Karima [1 ]
Ibrahim, Roliana [2 ]
Affiliations
[1] Univ Sci & Technol Houari Boumediene, FEI Dept Comp Sci, RIIMA Lab, Algiers, Algeria
[2] Univ Teknol Malaysia UTM, Fac Engn, Sch Comp, Johor Baharu 81310, Johor, Malaysia
Keywords
Twitter Sentiment Analysis; Word embedding; CNN; LSTM; BERT;
DOI
10.1007/978-3-030-33582-3_41
CLC classification
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Sentiment analysis is a long-standing topic in language understanding, yet the neural networks deployed for it remain limited in some respects. Most current studies identify sentiment by focusing on vocabulary and syntax. The task is well established in Natural Language Processing (NLP), where Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN) have been employed to obtain notable results. In this study, we propose a four-phase framework for Twitter Sentiment Analysis. The setup uses the pre-trained Bidirectional Encoder Representations from Transformers (BERT) model as an encoder for generating sentence representations. To exploit this model more effectively, we deploy various classification models on top of it. Additionally, we concatenate pre-trained word-embedding representations with the BERT representation to enhance sentiment classification. Experimental results show improvements over the baseline framework on all datasets; for example, our best model attains an F1-score of 71.82% on the SemEval 2017 dataset. A comparative analysis of the experimental results offers recommendations on choosing pre-training steps to obtain improved results. These outcomes confirm the effectiveness of our system.
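The abstract's core idea of concatenating pre-trained word embeddings with a BERT sentence representation before classification can be sketched as follows. This is a minimal illustration, not the paper's actual architecture: the encoder functions below are hypothetical stand-ins (random vectors in place of real BERT and word2vec/GloVe outputs), and the 768/300 dimensions and 3-class softmax head are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def bert_sentence_vector(tokens):
    """Stand-in for a BERT encoder: returns a 768-d sentence vector."""
    return rng.standard_normal(768)

def avg_word_embedding(tokens, dim=300):
    """Stand-in for pre-trained word embeddings, averaged over tokens."""
    return np.mean(rng.standard_normal((len(tokens), dim)), axis=0)

def fused_features(tokens):
    """Concatenate both representations, as the framework proposes."""
    return np.concatenate([bert_sentence_vector(tokens),
                           avg_word_embedding(tokens)])

def classify(features, W, b):
    """Softmax head over 3 sentiment classes (neg / neutral / pos)."""
    z = W @ features + b
    e = np.exp(z - z.max())          # numerically stable softmax
    return e / e.sum()

tokens = "great match tonight".split()
x = fused_features(tokens)           # 768 + 300 = 1068 features
W = rng.standard_normal((3, x.size)) * 0.01
probs = classify(x, W, np.zeros(3))
```

In a real system the classifier weights would be trained on labeled tweets, and the concatenation gives the head access to both contextual (BERT) and static (word-embedding) signals, which is the combination the paper credits for the improved F1-score.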
Pages: 428-437
Page count: 10