Affect Analysis in Arabic Text: Further Pre-Training Language Models for Sentiment and Emotion

Cited by: 3
Authors
Alshehri, Wafa [1 ,2 ,3 ]
Al-Twairesh, Nora [1 ,4 ]
Alothaim, Abdulrahman [1 ,2 ]
Affiliations
[1] King Saud Univ, Coll Comp & Informat Sci, STCs Artificial Intelligence Chair, Riyadh 11451, Saudi Arabia
[2] King Saud Univ, Coll Comp & Informat Sci, Dept Informat Syst, Riyadh 11451, Saudi Arabia
[3] King Khalid Univ, Coll Sci & Arts, Dept Comp Sci, Almajarda 63931, Saudi Arabia
[4] King Saud Univ, Coll Comp & Informat Sci, Dept Informat Technol, Riyadh 11451, Saudi Arabia
Source
APPLIED SCIENCES-BASEL | 2023, Vol. 13, Issue 9
Keywords
sentiment analysis; emotion detection; pre-trained language models; model adaptation; task-adaptation approach
DOI
10.3390/app13095609
Abstract
One of the main tasks in natural language processing (NLP) is the analysis of affective states (sentiment and emotion) expressed in written text, and performance on this task has improved dramatically in recent years. However, studies on Arabic have more often relied on classical machine learning or deep learning algorithms for sentiment and emotion analysis than on current pre-trained language models. Moreover, further pre-training a language model on the target task (i.e., within-task and cross-task adaptation) has not yet been investigated for Arabic in general, or for sentiment and emotion tasks in particular. In this paper, we adapt a BERT-based Arabic pre-trained language model to the sentiment and emotion tasks by further pre-training it on sentiment and emotion corpora. In doing so, we developed five new Arabic models: QST, QSR, QSRT, QE3, and QE6. Five sentiment datasets and two emotion datasets, spanning both low- and high-resource settings, were used to evaluate the developed models. The adaptation approaches significantly enhanced performance on all seven Arabic sentiment and emotion datasets, with improvements ranging from 0.15% to 4.71%.
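The task-adaptive further pre-training step described in the abstract can be illustrated with a minimal sketch using the Hugging Face transformers and datasets libraries. The base checkpoint name, corpus file, and hyperparameters below are illustrative assumptions, not the paper's exact configuration: continued masked-language-model training on an in-domain sentiment/emotion corpus, after which the adapted checkpoint would be fine-tuned with a classification head.

# Minimal sketch of task-adaptive further pre-training (masked language
# modelling on an in-domain corpus). Checkpoint, corpus path, and
# hyperparameters are illustrative assumptions.
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    AutoModelForMaskedLM,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_checkpoint = "qarib/bert-base-qarib"  # assumed Arabic BERT-style base model
corpus_file = "sentiment_tweets.txt"       # hypothetical in-domain corpus, one text per line

tokenizer = AutoTokenizer.from_pretrained(base_checkpoint)
model = AutoModelForMaskedLM.from_pretrained(base_checkpoint)

# Tokenize the raw in-domain corpus for masked-language-model training.
raw = load_dataset("text", data_files={"train": corpus_file})
tokenized = raw["train"].map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True,
    remove_columns=["text"],
)

# Dynamic masking: 15% of tokens are masked each step, as in standard BERT.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="./arabic-sentiment-adapted",
        num_train_epochs=3,
        per_device_train_batch_size=32,
        learning_rate=5e-5,
        save_strategy="epoch",
    ),
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()

# The adapted checkpoint can then be fine-tuned on a labelled sentiment or
# emotion dataset with a sequence-classification head.
model.save_pretrained("./arabic-sentiment-adapted")
tokenizer.save_pretrained("./arabic-sentiment-adapted")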
Pages: 26
Related Papers
50 records in total
  • [21] Subsampling of Frequent Words in Text for Pre-training a Vision-Language Model
    Liang, Mingliang
    Larson, Martha
    [J]. PROCEEDINGS OF THE 1ST WORKSHOP ON LARGE GENERATIVE MODELS MEET MULTIMODAL APPLICATIONS, LGM3A 2023, 2023, : 61 - 67
  • [22] Framework for Sentiment Analysis of Arabic Text
    Almuqren, Latifah
    Cristea, Alexandra I.
    [J]. PROCEEDINGS OF THE 27TH ACM CONFERENCE ON HYPERTEXT AND SOCIAL MEDIA (HT'16), 2016, : 315 - 317
  • [23] MPNet-GRUs: Sentiment Analysis With Masked and Permuted Pre-Training for Language Understanding and Gated Recurrent Units
    Loh, Nicole Kai Ning
    Lee, Chin Poo
    Ong, Thian Song
    Lim, Kian Ming
    [J]. IEEE ACCESS, 2024, 12 : 74069 - 74080
  • [24] How does the pre-training objective affect what large language models learn about linguistic properties?
    Alajrami, Ahmed
    Aletras, Nikolaos
    [J]. PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022): (SHORT PAPERS), VOL 2, 2022, : 131 - 147
  • [25] Learning Implicit Sentiment in Aspect-based Sentiment Analysis with Supervised Contrastive Pre-Training
    Li, Zhengyan
    Zou, Yicheng
    Zhang, Chong
    Zhang, Qi
    Wei, Zhongyu
    [J]. 2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 246 - 256
  • [26] AraXLNet: pre-trained language model for sentiment analysis of Arabic
    Alduailej, Alhanouf
    Alothaim, Abdulrahman
    [J]. JOURNAL OF BIG DATA, 2022, 9 (01)
  • [28] Towards Adversarial Attack on Vision-Language Pre-training Models
    Zhang, Jiaming
    Yi, Qi
    Sang, Jitao
    [J]. PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 5005 - 5013
  • [29] Pre-training and Evaluating Transformer-based Language Models for Icelandic
Daðason, Jón Friðrik
    Loftsson, Hrafn
[J]. LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 7386 - 7391
  • [30] Pre-training Universal Language Representation
    Li, Yian
    Zhao, Hai
    [J]. 59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (ACL-IJCNLP 2021), VOL 1, 2021, : 5122 - 5133