Zero-shot Sentiment Analysis in Low-Resource Languages Using a Multilingual Sentiment Lexicon

Authors
Koto, Fajri [1]
Beck, Tilman [2]
Talat, Zeerak [1]
Gurevych, Iryna [1]
Baldwin, Timothy [1, 3]
Affiliations
[1] MBZUAI, Dept Nat Language Proc, Abu Dhabi, U Arab Emirates
[2] Tech Univ Darmstadt, Ubiquitous Knowledge Proc Lab, Darmstadt, Germany
[3] Univ Melbourne, Melbourne, Vic, Australia
Keywords
NORMS;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104; 0812; 0835; 1405;
Abstract
Improving multilingual language models' capabilities in low-resource languages is generally difficult due to the scarcity of large-scale data in those languages. In this paper, we relax the reliance on texts in low-resource languages by using multilingual lexicons in pretraining to enhance multilingual capabilities. Specifically, we focus on zero-shot sentiment analysis tasks across 34 languages, including 6 high/medium-resource languages, 25 low-resource languages, and 3 code-switching datasets. We demonstrate that pretraining using multilingual lexicons, without using any sentence-level sentiment data, achieves superior zero-shot performance compared to models fine-tuned on English sentiment datasets and to large language models like GPT-3.5, BLOOMZ, and XGLM. These findings hold from unseen low-resource languages to code-mixed scenarios involving high-resource languages.
Pages: 298-320
Page count: 23