A comparative study of cross-lingual sentiment analysis

Cited by: 6
Authors
Priban, Pavel [1 ,2 ]
Smid, Jakub [1 ]
Steinberger, Josef [1 ]
Mistera, Adam [1 ]
Affiliations
[1] Univ West Bohemia, Fac Appl Sci, Dept Comp Sci & Engn, Univ 8, Plzen 30100, Czech Republic
[2] NTIS New Technol Informat Soc, Univ 8, Plzen 30100, Czech Republic
Keywords
Sentiment analysis; Zero-shot cross-lingual classification; Linear transformation; Transformers; Large language models; Transfer learning
DOI
10.1016/j.eswa.2024.123247
Chinese Library Classification
TP18 [Theory of Artificial Intelligence]
Subject Classification Code
081104; 0812; 0835; 1405
Abstract
This paper presents a detailed comparative study of zero-shot cross-lingual sentiment analysis. Namely, we use modern multilingual Transformer-based models and linear transformations combined with CNN and LSTM neural networks. We evaluate their performance in Czech, French, and English. We aim to compare and assess the models' ability to transfer knowledge across languages and to discuss the trade-off between their performance and training/inference speed. We build strong monolingual baselines comparable with current SotA approaches, achieving state-of-the-art results in Czech (96.0% accuracy) and French (97.6% accuracy). Next, we compare our results with the latest large language models (LLMs), i.e., Llama 2 and ChatGPT. We show that the large multilingual Transformer-based XLM-R model consistently outperforms all other cross-lingual approaches in zero-shot cross-lingual sentiment classification, surpassing them by at least 3%. We then show that the smaller Transformer-based models are comparable in performance to older but much faster methods with linear transformations. The best-performing model with a linear transformation achieved an accuracy of 92.1% on the French dataset, compared to 90.3% achieved by the smaller XLM-R model. Notably, this performance is achieved with only about 0.01 of the training time required for the XLM-R model. This underscores the potential of linear transformations as a pragmatic alternative to resource-intensive and slower Transformer-based models in real-world applications. The LLMs achieved impressive results that are on par with or better than the other approaches by 1%-3%, but with additional hardware requirements and limitations. Overall, this study contributes to the understanding of cross-lingual sentiment analysis and provides valuable insights into the strengths and limitations of cross-lingual approaches for sentiment analysis.
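The abstract contrasts fast linear transformations with heavier Transformer-based models for cross-lingual transfer. As illustrative background only (not the paper's actual implementation), the standard technique behind such transfer is to learn a linear map between two monolingual embedding spaces from a small seed dictionary, either by unconstrained least squares or by an orthogonal (Procrustes) map; the toy data below stands in for real word embeddings.

```python
import numpy as np

def fit_linear_map(src, tgt):
    # Unconstrained least squares: W minimizing ||src @ W - tgt||_F
    W, *_ = np.linalg.lstsq(src, tgt, rcond=None)
    return W

def fit_orthogonal_map(src, tgt):
    # Orthogonal Procrustes solution: constrain W to a rotation,
    # which preserves distances within the source embedding space
    U, _, Vt = np.linalg.svd(src.T @ tgt)
    return U @ Vt

# Toy demo: recover a known rotation between two "embedding spaces"
rng = np.random.default_rng(0)
d = 50
src = rng.normal(size=(1000, d))               # "source-language" vectors
R = np.linalg.qr(rng.normal(size=(d, d)))[0]   # ground-truth rotation
tgt = src @ R                                  # "target-language" vectors

W = fit_orthogonal_map(src, tgt)
print(np.allclose(W, R))
```

Once such a map is fitted, source-language embeddings can be projected into the target space and fed to a classifier (e.g. a CNN or LSTM) trained only on target-language labels, which is why this family of methods is so much cheaper to train than fine-tuning a multilingual Transformer.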
Pages: 39