A comparative study of cross-lingual sentiment analysis

Cited by: 6
Authors
Priban, Pavel [1 ,2 ]
Smid, Jakub [1 ]
Steinberger, Josef [1 ]
Mistera, Adam [1 ]
Affiliations
[1] Univ West Bohemia, Fac Appl Sci, Dept Comp Sci & Engn, Univ 8, Plzen 30100, Czech Republic
[2] NTIS New Technol Informat Soc, Univ 8, Plzen 30100, Czech Republic
Keywords
Sentiment analysis; Zero-shot cross-lingual classification; Linear transformation; Transformers; Large language models; Transfer learning;
DOI
10.1016/j.eswa.2024.123247
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a detailed comparative study of zero-shot cross-lingual sentiment analysis. Namely, we use modern multilingual Transformer-based models and linear transformations combined with CNN and LSTM neural networks. We evaluate their performance in Czech, French, and English. We aim to compare and assess the models' ability to transfer knowledge across languages and discuss the trade-off between their performance and training/inference speed. We build strong monolingual baselines comparable with the current SotA approaches, achieving state-of-the-art results in Czech (96.0% accuracy) and French (97.6% accuracy). Next, we compare our results with the latest large language models (LLMs), i.e., Llama 2 and ChatGPT. We show that the large multilingual Transformer-based XLM-R model consistently outperforms all other cross-lingual approaches in zero-shot cross-lingual sentiment classification, surpassing them by at least 3%. Next, we show that the smaller Transformer-based models are comparable in performance to older but much faster methods with linear transformations. The best-performing model with linear transformation achieved an accuracy of 92.1% on the French dataset, compared to 90.3% achieved by the smaller XLM-R model. Notably, this performance is achieved with only about 1% of the training time required for the XLM-R model. This underscores the potential of linear transformations as a pragmatic alternative to resource-intensive and slower Transformer-based models in real-world applications. The LLMs achieved impressive results that are on par with or better than ours by 1%-3%, but with additional hardware requirements and limitations. Overall, this study contributes to understanding cross-lingual sentiment analysis and provides valuable insights into the strengths and limitations of cross-lingual approaches for sentiment analysis.
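The linear-transformation approach mentioned in the abstract can be illustrated with a small self-contained sketch: two monolingual embedding spaces are aligned with an orthogonal map learned from a bilingual seed dictionary (the classic Procrustes formulation), after which a classifier trained in one language can score mapped vectors from the other. The data below is synthetic and the setup is a hypothetical simplification of the paper's method, shown for intuition only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "word embeddings" for a bilingual seed dictionary: each row pairs a
# source-language vector with its target-language translation. In practice
# these would come from pretrained monolingual spaces (e.g. fastText);
# here they are synthetic, with a hidden rotation relating the two spaces.
d, n_pairs = 50, 200
tgt = rng.standard_normal((n_pairs, d))                # target-language vectors
hidden_q = np.linalg.qr(rng.standard_normal((d, d)))[0]  # hidden rotation
src = tgt @ hidden_q.T + 0.01 * rng.standard_normal((n_pairs, d))

# Orthogonal Procrustes: find orthogonal W minimizing ||src @ W - tgt||_F.
# The closed-form solution is W = U @ Vt from the SVD of src.T @ tgt.
u, _, vt = np.linalg.svd(src.T @ tgt)
W = u @ vt

# Map source vectors into the target space; a sentiment classifier trained
# on target-language features could now score them directly (zero-shot).
mapped = src @ W
err = np.linalg.norm(mapped - tgt) / np.linalg.norm(tgt)
print(round(err, 3))
```

Because W is constrained to be orthogonal, training reduces to one SVD of a d-by-d matrix, which is why such methods are orders of magnitude cheaper than fine-tuning a multilingual Transformer.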
Pages: 39
Related Papers
50 records
  • [31] Exploring the Cross-Lingual Similarity of Valmiki Ramayana Using Semantic and Sentiment Analysis
    Kulkarni, Pooja
    Birajdar, Gajanan K.
    VIETNAM JOURNAL OF COMPUTER SCIENCE, 2025
  • [32] Improving Cross-lingual Aspect-based Sentiment Analysis with Sememe Bridge
    Liu, Yijiang
    Li, Fei
    Ji, Donghong
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2024, 23 (12)
  • [33] Cross-lingual Alignment Methods for Multilingual BERT: A Comparative Study
    Kulshreshtha, Saurabh
    Redondo-Garcia, Jose Luis
    Chang, Ching-Yun
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020, : 933 - 942
  • [34] Latent Sentiment Model for Weakly-Supervised Cross-Lingual Sentiment Classification
    He, Yulan
    ADVANCES IN INFORMATION RETRIEVAL, 2011, 6611 : 214 - 225
  • [35] Coarse Alignment of Topic and Sentiment: A Unified Model for Cross-Lingual Sentiment Classification
    Wang, Deqing
    Jing, Baoyu
    Lu, Chenwei
    Wu, Junjie
    Liu, Guannan
    Du, Chenguang
    Zhuang, Fuzhen
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2021, 32 (02) : 736 - 747
  • [36] Cross-Lingual Blog Analysis by Cross-Lingual Comparison of Characteristic Terms and Blog Posts
    Nakasaki, Hiroyuki
    Kawaba, Mariko
    Utsuro, Takehito
    Fukuhara, Tomohiro
    Nakagawa, Hiroshi
    Kando, Noriko
    PROCEEDINGS OF THE SECOND INTERNATIONAL SYMPOSIUM ON UNIVERSAL COMMUNICATION, 2008, : 105 - +
  • [37] An Unsupervised Cross-Lingual Topic Model Framework for Sentiment Classification
    Lin, Zheng
    Jin, Xiaolong
    Xu, Xueke
    Wang, Yuanzhuo
    Cheng, Xueqi
    Wang, Weiping
    Meng, Dan
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2016, 24 (03) : 432 - 444
  • [38] A cross-lingual sentiment topic model evolution over time
    Musa, Ibrahim Hussein
    Xu, Kang
    Liu, Feng
    Zamit, Ibrahim
    Abro, Waheed Ahmed
    Qi, Guilin
    INTELLIGENT DATA ANALYSIS, 2020, 24 (02) : 253 - 266
  • [39] Cross-Lingual Sentiment Classification with Bilingual Document Representation Learning
    Zhou, Xinjie
    Wan, Xiaojun
    Xiao, Jianguo
    PROCEEDINGS OF THE 54TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1, 2016, : 1403 - 1412
  • [40] Deep Persian sentiment analysis: Cross-lingual training for low-resource languages
    Ghasemi, Rouzbeh
    Ashrafi Asli, Seyed Arad
    Momtazi, Saeedeh
    JOURNAL OF INFORMATION SCIENCE, 2022, 48 (04) : 449 - 462