Cross-Lingual Sentiment Analysis: A Survey

Authors
Xu Y. [1]
Cao H. [1]
Wang W. [1]
Du W. [1]
Xu C. [1]
Affiliation
[1] School of Information Science and Technology, Beijing Foreign Studies University, Beijing
Keywords
Bilingual Word Embedding; Cross-Lingual; Multi-lingual; Sentiment Analysis
DOI
10.11925/infotech.2096-3467.2022.0472
Abstract
[Objective] This paper traces the development of research on cross-lingual sentiment analysis (CLSA). [Coverage] We searched the Web of Science database with the query "TS=cross lingual sentiment OR cross lingual word embedding" and selected 90 representative papers for this review. [Methods] We examined the following CLSA methods in detail: (1) early mainstream methods, including those based on machine translation and its improved variants, on parallel corpora, or on bilingual sentiment lexicons; (2) CLSA based on cross-lingual word embeddings; (3) CLSA based on Multi-BERT and other pre-trained models. [Results] We analyzed the main ideas, methodologies, and shortcomings of these methods, and attempted to summarize their language coverage, datasets, and performance. Although pre-trained models such as Multi-BERT have achieved good performance in zero-shot cross-lingual sentiment analysis, challenges such as language sensitivity remain. Early CLSA methods still offer useful ideas for current research. [Limitations] Some CLSA models are hybrid models; each was classified according to its main method. [Conclusions] We discuss the future development of CLSA and the challenges facing the field. As research on multilingual semantics in pre-trained models deepens, CLSA models that cover more, and more diverse, languages will be the future direction. © 2023, Chinese Academy of Sciences. All rights reserved.
Pages: 1 / 21
Page count: 20