Cross-Lingual Sentiment Analysis: A Survey

Cited by: 0

Authors:

Xu Y. [1]

Cao H. [1]

Wang W. [1]

Du W. [1]

Xu C. [1]

Affiliations:

[1] School of Information Science and Technology, Beijing Foreign Studies University, Beijing

Source:

Data Analysis and Knowledge Discovery | 2023 / Vol. 7 / No. 01

Keywords:

Bilingual Word Embedding; Cross Lingual; Multi-lingual; Sentiment Analysis

DOI:

10.11925/infotech.2096-3467.2022.0472

Abstract:

[Objective] This paper reviews the research landscape of cross-lingual sentiment analysis (CLSA). [Coverage] We searched the Web of Science database with the query “TS=cross lingual sentiment OR cross lingual word embedding” and selected 90 representative papers for this review. [Methods] We examine three groups of CLSA methods in detail: (1) early mainstream methods, including those based on machine translation and its improved variants, on parallel corpora, or on bilingual sentiment lexicons; (2) methods based on cross-lingual word embeddings; (3) methods based on Multi-BERT and other pre-trained models. [Results] We analyze their main ideas, methodologies, and shortcomings, and summarize their language coverage, datasets, and performance. Although pre-trained models such as Multi-BERT perform well in zero-shot cross-lingual sentiment analysis, challenges such as language sensitivity remain. Early CLSA methods still offer useful ideas for current research. [Limitations] Some CLSA models are hybrid; we classify each according to its main method. [Conclusions] We discuss the future development of CLSA and the challenges facing the field. As research on multilingual semantics in pre-trained models deepens, CLSA models that cover more and more diverse languages will be the future direction. © 2023, Chinese Academy of Sciences. All rights reserved.

Pages: 1 / 21

Page count: 20

