Zero-shot learning based cross-lingual sentiment analysis for Sanskrit text with insufficient labeled data

Cited by: 4
Authors
Kumar, Puneet [1]
Pathania, Kshitij [2]
Raman, Balasubramanian [1]
Affiliations
[1] Indian Inst Technol Roorkee, Dept Comp Sci & Engn, Roorkee, Uttar Pradesh, India
[2] Indian Inst Technol Roorkee, Dept Math, Roorkee, Uttar Pradesh, India
Keywords
Labeled data insufficiency; Cross-lingual sentiment analysis; Sanskrit language analysis; Machine translation
DOI
10.1007/s10489-022-04046-6
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, a novel method for analyzing the sentiments portrayed by Sanskrit text has been proposed. Sanskrit is one of the world's most ancient languages; however, natural language processing tasks such as machine translation and sentiment analysis have not been explored for it to the full potential because of the unavailability of sufficient labeled data. We solved this issue using a zero-shot learning-based cross-lingual sentiment analysis (CLSA) approach. The CLSA uses the resources from the source language to enhance the sentiment analysis of the target language having insufficient resources. The proposed work translates the text from Sanskrit, a language with insufficient labeled data, to English, with sufficient labeled data for sentiment analysis using a transformer model. A generative adversarial network-based strategy has been proposed to evaluate the maturity of the translations. Then a bidirectional long short-term memory-based model has been implemented to classify the sentiments using the embeddings obtained through translations. The proposed technique has achieved 87.50% accuracy for machine translation and 92.83% accuracy for sentiment classification. Sanskrit-English translations used in this work have been collected through web scraping techniques. In the absence of the ground-truth sentiment class labels, a strategy for evaluating the sentiment scores of the proposed sentiment analysis model has also been presented. A new dataset of Sanskrit text, along with their English translations and sentiment scores, has been constructed.
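The translate-then-classify structure described in the abstract can be sketched as follows. This is only an illustrative outline under loud assumptions: the paper uses a transformer for translation and a bidirectional LSTM for classification, whereas the two toy functions here (`toy_translate`, `toy_english_sentiment`) and their tiny lookup tables are hypothetical stand-ins chosen so the example runs with the standard library alone. The GAN-based translation evaluation is not modeled.

```python
from typing import Callable, Dict

# Hypothetical toy resources, for illustration only. They are NOT from the paper,
# which uses a transformer translator and a BiLSTM classifier.
TOY_SA_EN: Dict[str, str] = {"सुख": "happiness", "दुःख": "sorrow"}
TOY_POLARITY: Dict[str, int] = {"happiness": 1, "sorrow": -1}


def toy_translate(sanskrit_text: str) -> str:
    """Word-by-word lookup standing in for the Sanskrit-to-English translator."""
    return " ".join(TOY_SA_EN.get(tok, tok) for tok in sanskrit_text.split())


def toy_english_sentiment(english_text: str) -> str:
    """Lexicon scorer standing in for a classifier trained on English labels."""
    score = sum(TOY_POLARITY.get(tok, 0) for tok in english_text.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def cross_lingual_sentiment(
    text: str,
    translate: Callable[[str], str],
    classify: Callable[[str], str],
) -> str:
    """Zero-shot CLSA outline: no target-language sentiment labels are used.

    The target-language text is first mapped into the source language, and the
    source-language classifier is then applied to the translation.
    """
    return classify(translate(text))


if __name__ == "__main__":
    for sample in ("सुख", "दुःख"):
        print(sample, cross_lingual_sentiment(sample, toy_translate, toy_english_sentiment))
```

Passing the translator and classifier as callables keeps the pipeline independent of any particular model, so a trained translation model and a trained English sentiment classifier could be substituted for the toy functions without changing `cross_lingual_sentiment`.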
Pages: 10096-10113
Page count: 18