Improving stance detection accuracy in low-resource languages: a deep learning framework with ParsBERT

Authors
Rahimi, Mohammad [1]
Kiani, Vahid [1]
Affiliations
[1] Univ Bojnord, Fac Engn, Comp Engn Dept, Bojnord 9453155111, Iran
Keywords
Natural language processing; Stance detection; Persian stance detection; BERT embedding; Sentence pair classification
DOI
10.1007/s41060-024-00630-w
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Stance detection, the task of identifying the stance or viewpoint expressed in a text, plays a crucial role in understanding the sentiment and credibility of information. However, in low-resource languages such as Persian, the lack of labeled data poses a significant challenge for developing accurate stance detection models. This research article proposes a deep learning approach that leverages BERT-based embeddings and transfer learning techniques to address this challenge. Specifically, we utilize the ParsBERT model, a language-specific BERT model trained on Persian texts, for improved performance on the Persian news stance detection task. In addition, we propose an ensemble classification approach using BERT-based base learners to detect stances in Persian texts. By considering stance detection as a sentence pair classification task and using ParsBERT, we achieve higher accuracy in classifying the stance of Persian texts compared to baseline methods and simpler configurations. Experimental results on a common Persian stance dataset demonstrate the effectiveness of our proposed methods, showcasing the potential of BERT-based embeddings and transfer learning in low-resource languages like Persian. This research contributes to advancing stance detection techniques in Persian text and opens doors for further research in other low-resource languages. The source code and experimental data of this research work will be publicly available at https://github.com/vkiani/stance.
Pages: 517-535
Page count: 19
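
Illustrative sketch
The abstract frames stance detection as sentence pair classification and combines several BERT-based base learners in an ensemble. The Python sketch below illustrates those two ideas only. The label set, the function names `build_pair_input` and `soft_vote`, the probability values, and the choice of soft voting (averaging class probabilities) are all assumptions made for illustration; the abstract does not state the stance classes or how the base learners are combined, and a real pipeline would rely on the model's own tokenizer rather than string formatting.

```python
# Hypothetical label set: the abstract does not list the stance classes.
LABELS = ["agree", "disagree", "discuss", "unrelated"]


def build_pair_input(claim: str, body: str) -> str:
    """Show how a (claim, text) pair is laid out for a BERT-style model.

    A real tokenizer inserts the special tokens itself; this string
    only makes the sentence-pair layout visible.
    """
    return f"[CLS] {claim} [SEP] {body} [SEP]"


def soft_vote(prob_rows):
    """Average per-class probabilities from several base learners.

    prob_rows holds one probability list per base learner, ordered as
    LABELS. Soft voting is an assumed combination rule, not necessarily
    the one used in the paper.
    """
    if not prob_rows:
        raise ValueError("need at least one base learner output")
    n = len(prob_rows)
    avg = [sum(row[i] for row in prob_rows) / n for i in range(len(LABELS))]
    best = max(range(len(LABELS)), key=lambda i: avg[i])
    return LABELS[best], avg


# Made-up outputs from three hypothetical base learners for one pair.
example_probs = [
    [0.6, 0.2, 0.1, 0.1],
    [0.3, 0.4, 0.2, 0.1],
    [0.5, 0.1, 0.3, 0.1],
]
pair_text = build_pair_input("headline text", "article body text")
predicted_label, averaged = soft_vote(example_probs)
```

Averaging probabilities lets a confident base learner outweigh an uncertain one, whereas majority voting over hard labels would treat every learner equally; either rule fits the general description in the abstract.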