Improving stance detection accuracy in low-resource languages: a deep learning framework with ParsBERT

Authors
Rahimi, Mohammad [1]
Kiani, Vahid [1]
Affiliations
[1] Univ Bojnord, Fac Engn, Comp Engn Dept, Bojnord 9453155111, Iran
Keywords
Natural language processing; Stance detection; Persian stance detection; BERT embedding; Sentence pair classification;
DOI
10.1007/s41060-024-00630-w
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Stance detection, the task of identifying the stance or viewpoint expressed in a text, plays a crucial role in understanding the sentiment and credibility of information. However, in low-resource languages such as Persian, the lack of labeled data poses a significant challenge for developing accurate stance detection models. This research article proposes a deep learning approach that leverages BERT-based embeddings and transfer learning techniques to address this challenge. Specifically, we utilize the ParsBERT model, a language-specific BERT model trained on Persian texts, for improved performance on the Persian news stance detection task. In addition, we propose an ensemble classification approach using BERT-based base learners to detect stances in Persian texts. By considering stance detection as a sentence pair classification task and using ParsBERT, we achieve higher accuracy in classifying the stance of Persian texts compared to baseline methods and simpler configurations. Experimental results on a common Persian stance dataset demonstrate the effectiveness of our proposed methods, showcasing the potential of BERT-based embeddings and transfer learning in low-resource languages like Persian. This research contributes to advancing stance detection techniques in Persian text and opens doors for further research in other low-resource languages. The source code and experimental data of this research work will be publicly available at https://github.com/vkiani/stance.
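As a rough illustration of the two ideas named in the abstract (treating stance detection as sentence pair classification, and combining several BERT-based base learners in an ensemble), the following is a minimal sketch and not the authors' implementation. The label set, the function names `make_pair` and `soft_vote`, and the choice of soft voting (averaging class probabilities) are all assumptions made for the example; the abstract does not specify the ensemble rule. In practice a tokenizer inserts the special tokens itself, so the pair string here only shows the layout.

```python
# Hypothetical label set, assumed for illustration only.
LABELS = ["agree", "disagree", "discuss", "unrelated"]


def make_pair(claim: str, text: str) -> str:
    """Lay out a claim and a text as a BERT-style sentence pair.

    Illustrative only: a real tokenizer adds [CLS] and [SEP] itself.
    """
    return f"[CLS] {claim} [SEP] {text} [SEP]"


def soft_vote(prob_rows: list[list[float]]) -> tuple[str, list[float]]:
    """Average per-class probabilities from several base learners.

    Each row is one base learner's probability vector over LABELS.
    Returns the label with the highest mean probability and the means.
    Soft voting is an assumed ensemble rule, not taken from the paper.
    """
    n = len(prob_rows)
    avg = [sum(col) / n for col in zip(*prob_rows)]
    best = max(range(len(avg)), key=avg.__getitem__)
    return LABELS[best], avg


# Three hypothetical base learners scoring one (claim, text) pair.
predictions = [
    [0.6, 0.2, 0.1, 0.1],
    [0.3, 0.4, 0.2, 0.1],
    [0.5, 0.1, 0.3, 0.1],
]
label, mean_probs = soft_vote(predictions)
```

Averaging probabilities lets a confident base learner outweigh two uncertain ones, which is one common reason to prefer soft voting over a plain majority vote; whether that matches the paper's configuration would have to be checked against the published source code.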
Pages: 517-535
Page count: 19