Improving stance detection accuracy in low-resource languages: a deep learning framework with ParsBERT

Cited by: 0
Authors
Rahimi, Mohammad [1]
Kiani, Vahid [1]
Affiliations
[1] Univ Bojnord, Fac Engn, Comp Engn Dept, Bojnord 9453155111, Iran
Keywords
Natural language processing; Stance detection; Persian stance detection; BERT embedding; Sentence pair classification
DOI
10.1007/s41060-024-00630-w
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Stance detection, the task of identifying the stance or viewpoint expressed in a text, plays a crucial role in understanding the sentiment and credibility of information. However, in low-resource languages such as Persian, the lack of labeled data poses a significant challenge for developing accurate stance detection models. This research article proposes a deep learning approach that leverages BERT-based embeddings and transfer learning techniques to address this challenge. Specifically, we utilize the ParsBERT model, a language-specific BERT model trained on Persian texts, for improved performance on the Persian news stance detection task. In addition, we propose an ensemble classification approach using BERT-based base learners to detect stances in Persian texts. By considering stance detection as a sentence pair classification task and using ParsBERT, we achieve higher accuracy in classifying the stance of Persian texts compared to baseline methods and simpler configurations. Experimental results on a common Persian stance dataset demonstrate the effectiveness of our proposed methods, showcasing the potential of BERT-based embeddings and transfer learning in low-resource languages like Persian. This research contributes to advancing stance detection techniques in Persian text and opens doors for further research in other low-resource languages. The source code and experimental data of this research work will be publicly available at https://github.com/vkiani/stance.
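The abstract frames stance detection as sentence pair classification and combines several BERT-based base learners into an ensemble. The following is a minimal sketch of those two ideas only, not the authors' implementation. The label set, the function names, and both voting rules are assumptions made for illustration, since the abstract does not specify the stance classes or how the base learners are combined. The class probabilities are made-up inputs that stand in for the outputs of fine-tuned ParsBERT models.

```python
from collections import Counter

# Hypothetical label set: the abstract does not list the stance classes.
LABELS = ["agree", "disagree", "discuss", "unrelated"]


def make_pair_input(claim: str, body: str) -> str:
    """Lay out a (claim, body) pair as two BERT-style segments.

    In practice a BERT tokenizer inserts the [CLS] and [SEP] markers itself;
    they are written out here only to show the sentence-pair layout.
    """
    return f"[CLS] {claim} [SEP] {body} [SEP]"


def soft_vote(prob_rows):
    """Average per-class probabilities across base learners, return the top label.

    Soft voting is an assumed combination rule, not one stated in the abstract.
    """
    n = len(prob_rows)
    avg = [sum(row[i] for row in prob_rows) / n for i in range(len(LABELS))]
    best = max(range(len(LABELS)), key=avg.__getitem__)
    return LABELS[best]


def hard_vote(predictions):
    """Return the most frequent predicted label; ties go to the label seen first."""
    return Counter(predictions).most_common(1)[0][0]


# Illustrative probabilities from three hypothetical base learners.
example_probs = [
    [0.6, 0.2, 0.1, 0.1],
    [0.2, 0.5, 0.2, 0.1],
    [0.5, 0.3, 0.1, 0.1],
]
ensemble_label = soft_vote(example_probs)
```

Soft voting uses each learner's confidence, while hard voting only counts predicted labels; which of the two, if either, matches the paper's ensemble is not stated in the abstract.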
Pages: 517-535
Number of pages: 19