Fine-Tuning Transformer Models Using Transfer Learning for Multilingual Threatening Text Identification

Cited by: 6
Authors
Rehan, Muhammad [1 ]
Malik, Muhammad Shahid Iqbal [2 ]
Jamjoom, Mona Mamdouh [3 ]
Affiliations
[1] Capital Univ Sci & Technol, Dept Comp Sci, Islamabad 44000, Pakistan
[2] Natl Res Univ Higher Sch Econ, Dept Comp Sci, Moscow 109028, Russia
[3] Princess Nourah Bint Abdulrahman Univ, Coll Comp & Informat Sci, Dept Comp Sci, Riyadh 11671, Saudi Arabia
Keywords
Multi-lingual; Urdu; XLM-RoBERTa; threatening text; fine-tuning; MuRIL; LANGUAGE DETECTION
DOI
10.1109/ACCESS.2023.3320062
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Discipline code
0812
Abstract
Threatening content detection on social media has recently gained attention, yet work on low-resource languages, especially Urdu, remains very limited, and previous studies explored only mono-lingual approaches without addressing the multi-lingual setting. This research addresses Multi-lingual Threatening Content Detection (MTCD) in Urdu and English by exploiting transfer learning with fine-tuning. Two methodologies are investigated: 1) a joint multi-lingual method, which builds a single universal classifier for both languages, and 2) a joint-translated method, which first translates all text into one language and then performs classification. We fine-tune Multilingual Representations for Indian Languages (MuRIL) and the Robustly Optimized BERT Pre-training Approach (RoBERTa), both of which have demonstrated state-of-the-art ability to capture the contextual and semantic characteristics of text. Manual search and grid search strategies are used to find optimal hyper-parameter values. Experiments on bi-lingual English and Urdu datasets show that the proposed methodology outperforms the baselines and sets a benchmark: RoBERTa achieves the best results, with 92% accuracy and a 90% macro F1-score under the joint multi-lingual approach.
Pages: 106503-106515
Number of pages: 13
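
Illustrative note: the joint multi-lingual approach described in the abstract amounts to fine-tuning a single classifier on pooled Urdu and English examples. A minimal sketch of that setup with the Hugging Face transformers and datasets libraries follows, using the xlm-roberta-base checkpoint; the data, column names, and hyper-parameter values are assumptions for illustration, not the authors' exact configuration.

# Sketch of the joint multi-lingual setup: one binary classifier fine-tuned
# on pooled Urdu and English examples (1 = threatening, 0 = non-threatening).
from datasets import Dataset, concatenate_datasets
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

MODEL_NAME = "xlm-roberta-base"  # "google/muril-base-cased" for the MuRIL variant

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

# Hypothetical bilingual data; a real run would load the labelled Urdu and
# English threatening-text datasets here.
urdu = Dataset.from_dict({"text": ["<Urdu post>"], "label": [1]})
english = Dataset.from_dict({"text": ["<English post>"], "label": [0]})
train_data = concatenate_datasets([urdu, english]).shuffle(seed=42)

def tokenize(batch):
    # One shared subword vocabulary covers both languages.
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

train_data = train_data.map(tokenize, batched=True)

# Placeholder hyper-parameters; the paper selects them via manual and grid search.
args = TrainingArguments(
    output_dir="mtcd-joint-multilingual",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    learning_rate=2e-5,
)

Trainer(model=model, args=args, train_dataset=train_data).train()

The joint-translated method described in the abstract would instead translate the Urdu portion into English before the same fine-tuning step, so that a mono-lingual model sees text in a single language.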