Fine-Tuning Transformer Models Using Transfer Learning for Multilingual Threatening Text Identification

Cited by: 6
Authors
Rehan, Muhammad [1 ]
Malik, Muhammad Shahid Iqbal [2 ]
Jamjoom, Mona Mamdouh [3 ]
Affiliations
[1] Capital Univ Sci & Technol, Dept Comp Sci, Islamabad 44000, Pakistan
[2] Natl Res Univ Higher Sch Econ, Dept Comp Sci, Moscow 109028, Russia
[3] Princess Nourah Bint Abdulrahman Univ, Coll Comp & Informat Sci, Dept Comp Sci, Riyadh 11671, Saudi Arabia
Keywords
Multi-lingual; Urdu; XLM-RoBERTa; threatening text; fine-tuning; MuRIL; language detection
DOI
10.1109/ACCESS.2023.3320062
CLC Number
TP [Automation Technology, Computer Technology]
Subject Classification Code
0812
Abstract
Threatening content detection on social media has recently gained attention, yet work on low-resource languages, especially Urdu, remains very limited. Moreover, previous studies explored only mono-lingual approaches; multi-lingual threatening content detection had not been studied. This research addresses Multi-lingual Threatening Content Detection (MTCD) in Urdu and English by exploiting a transfer-learning methodology with fine-tuning. Two methodologies are investigated: 1) the joint multi-lingual method, which builds a single universal classifier for both languages, and 2) the joint-translated method, which first translates all text into one language and then performs classification. We fine-tune Multilingual Representations for Indian Languages (MuRIL) and the Robustly Optimized BERT Pre-training Approach (RoBERTa), both of which have demonstrated state-of-the-art ability to capture the contextual and semantic characteristics of text. Manual search and grid search strategies are used to find optimal hyper-parameter values. Experiments on bi-lingual English and Urdu datasets reveal that the proposed methodology outperforms the baselines and sets a benchmark: the RoBERTa model achieves the highest performance, with 92% accuracy and a 90% macro F1-score under the joint multi-lingual approach.
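To make the described pipeline concrete, below is a minimal sketch of the joint multi-lingual approach: Urdu and English samples are pooled so that one fine-tuned transformer classifier serves both languages. It uses the Hugging Face transformers Trainer API; the checkpoint names ("xlm-roberta-base", "google/muril-base-cased") are real Hugging Face model identifiers, but the toy texts, labels, and hyper-parameter values are illustrative assumptions rather than the authors' settings (the paper tunes hyper-parameters via manual and grid search).

# Minimal sketch of the joint multi-lingual approach: one classifier
# fine-tuned on pooled Urdu + English examples. Data and hyper-parameter
# values below are illustrative assumptions, not the paper's exact settings.
import torch
from torch.utils.data import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

MODEL_NAME = "xlm-roberta-base"  # swap in "google/muril-base-cased" for MuRIL

class ThreatDataset(Dataset):
    """Pairs each text with a binary label (1 = threatening, 0 = neutral)."""
    def __init__(self, texts, labels, tokenizer, max_length=128):
        self.enc = tokenizer(texts, truncation=True, padding="max_length",
                             max_length=max_length)
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        item = {k: torch.tensor(v[idx]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME,
                                                           num_labels=2)

# Joint multi-lingual training: English and Urdu posts are simply mixed into
# one training set (toy placeholders; real work uses the bi-lingual datasets
# described in the paper).
train_texts = [
    "You will regret this, watch your back",  # English, threatening (toy)
    "یہ صرف ایک مثال ہے",                     # Urdu, neutral ("this is only an example")
]
train_labels = [1, 0]

args = TrainingArguments(
    output_dir="mtcd-xlmr",          # assumed output path
    num_train_epochs=3,              # assumed; tuned via manual/grid search
    per_device_train_batch_size=16,
    learning_rate=2e-5,
    weight_decay=0.01,
)

Trainer(model=model, args=args,
        train_dataset=ThreatDataset(train_texts, train_labels,
                                    tokenizer)).train()

Under the joint-translated variant, the only change would be a preprocessing step that machine-translates every post into a single language before training the same classifier.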
Pages: 106503-106515
Number of pages: 13