Deep-BERT: Transfer Learning for Classifying Multilingual Offensive Texts on Social Media

Cited by: 18
Authors
Wadud, Md Anwar Hussen [1]
Mridha, M. F. [1]
Shin, Jungpil [2]
Nur, Kamruddin [3]
Saha, Aloke Kumar [4]
Affiliations
[1] Bangladesh Univ Business & Technol, Dept Comp Sci & Engn, Dhaka, Bangladesh
[2] Univ Aizu, Sch Comp Sci & Engn, Aizu Wakamatsu, Fukushima, Japan
[3] Amer Int Univ Bangladesh, Dept Comp Sci, Dhaka, Bangladesh
[4] Univ Asia Pacific, Dept Comp Sci & Engn, Dhaka, Bangladesh
Source
Keywords
Offensive text classification; deep convolutional neural network (DCNN); bidirectional encoder representations from transformers (BERT); natural language processing (NLP);
DOI
10.32604/csse.2023.027841
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Offensive messages on social media have recently been used frequently to harass and criticize people. Recent studies have developed many promising algorithms to identify offensive texts. Most of these algorithms analyze text in a unidirectional manner, whereas a bidirectional method can improve performance and capture the semantic and contextual information in sentences. In addition, many separate models exist for identifying offensive texts in either monolingual or multilingual settings, but only a few models can detect both monolingual and multilingual offensive texts. In this study, a detection system for both monolingual and multilingual offensive texts is developed by combining a deep convolutional neural network with bidirectional encoder representations from transformers (Deep-BERT) to identify offensive posts on social media that are used to harass others. This paper explores a variety of ways to deal with multilingualism, including collaborative multilingual and translation-based approaches. Deep-BERT is then tested on Bengali and English datasets with different pre-trained BERT word-embedding techniques, and its efficacy is found to outperform all existing offensive text classification algorithms, reaching an accuracy of 91.83%. The proposed model is a state-of-the-art model that can classify both monolingual and multilingual offensive texts.
Pages: 1775-1791
Number of pages: 17
Related papers
50 in total
  • [41] Exhaustive Study into Machine Learning and Deep Learning Methods for Multilingual Cyberbullying Detection in Bangla and Chittagonian Texts
    Mahmud, Tanjim
    Ptaszynski, Michal
    Masui, Fumito
    [J]. ELECTRONICS, 2024, 13 (09)
  • [42] Opinion Mining From Social Media Short Texts: Does Collective Intelligence Beat Deep Learning?
    Tsapatsoulis, Nicolas
    Djouvas, Constantinos
    [J]. FRONTIERS IN ROBOTICS AND AI, 2019, 5
  • [43] Comparing Deep Learning Models for Multi-label Classification of Arabic Abusive Texts in Social Media
    Azzi, Salma Abid
    Zribi, Chiraz Ben Othmane
    [J]. PROCEEDINGS OF THE 17TH INTERNATIONAL CONFERENCE ON SOFTWARE TECHNOLOGIES (ICSOFT), 2022 : 374 - 381
  • [44] Classifying the content of social media images to support cultural ecosystem service assessments using deep learning models
    Cardoso, Ana Sofia
    Renna, Francesco
    Moreno-Llorca, Ricardo
    Alcaraz-Segura, Domingo
    Tabik, Siham
    Ladle, Richard J.
    Sofia Vaz, Ana
    [J]. ECOSYSTEM SERVICES, 2022, 54
  • [45] Detection of Arabic offensive language in social media using machine learning models
    Mousa, Aya
    Shahin, Ismail
    Nassif, Ali Bou
    Elnagar, Ashraf
    [J]. INTELLIGENT SYSTEMS WITH APPLICATIONS, 2024, 22
  • [46] Predicting Headline Effectiveness in Online News Media using Transfer Learning with BERT
    Tervonen, Jaakko
    Sormunen, Tuomas
    Lamsa, Arttu
    Peltola, Johannes
    Kananen, Heidi
    Jarvinen, Sari
    [J]. PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON DEEP LEARNING THEORY AND APPLICATIONS (DELTA), 2021 : 29 - 37
  • [47] DeepEmotex: Classifying Emotion in Text Messages using Deep Transfer Learning
    Hasan, Maryam
    Rundensteiner, Elke
    Agu, Emmanuel
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2021 : 5143 - 5152
  • [48] Cyberbullying Detection on Social Media Using Stacking Ensemble Learning and Enhanced BERT
    Muneer, Amgad
    Alwadain, Ayed
    Ragab, Mohammed Gamal
    Alqushaibi, Alawi
    [J]. INFORMATION, 2023, 14 (08)
  • [49] A multilingual offensive language detection method based on transfer learning from transformer fine-tuning model
    El-Alami, Fatima-zahra
    Alaoui, Said Ouatik El
    Nahnahi, Noureddine En
    [J]. JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2022, 34 (08) : 6048 - 6056
  • [50] Classifying Misinformation of User Credibility in Social Media Using Supervised Learning
    Asfand-e-Yar, Muhammad
    Hashir, Qadeer
    Tanvir, Syed Hassan
    Khalil, Wajeeha
    [J]. CMC-COMPUTERS MATERIALS & CONTINUA, 2023, 75 (02) : 2921 - 2938