Deep-BERT: Transfer Learning for Classifying Multilingual Offensive Texts on Social Media

Cited by: 18

Authors:

Wadud, Md Anwar Hussen ^{[1]}

Mridha, M. F. ^{[1]}

Shin, Jungpil ^{[2]}

Nur, Kamruddin ^{[3]}

Saha, Aloke Kumar ^{[4]}

Affiliations:

[1] Bangladesh Univ Business & Technol, Dept Comp Sci & Engn, Dhaka, Bangladesh

[2] Univ Aizu, Sch Comp Sci & Engn, Aizu Wakamatsu, Fukushima, Japan

[3] Amer Int Univ Bangladesh, Dept Comp Sci, Dhaka, Bangladesh

[4] Univ Asia Pacific, Dept Comp Sci & Engn, Dhaka, Bangladesh

Source:

COMPUTER SYSTEMS SCIENCE AND ENGINEERING | 2023 / Vol. 44 / No. 02

Keywords:

Offensive text classification; deep convolutional neural network (DCNN); bidirectional encoder representations from transformers (BERT); natural language processing (NLP)

DOI:

10.32604/csse.2023.027841

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

Offensive messages on social media are frequently used to harass and criticize people. Recent studies have produced many promising algorithms for identifying offensive texts. Most of these algorithms analyze text in a unidirectional manner, whereas a bidirectional method can improve performance by capturing the semantic and contextual information in sentences. In addition, many separate models exist for identifying offensive texts in either monolingual or multilingual settings, but few models can detect both monolingual and multilingual offensive texts. In this study, a detection system for both monolingual and multilingual offensive texts is developed by combining a deep convolutional neural network with bidirectional encoder representations from transformers (Deep-BERT) to identify offensive social media posts that are used to harass others. The paper explores several ways of dealing with multilingualism, including collaborative multilingual and translation-based approaches. Deep-BERT is then tested on Bengali and English datasets with different bidirectional encoder representations from transformers (BERT) pre-trained word-embedding techniques, and the proposed Deep-BERT is found to outperform all existing offensive text classification algorithms, reaching an accuracy of 91.83%. The proposed model is a state-of-the-art model that can classify both monolingual and multilingual offensive texts.

Pages: 1775-1791

Page count: 17
