BERT-based Approach to Arabic Hate Speech and Offensive Language Detection in Twitter: Exploiting Emojis and Sentiment Analysis

Cited by: 0

Authors:

Althobaiti, Maha Jarallah [1]

Affiliations:

[1] Taif Univ, Coll Comp & Informat Technol, Dept Comp Sci, Taif 21944, Saudi Arabia

Source:

INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS | 2022 / Vol. 13 / No. 05

Keywords:

Deep learning; hate speech detection; offensive language detection; sentiment analysis; transformer-based model; BERT; emoji; social media

DOI:

10.14569/IJACSA.2022.01305109

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods]

Discipline classification code:

081202

Abstract:

User-generated content on the internet, including social media, may contain offensive language and hate speech, which negatively affect the mental health of the online community and may lead to hate crimes. Intelligent models for the automatic detection of offensive language and hate speech have attracted significant attention recently. In this paper, we propose an automatic method for detecting offensive language and fine-grained hate speech in Arabic tweets. We compare BERT with two conventional machine learning techniques (SVM and logistic regression). We also investigate the use of sentiment analysis and emoji descriptions as features appended to the textual content of the tweets. The experiments show that the BERT-based model gives the best results, surpassing the best benchmark systems in the literature on all three tasks: (a) offensive language detection with 84.3% F1-score, (b) hate speech detection with 81.8% F1-score, and (c) fine-grained hate speech recognition (e.g., race, religion, social class) with 45.1% F1-score. The use of sentiment analysis slightly improves the performance of the models when detecting offensive language and hate speech but has no positive effect when recognising the type of hate speech. The use of textual emoji descriptions as features can improve or degrade the performance of the models, depending on the number of examples per class and on whether the emojis are distinctive features between classes.
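
The idea of appending emoji descriptions and a sentiment label to the tweet text can be illustrated with a minimal sketch. It is not the paper's implementation: the function names `emoji_descriptions` and `build_input`, the `" [SEP] "` separator string, the sentiment tag values, and the use of Unicode character names as emoji descriptions are all assumptions made for illustration. The abstract does not state how descriptions were obtained or how features were joined.

```python
import unicodedata


def emoji_descriptions(text):
    """Return lowercase Unicode names of emoji-like symbols in text.

    Heuristic (an assumption, not the paper's method): characters in the
    Unicode category "So" (Symbol, other) are treated as emojis.
    """
    names = []
    for ch in text:
        if unicodedata.category(ch) == "So":
            name = unicodedata.name(ch, "")
            if name:
                names.append(name.lower())
    return names


def build_input(tweet, sentiment=None, sep=" [SEP] "):
    """Append emoji descriptions and an optional sentiment tag to a tweet.

    The separator and the ordering of the parts are hypothetical choices.
    """
    parts = [tweet]
    descriptions = emoji_descriptions(tweet)
    if descriptions:
        parts.append(" ".join(descriptions))
    if sentiment is not None:
        parts.append(sentiment)
    return sep.join(parts)


if __name__ == "__main__":
    # U+1F621 is the "pouting face" emoji.
    print(build_input("hi \U0001F621", sentiment="negative"))
```

The resulting string could then be passed to any text classifier (a BERT tokenizer, or a TF-IDF vectorizer feeding an SVM or logistic regression model), so the same augmented input serves all three model types compared in the abstract.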

Pages: 972-980

Page count: 9

35 items in total

[21] A Multi-Task Learning Approach to Hate Speech Detection Leveraging Sentiment Analysis
Plaza-Del-Arco, Flor Miriam
Molina-Gonzalez, M. Dolores
Urena-Lopez, L. Alfonso
Martin-Valdivia, Maria Teresa
IEEE ACCESS, 2021, 9 : 112478 - 112489
[23] BERT-Based Model for Aspect-Based Sentiment Analysis for Analyzing Arabic Open-Ended Survey Responses: A Case Study
Alshaikh, Khloud A.
Almatrafi, Omaima A.
Abushark, Yoosef B.
IEEE ACCESS, 2024, 12 : 2288 - 2302
[24] Twitter sentiment analysis: An Arabic text mining approach based on COVID-19
Albahli, Saleh
FRONTIERS IN PUBLIC HEALTH, 2022, 10
[25] Detection of Hate Speech and Offensive Language CodeMix Text in Dravidian Languages Using Cost-Sensitive Learning Approach
Sreelakshmi, K.
Premjith, B.
Chakravarthi, Bharathi Raja
Soman, K. P.
IEEE ACCESS, 2024, 12 : 20064 - 20090
[26] Aspect-based Sentiment Analysis and Location Detection for Arabic Language Tweets
AlShammari, Norah
AlMansour, Amal
APPLIED COMPUTER SYSTEMS, 2022, 27 (02) : 119 - 127
[27] Fine-Grained Sentiment Analysis of Arabic COVID-19 Tweets Using BERT-Based Transformers and Dynamically Weighted Loss Function
Alturayeif, Nora
Luqman, Hamzah
APPLIED SCIENCES-BASEL, 2021, 11 (22):
[28] A BERT based Sentiment Analysis and Key Entity Detection Approach for Online Financial Texts
Zhao, Lingyun
Li, Lin
Zheng, Xinhao
Zhang, Jianwei
PROCEEDINGS OF THE 2021 IEEE 24TH INTERNATIONAL CONFERENCE ON COMPUTER SUPPORTED COOPERATIVE WORK IN DESIGN (CSCWD), 2021 : 1233 - 1238
[29] tRF-BERT: A transformative approach to aspect-based sentiment analysis in the bengali language
Ahmed, Shihab
Samia, Moythry Manir
Sayma, Maksuda Haider
Kabir, Md. Mohsin
Mridha, M. F.
PLOS ONE, 2024, 19 (09):
[30] A Hybrid Approach to Dimensional Aspect-Based Sentiment Analysis Using BERT and Large Language Models
Zhang, Yice
Xu, Hongling
Zhang, Delong
Xu, Ruifeng
ELECTRONICS, 2024, 13 (18)
