BERT-based Approach to Arabic Hate Speech and Offensive Language Detection in Twitter: Exploiting Emojis and Sentiment Analysis

Authors
Althobaiti, Maha Jarallah [1]
Affiliations
[1] Taif Univ, Coll Comp & Informat Technol, Dept Comp Sci, Taif 21944, Saudi Arabia
Keywords
Deep learning; hate speech detection; offensive language detection; sentiment analysis; transformer-based model; BERT; emoji; social media
DOI
10.14569/IJACSA.2022.01305109
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
User-generated content on the internet, including content on social media, may contain offensive language and hate speech, which negatively affect the mental health of the whole internet society and may lead to hate crimes. Intelligent models for the automatic detection of offensive language and hate speech have attracted significant attention recently. In this paper, we propose an automatic method for detecting offensive language and fine-grained hate speech in Arabic tweets. We compare BERT with two conventional machine learning techniques (SVM and logistic regression). We also investigate the use of sentiment analysis and emoji descriptions as features appended to the textual content of the tweets. The experiments show that the BERT-based model gives the best results, surpassing the best benchmark systems in the literature on all three tasks: (a) offensive language detection with 84.3% F1-score, (b) hate speech detection with 81.8% F1-score, and (c) fine-grained hate speech recognition (e.g., race, religion, social class) with 45.1% F1-score. The use of sentiment analysis slightly improves the performance of the models when detecting offensive language and hate speech but has no positive effect when recognising the type of hate speech. The use of textual emoji descriptions as features can improve or degrade the performance of the models, depending on the number of examples per class and on whether the emojis are among the features that distinguish the classes.
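The abstract mentions appending emoji descriptions and a sentiment label to the tweet text before classification. The following is a minimal sketch of one way such an input string could be assembled. It is not the paper's implementation: the use of Unicode character names as emoji descriptions, the `" [SEP] "` separator, and the function names `emoji_descriptions` and `build_input` are all assumptions made for illustration.

```python
import unicodedata


def emoji_descriptions(text):
    """Return lowercase Unicode names of emoji-like symbols in `text`.

    Assumption: the Unicode character name stands in for the emoji's
    textual description; the paper may use a different source.
    """
    names = []
    for ch in text:
        # Category "So" (Symbol, other) covers most pictographic emojis;
        # Arabic letters fall under "Lo" and are therefore skipped.
        if unicodedata.category(ch) == "So":
            name = unicodedata.name(ch, "")
            if name:
                names.append(name.lower())
    return names


def build_input(tweet, sentiment=None, sep=" [SEP] "):
    """Append emoji descriptions and an optional sentiment label to a tweet.

    Assumption: features are joined with a separator string; the actual
    concatenation scheme is not given in the abstract.
    """
    parts = [tweet]
    descriptions = emoji_descriptions(tweet)
    if descriptions:
        parts.append(" ".join(descriptions))
    if sentiment is not None:
        parts.append(sentiment)
    return sep.join(parts)


if __name__ == "__main__":
    print(build_input("hi \U0001F602", sentiment="negative"))
```

The resulting string could then be passed to a tokenizer in place of the raw tweet. Whether the extra features help would depend, as the abstract notes, on the task and on the number of examples per class.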
Pages: 972-980
Number of pages: 9