Code-mixing unveiled: Enhancing the hate speech detection in Arabic dialect tweets using machine learning models

Cited by: 0
Authors
Alhazmi, Ali [1 ,2 ]
Mahmud, Rohana [1 ]
Idris, Norisma [1 ]
Abo, Mohamed Elhag Mohamed [1 ]
Eke, Christopher Ifeanyi [3 ]
Affiliations
[1] Univ Malaya, Fac Comp Sci & Informat Technol, Kuala Lumpur, Malaysia
[2] Jazan Univ, Coll Engn & Comp Sci, Dept Comp Sci, Jazan, Saudi Arabia
[3] Fed Univ Lafia, Fac Comp, Dept Comp Sci, Lafia, Nasarawa State, Nigeria
Source
PLOS ONE | 2024 / Vol. 19 / Issue 07
Keywords
LANGUAGE;
DOI
10.1371/journal.pone.0305657
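Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract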
Technological developments over the past few decades have changed the way people communicate, with platforms like social media and blogs becoming vital channels for international conversation. Even though hate speech is vigorously suppressed on social media, it is still a concern that needs to be constantly recognized and observed. The Arabic language poses particular difficulties in the detection of hate speech, despite the considerable efforts made in this area for English-language social media content. Arabic calls for particular consideration when it comes to hate speech detection because of its many dialects and linguistic nuances. Another degree of complication is added by the widespread practice of "code-mixing," in which users merge various languages smoothly. Recognizing this research vacuum, the study aims to close it by examining how well machine learning models containing variation features can detect hate speech, especially when it comes to Arabic tweets featuring code-mixing. Therefore, the objective of this study is to assess and compare the effectiveness of different features and machine learning models for hate speech detection on Arabic hate speech and code-mixing hate speech datasets. To achieve the objectives, the methodology used includes data collection, data pre-processing, feature extraction, the construction of classification models, and the evaluation of the constructed classification models. The findings from the analysis revealed that the TF-IDF feature, when employed with the SGD model, attained the highest accuracy, reaching 98.21%. Subsequently, these results were contrasted with outcomes from three existing studies, and the proposed method outperformed them, underscoring the significance of the proposed method. Consequently, our study carries practical implications and serves as a foundational exploration in the realm of automated hate speech detection in text.
Pages: 24