Intelligent detection of hate speech in Arabic social network: A machine learning approach

Cited by: 45
Authors
Aljarah, Ibrahim [1]
Habib, Maria [1]
Hijazi, Neveen [1]
Faris, Hossam [1]
Qaddoura, Raneem [2]
Hammo, Bassam [1]
Abushariah, Mohammad [1]
Alfawareh, Mohammad [1]
Affiliations
[1] Univ Jordan, Queen Rania Str, Amman 19328, Jordan
[2] Philadelphia Univ, Amman, Jordan
Keywords
Hate speech; machine learning; text vectorization; Twitter; sentiment analysis
DOI
10.1177/0165551520917651
Chinese Library Classification (CLC)
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Cyber hate speech is growing rapidly and poses a serious problem worldwide by threatening the cohesion of civil societies. Hate speech refers to the use of expressions or phrases that are violent, offensive or insulting toward a person or a minority group. In the Arab region in particular, the number of social media users is growing rapidly, accompanied by a sharp increase in cyber hate speech. This motivated us to work toward healthy online environments that are free of hatred and discrimination. This article therefore aims to detect cyber hate speech in the Arabic context on the Twitter platform by applying Natural Language Processing (NLP) techniques and machine learning methods. The article considers a set of tweets related to racism, journalism, sports orientation, terrorism and Islam. Several types of features and emotions are extracted and arranged in 15 different combinations of data. The processed dataset is evaluated using Support Vector Machine (SVM), Naive Bayes (NB), Decision Tree (DT) and Random Forest (RF) classifiers; RF with the feature set of Term Frequency-Inverse Document Frequency (TF-IDF) and profile-related features achieves the best results. Furthermore, a feature importance analysis based on the RF classifier is conducted to quantify the predictive ability of the features with regard to the hate class.
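As a rough illustration of the feature construction the abstract describes, the sketch below computes TF-IDF weights for a few tokenized tweets and appends numeric profile-related features to each vector. It is not the authors' implementation: the function names, the placeholder tokens, the two profile fields (follower count and a verified flag) and the unsmoothed weighting tf × log(N/df) are all assumptions chosen for brevity. The resulting vectors could then be passed to a classifier such as a Random Forest.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, vectors) using the weighting tf * log(N / df).

    This is one common unsmoothed TF-IDF variant, assumed here for
    illustration; the record does not state which variant the paper uses.
    """
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for doc in docs for tok in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against empty documents
        vectors.append(
            [(counts[t] / total) * math.log(n_docs / df[t]) for t in vocab]
        )
    return vocab, vectors


def add_profile_features(vectors, profiles):
    """Append numeric profile features to each text vector (hypothetical helper)."""
    return [vec + list(profile) for vec, profile in zip(vectors, profiles)]


# Placeholder tokens stand in for preprocessed Arabic tweet tokens.
tweets = [["w1", "w2", "w2"], ["w1", "w3"], ["w1", "w4", "w4", "w4"]]
# Hypothetical profile features: (follower count, verified flag).
profiles = [(120, 0), (15, 1), (3400, 0)]

vocab, vecs = tfidf_vectors(tweets)
features = add_profile_features(vecs, profiles)
```

A term that occurs in every tweet (here `w1`) receives a weight of zero under this variant, which is why smoothed formulations are often preferred in practice.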
Pages: 483-501
Number of pages: 19
Related papers
50 in total
  • [1] Hate and offensive speech detection on Arabic social media
    Alsafari S.
    Sadaoui S.
    Mouhoub M.
    [J]. Online Social Networks and Media, 2020, 19
  • [2] Machine Learning Approach for the Detection of Hate Speech in Sinhala Unicode Text
    Samarasinghe, S. W. A. M. D.
    Meegama, R. G. N.
    Punchimudiyanse, M.
    [J]. 2020 20TH INTERNATIONAL CONFERENCE ON ADVANCES IN ICT FOR EMERGING REGIONS (ICTER-2020), 2020, : 65 - 70
  • [3] An efficient approach for data-imbalanced hate speech detection in Arabic social media
    Mohamed, Mohamed S.
    Elzayady, Hossam
    Badran, Khaled M.
    Salama, Gouda I.
    [J]. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2023, 45 (04) : 6381 - 6390
  • [4] Sinhala Hate Speech Detection in Social Media Using Machine Learning and Deep Learning
    Fernando, W. S. S.
    Weerasinghe, Ruvan
    Bandara, E. R. A. D.
    [J]. 2022 22ND INTERNATIONAL CONFERENCE ON ADVANCES IN ICT FOR EMERGING REGIONS (ICTER), 2022,
  • [5] Hate Speech Detection in Social Networks using Machine Learning and Deep Learning Methods
    Toktarova, Aigerim
    Syrlybay, Dariga
    Myrzakhmetova, Bayan
    Anuarbekova, Gulzat
    Rakhimbayeva, Gulbarshin
    Zhylanbaeva, Balkiya
    Suieuova, Nabat
    Kerimbekov, Mukhtar
    [J]. INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2023, 14 (05) : 396 - 406
  • [6] Detection of hate speech in Arabic tweets using deep learning
    Al-Hassan, Areej
    Al-Dossari, Hmood
    [J]. MULTIMEDIA SYSTEMS, 2022, 28 (06) : 1963 - 1974
  • [7] Detection of hate speech in Arabic tweets using deep learning
    Areej Al-Hassan
    Hmood Al-Dossari
    [J]. Multimedia Systems, 2022, 28 : 1963 - 1974
  • [8] A comparative analysis of machine learning algorithms for hate speech detection in social media
    Omran, Esraa
    Al Tararwah, Estabraq
    Al Qundus, Jamal
    [J]. ONLINE JOURNAL OF COMMUNICATION AND MEDIA TECHNOLOGIES, 2023, 13 (04):
  • [9] Advances in Machine Learning Algorithms for Hate Speech Detection in Social Media: A Review
    Mullah, Nanlir Sallau
    Zainon, Wan Mohd Nazmee Wan
    [J]. IEEE ACCESS, 2021, 9 : 88364 - 88376
  • [10] Twitter Hate Speech Detection using Machine Learning
    Janardhan, G.
    Saikiran, Bollu
    Reddy, Inugala Swanith
    Abhishek, Mogilicherla
    [J]. 2024 4TH INTERNATIONAL CONFERENCE ON PERVASIVE COMPUTING AND SOCIAL NETWORKING, ICPCSN 2024, 2024, : 270 - 278