The persistence of hate speech remains a major obstacle in online social media. Despite the continuous evolution of advanced models for identifying hate speech, the critical dimensions of interpretability and explainability have not received proportional scholarly attention. In this article, we introduce HateInsights, a new benchmark dataset for hate speech that encompasses diverse aspects of this widespread issue. Each post in the dataset is annotated from two perspectives: first, according to the established 3-class classification paradigm comprising hate speech, offensive language, and normal discourse; second, with rationales that mark the specific segments of a post supporting the assigned label. Evaluating state-of-the-art models on this benchmark yields a significant finding: even models that achieve strong classification performance produce suboptimal results on crucial explainability metrics such as plausibility and faithfulness. Furthermore, our analysis points to a promising direction for models trained with human-annotated rationales. To facilitate scholarly progress in this area, we make both our dataset and codebase available to the research community. This initiative aims to encourage collaborative involvement and to advance hate speech detection approaches characterized by greater transparency, clarity, and fairness.