Systematic keyword and bias analyses in hate speech detection

Cited by: 1
Authors
Sarracen, Gretel Liz De la Peña [1]
Rosso, Paolo [1]
Affiliations
[1] Univ Politecn Valencia, Camino Vera S-N, Valencia 46022, Spain
Keywords
Hate speech detection; Keyword extraction; Bias analysis; Bias mitigation
DOI
10.1016/j.ipm.2023.103433
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Hate speech detection refers broadly to the automatic identification of language that may be considered discriminatory against certain groups of people. The goal is to help online platforms identify and remove harmful content. Humans can usually detect hatred even in critical cases, such as when it is not explicit, but how do computer models handle such cases? In this work, we aim to contribute to the understanding of ethical issues related to hate speech by analysing two transformer-based models trained to detect hate speech. Our study focuses on the relationship between these models and a set of hateful keywords extracted from three well-known datasets. To extract the keywords, we propose a metric that takes the division among classes into account in order to favour the words most common in hateful contexts. In our experiments, we first measure the overlap between the extracted keywords and the words to which the models pay the most attention in decision-making. We then investigate the bias of the models towards the extracted keywords. For the bias analysis, we characterize and use two metrics and evaluate two strategies for mitigating the bias. Surprisingly, we show that over 50% of the salient words of the models are not hateful and that there is a higher number of hateful words among the extracted keywords. However, we also show that the models appear to be biased towards the extracted keywords. Experimental results suggest that fitting the models with hateful texts that do not contain any of the keywords can reduce bias and improve the performance of the models.
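The abstract does not give the keyword metric or the overlap measure in detail, so the following is only a rough sketch of the general idea, not the authors' method. It assumes binary labels (1 = hateful), whitespace tokenization, and a made-up score that multiplies the share of a word's occurrences found in hateful texts by the fraction of hateful texts containing it. The function names `hateful_keyword_scores`, `top_keywords`, and `overlap_ratio` are hypothetical.

```python
from collections import Counter


def hateful_keyword_scores(texts, labels, min_count=2):
    """Score words by how strongly they concentrate in the hateful class.

    Assumed, illustrative score (NOT the metric proposed in the paper):
    (hateful docs containing w / all docs containing w)
    * (hateful docs containing w / number of hateful docs).
    """
    hate_df, total_df = Counter(), Counter()
    n_hate = sum(1 for y in labels if y == 1)
    for text, y in zip(texts, labels):
        # Count each word once per document (document frequency).
        for word in set(text.lower().split()):
            total_df[word] += 1
            if y == 1:
                hate_df[word] += 1
    scores = {}
    for word, df in total_df.items():
        if df < min_count or n_hate == 0:
            continue  # skip rare words and the degenerate no-hate case
        scores[word] = (hate_df[word] / df) * (hate_df[word] / n_hate)
    return scores


def top_keywords(scores, k):
    """Return the k highest-scoring words, ties broken alphabetically."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:k]]


def overlap_ratio(keywords, salient_words):
    """Fraction of a model's salient words that are also extracted keywords."""
    salient = set(salient_words)
    if not salient:
        return 0.0
    return len(set(keywords) & salient) / len(salient)
```

In practice the salient words would come from an attribution or attention analysis of the trained model. Here they are simply passed in as a list, since the abstract does not say how they were obtained.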
Pages: 14
Related papers
50 in total
  • [41] Twitter Hate Speech Detection: A Systematic Review of Methods, Taxonomy Analysis, Challenges, and Opportunities
    Mansur, Zainab
    Omar, Nazlia
    Tiun, Sabrina
    [J]. IEEE ACCESS, 2023, 11 : 16226 - 16249
  • [42] Hate Speech is not Free Speech: Explainable Machine Learning for Hate Speech Detection in Code-Mixed Languages
    Yadav, Sargam
    Kaushik, Abhishek
    McDaid, Kevin
    [J]. 2023 IEEE INTERNATIONAL SYMPOSIUM ON TECHNOLOGY AND SOCIETY, ISTAS, 2023,
  • [43] A valid question: Could hate speech condition bias in the brain?
    Murrow, Gail B.
    Murrow, Richard
    [J]. JOURNAL OF LAW AND THE BIOSCIENCES, 2016, 3 (01): : 196 - 201
  • [44] A survey of hate speech detection in Indian languages
    Arpan Nandi
    Kamal Sarkar
    Arjun Mallick
    Arkadeep De
    [J]. Social Network Analysis and Mining, 14
  • [45] Hate Speech Detection Using Brazilian Imageboards
    Nascimento, Gabriel
    Carvalho, Flavio
    da Cunha, Alexandre Martins
    Viana, Carlos Roberto
    Guedes, Gustavo Paiva
    [J]. WEBMEDIA 2019: PROCEEDINGS OF THE 25TH BRAZILIAN SYMPOSIUM ON MULTIMEDIA AND THE WEB, 2019, : 325 - 328
  • [46] A comparison of classification algorithms for hate speech detection
    Putri, T. T. A.
    Sriadhi, S.
    Sari, R. D.
    Rahmadani, R.
    Hutahaean, H. D.
    [J]. INTERNATIONAL CONFERENCE ON INNOVATION IN ENGINEERING AND VOCATIONAL EDUCATION 2019 (ICIEVE 2019), PTS 1-4, 2020, 830
  • [47] Is hate speech detection the solution the world wants?
    Parker, Sara
    Ruths, Derek
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2023, 120 (10)
  • [48] A survey of hate speech detection in Indian languages
    Nandi, Arpan
    Sarkar, Kamal
    Mallick, Arjun
    De, Arkadeep
    [J]. SOCIAL NETWORK ANALYSIS AND MINING, 2024, 14 (01)
  • [49] A Turkish Hate Speech Dataset and Detection System
    Beyhan, Fatih
    Carik, Buse
    Arin, Inanc
    Terzioglu, Aysecan
    Yanikoglu, Berrin
    Yeniterzi, Reyyan
    [J]. LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 4177 - 4185
  • [50] Deep Learning Ensembles for Hate Speech Detection
    Alsafari, Safa
    Sadaoui, Samira
    Mouhoub, Malek
    [J]. 2020 IEEE 32ND INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI), 2020, : 526 - 531