Systematic keyword and bias analyses in hate speech detection

Cited by: 1
Authors
Sarracén, Gretel Liz De la Peña [1]
Rosso, Paolo [1]
Affiliations
[1] Univ Politecn Valencia, Camino Vera S-N, Valencia 46022, Spain
Keywords
Hate speech detection; Keyword extraction; Bias analysis; Bias mitigation
DOI
10.1016/j.ipm.2023.103433
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Hate speech detection refers broadly to the automatic identification of language that may be considered discriminatory against certain groups of people. The goal is to help online platforms identify and remove harmful content. Humans are usually capable of detecting hatred in critical cases, such as when the hatred is non-explicit, but how do computer models address this situation? In this work, we aim to contribute to the understanding of ethical issues related to hate speech by analysing two transformer-based models trained to detect hate speech. Our study focuses on the relationship between these models and a set of hateful keywords extracted from three well-known datasets. To extract the keywords, we propose a metric that takes into account the division among classes to favour the words most common in hateful contexts. In our experiments, we first compare the overlap between the extracted keywords and the words to which the models pay the most attention in decision-making. We then investigate the bias of the models towards the extracted keywords. For the bias analysis, we characterize and use two metrics and evaluate two strategies to try to mitigate the bias. Surprisingly, we show that over 50% of the salient words of the models are not hateful and that there is a higher number of hateful words among the extracted keywords. However, we show that the models appear to be biased towards the extracted keywords. Experimental results suggest that fitting models with hateful texts that do not contain any of the keywords can reduce bias and improve the performance of the models.
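The abstract does not give the proposed keyword metric, so the following is only an illustrative sketch of the general idea: score words by how strongly they favour the hateful class, then measure how many of the top-scoring keywords also appear among a model's salient words. The smoothed log-ratio score, the function names (`class_aware_scores`, `top_keywords`, `overlap_ratio`), the smoothing parameter `alpha`, and the toy texts with the placeholder token `slur_a` are all assumptions for illustration, not the authors' method or data.

```python
import math
from collections import Counter


def class_aware_scores(texts, labels, alpha=1.0):
    """Score words by a smoothed log-ratio of document frequency
    in hateful (label 1) versus non-hateful (label 0) texts.

    This is a stand-in metric (an assumption), not the one in the paper.
    """
    hate_df, other_df = Counter(), Counter()
    n_hate = n_other = 0
    for text, label in zip(texts, labels):
        words = set(text.lower().split())  # count each word once per text
        if label == 1:
            hate_df.update(words)
            n_hate += 1
        else:
            other_df.update(words)
            n_other += 1
    vocab = set(hate_df) | set(other_df)
    return {
        w: math.log((hate_df[w] + alpha) / (n_hate + 2 * alpha))
        - math.log((other_df[w] + alpha) / (n_other + 2 * alpha))
        for w in vocab
    }


def top_keywords(scores, k):
    """Return the k highest-scoring words (ties broken alphabetically)."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [w for w, _ in ranked[:k]]


def overlap_ratio(keywords, salient_words):
    """Fraction of the keywords that also occur among the salient words."""
    keywords = set(keywords)
    if not keywords:
        return 0.0
    return len(keywords & set(salient_words)) / len(keywords)


# Toy corpus with a placeholder token instead of real offensive terms.
texts = [
    "slur_a people are awful",
    "i love people",
    "slur_a go away",
    "people go home",
]
labels = [1, 0, 1, 0]

scores = class_aware_scores(texts, labels)
keywords = top_keywords(scores, 1)
print(keywords)
print(overlap_ratio(["slur_a", "awful"], ["people", "awful"]))
```

In this sketch, a word that appears equally often in both classes (such as `go` above) scores zero, so only words skewed towards hateful texts rank highly. The salient words passed to `overlap_ratio` would in practice come from an attribution or attention analysis of the trained model, which is outside the scope of this example.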
Pages: 14