Data Augmentation for Improving Explainability of Hate Speech Detection

Authors
Ansari, Gunjan [1]
Kaur, Parmeet [2]
Saxena, Chandni [3]
Affiliations
[1] JSS Acad Tech Educ, Dept Informat Technol, Noida, India
[2] Jaypee Inst Informat Technol, Dept Comp Sci & Informat Technol, Noida, India
[3] Chinese Univ Hong Kong, SAR, Hong Kong, Peoples R China
Keywords
Hate speech; Cyberbullying; Explainable AI; Data augmentation; LIME; Integrated gradient
DOI
10.1007/s13369-023-08100-4
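Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract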
This paper presents a novel data augmentation-based approach for developing explainable deep learning models for hate speech detection. Hate speech is widespread on online social media but difficult to detect automatically, owing to the general challenges of natural language processing and the complexity of hate speech itself. Moreover, the decisions of existing solutions offer limited explainability because little annotated data is available for training and testing models. This work therefore proposes text-based data augmentation to improve both the performance and the explainability of deep learning models. Augmentation techniques based on easy data augmentation (EDA), bidirectional encoder representations from transformers (BERT), and back translation are used. Convolutional neural network (CNN) and long short-term memory (LSTM) models are trained on the augmented data and evaluated on two publicly available datasets for hate speech detection. LIME and integrated gradients are used to obtain explanations of the models' predictions. A diagnostic study on test samples checks whether the models improve as a result of the data augmentation. The experimental results show that the proposed approach improves both the explainability and the accuracy of hate speech detection.
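As a rough illustration of the easy data augmentation family mentioned in the abstract, the sketch below implements two of its operations, random swap and random deletion, on whitespace tokens. It is not the authors' implementation: the function names, the default probabilities, and the alternation between the two operations are assumptions made for this example, and the thesaurus-based operations (synonym replacement, random insertion) are omitted because they need an external lexical resource.

```python
import random


def random_swap(tokens, n_swaps, rng):
    """Return a copy of tokens with n_swaps random pairs of positions swapped."""
    out = list(tokens)
    if len(out) < 2:
        return out  # nothing to swap
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(tokens, p, rng):
    """Drop each token with probability p, always keeping at least one token."""
    if len(tokens) <= 1:
        return list(tokens)
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


def augment(text, n_aug=2, p_delete=0.1, n_swaps=1, seed=0):
    """Produce n_aug perturbed copies of text (hypothetical helper).

    Even-numbered copies use random swap, odd-numbered ones random deletion.
    """
    rng = random.Random(seed)
    tokens = text.split()
    variants = []
    for k in range(n_aug):
        if k % 2 == 0:
            new_tokens = random_swap(tokens, n_swaps, rng)
        else:
            new_tokens = random_deletion(tokens, p_delete, rng)
        variants.append(" ".join(new_tokens))
    return variants


if __name__ == "__main__":
    for variant in augment("this comment is clearly meant as an insult", n_aug=4):
        print(variant)
```

Each variant would keep the label of the original sentence, so a small annotated set yields a larger training set; whether that also improves explanation quality is the question the paper's diagnostic study addresses.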
Pages: 3609-3621
Number of pages: 13