Data Augmentation for Improving Explainability of Hate Speech Detection

Cited by: 0
Authors
Ansari, Gunjan [1]
Kaur, Parmeet [2]
Saxena, Chandni [3]
Affiliations
[1] JSS Acad Tech Educ, Dept Informat Technol, Noida, India
[2] Jaypee Inst Informat Technol, Dept Comp Sci & Informat Technol, Noida, India
[3] Chinese Univ Hong Kong, SAR, Hong Kong, Peoples R China
Keywords
Hate speech; Cyberbullying; Explainable AI; Data augmentation; LIME; Integrated gradient
DOI
10.1007/s13369-023-08100-4
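Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09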
Abstract
The paper presents a novel data augmentation-based approach for developing explainable deep learning models for hate speech detection. Hate speech is widespread on online social media but is difficult to detect automatically because of the challenges of natural language processing and the complexity of hate speech itself. Further, the decisions of existing solutions have limited explainability, since only limited annotated data are available for training and testing models. Therefore, this work proposes the use of text-based data augmentation to improve both the performance and the explainability of deep learning models. Techniques based on easy data augmentation, bidirectional encoder representations from transformers (BERT) and back translation are used for data augmentation. Convolutional neural network and long short-term memory models are trained with the augmented data and evaluated on two publicly available hate speech detection datasets. LIME and integrated gradients are used to retrieve explanations of the deep learning models. A diagnostic study is conducted on test samples to check whether the models improve as a result of the data augmentation. The experimental results verify that the proposed approach improves the explainability as well as the accuracy of hate speech detection.
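As a rough illustration of the "easy data augmentation" family of techniques mentioned in the abstract, the sketch below implements two of its standard operations, random swap and random deletion, on whitespace-tokenized text. It is not the authors' implementation: the function names, the parameters (`n_aug`, `p_delete`, `n_swaps`, `seed`) and the alternating schedule are assumptions chosen for illustration, and synonym replacement and random insertion are omitted because they need a lexical resource.

```python
import random


def random_swap(words, n_swaps, rng):
    """Return a copy of words with two random positions swapped, n_swaps times."""
    words = list(words)
    if len(words) < 2:
        return words
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return words


def random_deletion(words, p, rng):
    """Drop each word independently with probability p; never return an empty list."""
    if len(words) <= 1:
        return list(words)
    kept = [w for w in words if rng.random() >= p]
    return kept if kept else [rng.choice(words)]


def augment(text, n_aug=4, p_delete=0.1, n_swaps=1, seed=0):
    """Produce n_aug perturbed variants of text (hypothetical helper).

    Even-indexed variants use random swap, odd-indexed ones use random
    deletion. The label of the original sample is assumed to carry over.
    """
    rng = random.Random(seed)  # local RNG keeps the output reproducible
    words = text.split()
    variants = []
    for k in range(n_aug):
        if k % 2 == 0:
            new_words = random_swap(words, n_swaps, rng)
        else:
            new_words = random_deletion(words, p_delete, rng)
        variants.append(" ".join(new_words))
    return variants


if __name__ == "__main__":
    for variant in augment("this is a placeholder training sentence"):
        print(variant)
```

Each variant would be added to the training set with the label of the sentence it was derived from; whether a given perturbation preserves the label is something that has to be checked for the dataset at hand.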
Pages: 3609-3621
Page count: 13
Related papers
50 in total
  • [21] A comprehensive review on detection of hate speech for multi-lingual data
    Rachna Narula
    Poonam Chaudhary
    Social Network Analysis and Mining, 14 (1)
  • [22] Multilingual Hate Speech Detection: Innovations in Optimized Deep Learning for English and Arabic Hate Speech Detection
    Hassan AL-Sukhani
    Qusay Bsoul
    Abdelrahman H. Elhawary
    Ziad M. Nasr
    Ahmed E. Mansour
    Radwan M. Batyha
    Basma S. Alqadi
    Jehad Saad Alqurni
    Hayat Alfagham
    Magda M. Madbouly
    SN Computer Science, 6 (3)
  • [23] Data expansion using back translation and paraphrasing for hate speech detection
    Beddiar D.R.
    Jahan M.S.
    Oussalah M.
    Online Social Networks and Media, 2021, 24
  • [24] On Online Hate Speech Detection. Effects of Negated Data Construction
    Abderrouaf, Cheniki
    Oussalah, Mourad
    2019 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2019 : 5595 - 5602
  • [25] Bias Detection and Mitigation in Textual Data: A Study on Fake News and Hate Speech Detection
    Kasampalis, Apostolos
    Chatzakou, Despoina
    Tsikrika, Theodora
    Vrochidis, Stefanos
    Kompatsiaris, Ioannis
    ADVANCES IN INFORMATION RETRIEVAL, ECIR 2024, PT III, 2024, 14610 : 374 - 383
  • [26] Improving Intrusion Detection Through Training Data Augmentation
    Otokwala, Uneneibotejit
    Petrovski, Andrei
    Kalutarage, Harsha
    2021 14TH INTERNATIONAL CONFERENCE ON SECURITY OF INFORMATION AND NETWORKS (SIN 2021), 2021
  • [27] Improving Low Resource Turkish Speech Recognition with Data Augmentation and TTS
    Gokay, Ramazan
    Yalcin, Hulya
    2019 16TH INTERNATIONAL MULTI-CONFERENCE ON SYSTEMS, SIGNALS & DEVICES (SSD), 2019 : 357 - 360
  • [28] Towards improving automatic speech recognition for underrepresented dialects with data augmentation
    Bakst, Sarah
    Yilmaz, Emre
    Castan, Diego
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2023, 153 (03)
  • [29] Improving speech recognition using data augmentation and acoustic model fusion
    Rebai, Ilyes
    BenAyed, Yessine
    Mahdi, Walid
    Lorre, Jean-Pierre
    KNOWLEDGE-BASED AND INTELLIGENT INFORMATION & ENGINEERING SYSTEMS, 2017, 112 : 316 - 322
  • [30] Significance of Data Augmentation for Improving Cleft Lip and Palate Speech Recognition
    Sudro, Protima Nomo
    Das, Rohan Kumar
    Sinha, Rohit
    Prasanna, S. R. Mahadeva
    2021 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2021 : 484 - 490