Improving Hate Speech Detection Accuracy using Hybrid CNN-RNN and Random Oversampling Techniques

Cited by: 0

Authors:

Riyadi, Slamet [1]

Andriyani, Annisa Divayu [1]

Masyhur, Ahmad Musthafa [1]

Affiliations:

[1] Univ Muhammadiyah Yogyakarta, Dept Informat Technol, Yogyakarta, Indonesia

Source:

2024 IEEE SYMPOSIUM ON INDUSTRIAL ELECTRONICS AND APPLICATIONS, ISIEA 2024 | 2024

Keywords:

hate speech; Twitter; hybrid CNN-RNN; balancing dataset; oversampling

DOI:

10.1109/ISIEA61920.2024.10607232

Chinese Library Classification (CLC) number:

T [Industrial Technology]

Subject classification code:

08

Abstract:

Detecting hate speech is crucial for addressing online toxicity and fostering a secure digital environment. This study aims to improve the accuracy of hybrid CNN-RNN models, which are commonly used for this task. By combining the model with random oversampling, the research aims to classify instances of hate speech more reliably, particularly in imbalanced datasets. The dataset used is the Indonesian Tweet Hate Speech dataset. Following an established protocol of data pre-processing, training, and testing, significant improvements in accuracy are observed. With imbalanced data, the hybrid CNN-RNN achieves 0.827 accuracy, 0.797 precision, 0.759 recall, and 0.883 F1 score. With balanced data, the model performs better still, reaching 0.908 accuracy, 0.943 precision, 0.894 recall, and 0.914 F1 score. Notably, the proposed model outperforms the standard hybrid CNN-RNN on imbalanced data, which achieves an accuracy of 0.752, precision of 0.797, recall of 0.559, and F1 score of 0.657. Techniques such as dropout and early termination mitigate overfitting in complex models and large datasets. This research contributes to hate speech detection methods, underscoring the hybrid CNN-RNN's efficacy in handling imbalanced data, while future studies could explore additional methodologies for further enhancements.
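As a rough illustration of the balancing step the abstract refers to, the following is a minimal sketch of random oversampling in plain Python, together with the standard F1 formula (the harmonic mean of precision and recall). It is not the authors' implementation: the function names `random_oversample` and `f1_score`, the seed, and the toy data are all hypothetical and chosen only for this example.

```python
import random
from collections import Counter


def random_oversample(texts, labels, seed=0):
    """Duplicate randomly chosen samples of each smaller class until every
    class has as many samples as the largest class (illustrative sketch)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    # Start from the original data; only duplicates are appended.
    out_texts, out_labels = list(texts), list(labels)
    for cls, n in counts.items():
        pool = [t for t, y in zip(texts, labels) if y == cls]
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_texts.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_texts, out_labels


def f1_score(precision, recall):
    """F1 as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical, not taken from the Indonesian tweet dataset):
# four samples of class 0 and one sample of class 1.
texts = ["tweet a", "tweet b", "tweet c", "tweet d", "tweet e"]
labels = [0, 0, 0, 0, 1]

bal_texts, bal_labels = random_oversample(texts, labels)
print(Counter(bal_labels))

# Baseline precision and recall quoted in the abstract.
print(round(f1_score(0.797, 0.559), 3))  # 0.657
```

In practice, oversampling is normally applied to the training split only, so that duplicated samples do not leak into the test set; whether the paper does this is not stated in the abstract.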

Pages: 5

47 items in total

[1] Improving Hate Speech Detection Using Double-Layers Hybrid CNN-RNN Model on Imbalanced Dataset
Riyadi, Slamet
Divayu Andriyani, Annisa
Noraini Sulaiman, Siti
IEEE Access, 2024, 12: 159660 - 159668
[2] Improving CNN-RNN Hybrid Networks for Handwriting Recognition
Dutta, Kartik
Krishnan, Praveen
Mathew, Minesh
Jawahar, C. V.
PROCEEDINGS 2018 16TH INTERNATIONAL CONFERENCE ON FRONTIERS IN HANDWRITING RECOGNITION (ICFHR), 2018: 80 - 85
[3] Unconstrained OCR for Urdu using Deep CNN-RNN Hybrid Networks
Jain, Mohit
Mathew, Minesh
Jawahar, C. V.
PROCEEDINGS 2017 4TH IAPR ASIAN CONFERENCE ON PATTERN RECOGNITION (ACPR), 2017: 747 - 752
[4] Detection of Deepfake Media Using a Hybrid CNN-RNN Model and Particle Swarm Optimization (PSO) Algorithm
Al-Adwan, Aryaf
Alazzam, Hadeel
Al-Anbaki, Noor
Alduweib, Eman
COMPUTERS, 2024, 13 (04)
[5] Detection of fake news using deep learning CNN-RNN based methods
Sastrawan, I. Kadek
Bayupati, I. P. A.
Arsa, Dewa Made Sri
ICT EXPRESS, 2022, 8 (03): 396 - 408
[6] Predicting Beijing Air Quality Using Bayesian Optimized CNN-RNN Hybrid Model
Tu, Zihan
Wu, Zhe
2022 ASIA CONFERENCE ON ALGORITHMS, COMPUTING AND MACHINE LEARNING (CACML 2022), 2022: 581 - 587
[7] Surgical Tool Segmentation Using A Hybrid Deep CNN-RNN Auto Encoder-Decoder
Attia, Mohamed
Hossny, Mohammed
Nahavandi, Saeid
Asadi, Hamed
2017 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2017: 3373 - 3378
[8] COVID-19 Detection using Hybrid CNN-RNN Architecture with Transfer Learning from X-Rays
Deshwal D.
Sangwan P.
Dahiya N.
Lilhore U.K.
Dalal S.
Simaiya S.
Current Medical Imaging, 2024, 20
[9] Voice Pathology Detection Using a Two-Level Classifier Based on Combined CNN-RNN Architecture
Ksibi, Amel
Hakami, Nada Ali
Alturki, Nazik
Asiri, Mashael M. M.
Zakariah, Mohammed
Ayadi, Manel
SUSTAINABILITY, 2023, 15 (04)
[10] Improving Hate Speech Detection: The Impact of Semantic Representations and Preprocessing Techniques
Bolucu, Necva
Ozerdem, Aysegul
2023 31ST SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE, SIU, 2023
