Improving Hate Speech Detection Accuracy using Hybrid CNN-RNN and Random Oversampling Techniques

Authors
Riyadi, Slamet [1]
Andriyani, Annisa Divayu [1]
Masyhur, Ahmad Musthafa [1]
Affiliations
[1] Univ Muhammadiyah Yogyakarta, Dept Informat Technol, Yogyakarta, Indonesia
Keywords
hate speech; Twitter; hybrid CNN-RNN; balancing dataset; oversampling
DOI
10.1109/ISIEA61920.2024.10607232
Chinese Library Classification (CLC) number
T [Industrial Technology]
Discipline classification code
08
Abstract
Detecting hate speech is crucial for addressing online toxicity and fostering a secure digital environment. This study aims to improve the accuracy of hybrid CNN-RNN models, which are commonly used for this task. It integrates oversampling techniques with the model so that instances of hate speech are categorized more reliably, particularly in imbalanced datasets. The dataset used is the Indonesian Tweet Hate Speech dataset. Following established protocols, including data pre-processing, training, and testing, the study observes significant improvements in accuracy. With imbalanced data, the hybrid CNN-RNN achieves 0.827 accuracy, 0.797 precision, 0.759 recall, and 0.883 F1 score. With balanced data, the model performs better still, reaching 0.908 accuracy, 0.943 precision, 0.894 recall, and 0.914 F1 score. Notably, the proposed model also outperforms the standard hybrid CNN-RNN, which on imbalanced data attains an accuracy of 0.752, precision of 0.797, recall of 0.559, and F1 score of 0.657. Techniques such as dropout and early termination mitigate overfitting in complex models and large datasets. This research contributes to hate speech detection methods and underscores the hybrid CNN-RNN's efficacy in handling imbalanced data. Future studies could explore additional methodologies for further enhancements.
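The abstract does not describe how the oversampling was implemented. As a minimal sketch of what random oversampling generally does, the Python snippet below duplicates randomly chosen minority-class samples until every class is as large as the largest one. The function name `random_oversample`, the toy tweets, and the label encoding (0 = neutral, 1 = hate speech) are assumptions made for illustration and are not taken from the paper.

```python
import random
from collections import Counter


def random_oversample(texts, labels, seed=0):
    """Hypothetical helper: balance classes by duplicating random
    minority-class samples until each class matches the largest class."""
    if not labels:
        return [], []
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # size of the majority class

    out_texts, out_labels = list(texts), list(labels)
    for cls, n in counts.items():
        # All original samples belonging to this class.
        pool = [t for t, y in zip(texts, labels) if y == cls]
        # Draw with replacement until the class reaches the target size.
        for _ in range(target - n):
            out_texts.append(rng.choice(pool))
            out_labels.append(cls)

    # Shuffle so the duplicated samples are not clustered at the end.
    order = list(range(len(out_labels)))
    rng.shuffle(order)
    return [out_texts[i] for i in order], [out_labels[i] for i in order]


# Toy, made-up data: four neutral tweets (0) and one hateful tweet (1).
tweets = ["t1", "t2", "t3", "t4", "t5"]
labels = [0, 0, 0, 0, 1]

balanced_tweets, balanced_labels = random_oversample(tweets, labels, seed=42)
print(Counter(balanced_labels))
```

In a typical pipeline, oversampling of this kind would be applied to the training split only, so that duplicated samples do not leak into the test set; whether the study did so is not stated in the abstract.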
Pages: 5