A deep dive into automated sexism detection using fine-tuned deep learning and large language models

Cited by: 0
Authors
Vetagiri, Advaitha [1]
Pakray, Partha [1]
Das, Amitava [2, 3]
Affiliations
[1] Natl Inst Technol Silchar, Comp Sci & Engn, Silchar 788010, Assam, India
[2] UofSC, Artificial Intelligence Inst, Columbia, SC USA
[3] Wipro AI Lab, Bangalore, Karnataka, India
Keywords
Online sexism; Sexism classification; MultiHate dataset; Machine learning; Deep learning; Convolutional Neural Networks-Bidirectional Long Short-Term Memory; Generative Pre-trained Transformer 2; HATE SPEECH DETECTION; ONLINE
DOI
10.1016/j.engappai.2025.110167
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Sexism in online content has become a significant concern. With the growing number of online interactions and the rise of social media platforms, automated techniques to identify and classify sexism are more critical than ever. This paper addresses the problem by fine-tuning deep learning models for sexism classification on "MultiHate", a comprehensive dataset created by curating ten different datasets on sexism. The dataset consists of 1.76 M English texts labelled as sexist or not sexist. Two deep learning models, Convolutional Neural Networks-Bidirectional Long Short-Term Memory and Generative Pre-trained Transformer 2, were fine-tuned on it to detect and classify sexism. A comparative analysis of several machine learning and deep learning models was conducted on the MultiHate dataset, with precision, recall, and F1 score as additional performance metrics. The analysis shows that the Generative Pre-trained Transformer 2 model outperforms the other models with an accuracy of 92%, while the Convolutional Neural Networks-Bidirectional Long Short-Term Memory model achieves an accuracy of 90%. These results are promising and indicate that automated techniques can classify sexist content effectively. A comprehensive error analysis of the models is also presented, highlighting their limitations and challenges. The computational time required for training and testing the models remains a significant challenge, especially for larger datasets.
Pages: 17