A deep dive into automated sexism detection using fine-tuned deep learning and large language models

Cited by: 0
Authors
Vetagiri, Advaitha [1]
Pakray, Partha [1]
Das, Amitava [2, 3]
Affiliations
[1] Natl Inst Technol Silchar, Comp Sci & Engn, Silchar 788010, Assam, India
[2] UofSC, Artificial Intelligence Inst, Columbia, SC USA
[3] Wipro AI Lab, Bangalore, Karnataka, India
Keywords
Online sexism; Sexism classification; MultiHate dataset; Machine learning; Deep learning; Convolutional Neural Networks-Bidirectional Long Short-Term Memory; Generative Pre-trained Transformer 2; HATE SPEECH DETECTION; ONLINE
DOI
10.1016/j.engappai.2025.110167
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Sexism in online content has recently become a significant concern. With the growing number of online interactions and the rise of social media platforms, the need for automated techniques to identify and classify sexism is more critical than ever. This paper addresses the problem by fine-tuning deep learning models for sexism classification on "MultiHate", a comprehensive dataset created by curating ten different datasets on sexism. The dataset consists of 1.76 M English texts labelled as sexist or not sexist. Two deep learning models, Convolutional Neural Networks-Bidirectional Long Short-Term Memory and Generative Pre-trained Transformer 2, were fine-tuned to detect and classify sexism, and a comparative analysis of several machine learning and deep learning models was conducted on the MultiHate dataset. Evaluated with precision, recall, and F1 scores as performance metrics, the Generative Pre-trained Transformer 2 model outperforms the other models with an accuracy of 92%, while the Convolutional Neural Networks-Bidirectional Long Short-Term Memory model achieved an accuracy of 90%. These results are promising and indicate that automated techniques can classify sexist content effectively. A comprehensive error analysis of the models is presented, highlighting their limitations and challenges. The computational time required for training and testing the models remains a significant challenge, especially for larger datasets.
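The abstract reports accuracy alongside precision, recall, and F1 for a two-class task (sexist vs. not sexist). As a minimal sketch of how those four figures relate, the snippet below computes them from predicted and true labels. The function name `binary_metrics`, the encoding 1 = sexist and 0 = not sexist, and the toy labels are all assumptions made for illustration; none of them comes from the paper, and the numbers are unrelated to its reported results.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for the positive class (label 1).

    Assumption: 1 = sexist, 0 = not sexist (hypothetical encoding).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # sexist, flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # not sexist, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # sexist, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # not sexist, passed

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels, invented for this example only.
toy_true = [1, 1, 1, 0, 0, 0, 0, 1]
toy_pred = [1, 1, 0, 0, 0, 1, 0, 1]
toy_scores = binary_metrics(toy_true, toy_pred)
print(toy_scores)
```

On an imbalanced dataset, accuracy alone can look high even when many sexist texts are missed, which is one reason to report precision, recall, and F1 next to it.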
Pages: 17
Related papers
50 in total
  • [1] Automated Smart Contract Vulnerability Detection using Fine-tuned Large Language Models
    Yang, Zhiju
    Man, Gaoyuan
    Yue, Songqing
    6TH INTERNATIONAL CONFERENCE ON BLOCKCHAIN TECHNOLOGY AND APPLICATIONS, ICBTA 2023, 2023: 19 - 23
  • [2] Disaster Tweet Classification Using Fine-Tuned Deep Learning Models Versus Zero and Few-Shot Large Language Models
    Dinani, Soudabeh Taghian
    Caragea, Doina
    Gyawali, Nikesh
    DATA MANAGEMENT TECHNOLOGIES AND APPLICATIONS, DATA 2023, 2024, 2105 : 73 - 94
  • [3] Automated classification of brain MRI reports using fine-tuned large language models
    Kanzawa, Jun
    Yasaka, Koichiro
    Fujita, Nana
    Fujiwara, Shin
    Abe, Osamu
    NEURORADIOLOGY, 2024, 66 (12) : 2177 - 2183
  • [4] An exploratory and automated study of sarcasm detection and classification in app stores using fine-tuned deep learning classifiers
    Fatima, Eman
    Kanwal, Hira
    Khan, Javed Ali
    Khan, Nek Dil
    AUTOMATED SOFTWARE ENGINEERING, 2024, 31 (02)
  • [5] Multimodality Imaging of COVID-19 Using Fine-Tuned Deep Learning Models
    Almuayqil, Saleh
    Abd El-Ghany, Sameh
    Shehab, Abdulaziz
    DIAGNOSTICS, 2023, 13 (07)
  • [6] Multiclass Skin Cancer Classification Using Ensemble of Fine-Tuned Deep Learning Models
    Kausar, Nabeela
    Hameed, Abdul
    Sattar, Mohsin
    Ashraf, Ramiza
    Imran, Ali Shariq
    ul Abidin, Muhammad Zain
    Ali, Ammara
    APPLIED SCIENCES-BASEL, 2021, 11 (22)
  • [7] Fine-tuned deep learning models for early detection and classification of kidney conditions in CT imaging
    Pimpalkar, Amit
    Saini, Dilip Kumar Jang Bahadur
    Shelke, Nilesh
    Balodi, Arun
    Rapate, Gauri
    Tolani, Manoj
    SCIENTIFIC REPORTS, 2025, 15 (01)
  • [8] Enhancing Freezing of Gait Detection in Parkinson's Through Fine-Tuned Deep Learning Models
    Tebaldi, Michele
    Pravadelli, Graziano
    Demrozi, Florenc
    Giugno, Rosalba
    Turetta, Cristian
    2024 IEEE INTERNATIONAL CONFERENCE ON DIGITAL HEALTH, ICDH 2024, 2024: 87 - 94
  • [9] LogFiT: Log Anomaly Detection Using Fine-Tuned Language Models
    Almodovar, Crispin
    Sabrina, Fariza
    Karimi, Sarvnaz
    Azad, Salahuddin
    IEEE TRANSACTIONS ON NETWORK AND SERVICE MANAGEMENT, 2024, 21 (02) : 1715 - 1723
  • [10] Fine-Tuned Deep Transfer Learning Models for Large Screenings of Safer Drugs Targeting Class A GPCRs
    Provasi, Davide
    Filizola, Marta
    BIOCHEMISTRY, 2025, 64 (06) : 1328 - 1337