Unveiling Toxic Tendencies of Small Language Models in Unconstrained Generation Tasks

Cited by: 0
Authors
Chandra, Lakshay [1 ]
Susan, Seba [1 ]
Kumar, Dhruv [1 ]
Kant, Krishan [1 ]
Affiliations
[1] Delhi Technol Univ, Dept Informat Technol, Delhi 110042, India
Keywords
toxicity analysis; small language models; language generation; deep learning;
DOI
10.1109/CONECCT62155.2024.10677188
Chinese Library Classification
TP39 [Computer applications]
Discipline Classification Code
081203; 0835
Abstract
The prevalence of toxicity online presents a significant challenge for platforms and publishers alike. Recent studies conducted on Small Language Models (SLMs) have identified the inherent toxicity that dwells in these models. In this work, we study and benchmark the extent to which SLMs can be prompted to generate toxic language. The following SLMs are evaluated for their toxicity levels: GPT-2 Large, Gemma-2B, Mistral-7B, Falcon-7B, and Llama 2-13B. We take a step further toward understanding the correlation between toxicity and the intrinsic parameters of state-of-the-art SLMs. Next, we study the efficacy of a basic word-filtering approach to controlled text generation. Following this, we establish a mathematical basis for computing the weighted toxicity of continuations with respect to the toxicity of prompts by treating toxicity as a fuzzy metric. Finally, we extend our analysis to examine the unexpected toxicity levels of generated continuations when prompted with non-toxic inputs.
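The abstract does not give the paper's exact formulation of prompt-weighted continuation toxicity. A minimal sketch of the general idea, assuming toxicity scores in [0, 1] (e.g., as returned by a toxicity classifier) treated as fuzzy membership degrees, with the hypothetical weighting that toxic continuations from benign prompts count more heavily, is:

```python
def weighted_toxicity(prompt_tox: float, cont_tox: float) -> float:
    """Illustrative (not the paper's) weighted toxicity of a continuation.

    Both scores are fuzzy membership degrees in [0, 1]. The continuation's
    toxicity is weighted by the prompt's non-toxicity (its fuzzy complement),
    so a toxic continuation to a non-toxic prompt scores highest.
    """
    if not (0.0 <= prompt_tox <= 1.0 and 0.0 <= cont_tox <= 1.0):
        raise ValueError("toxicity scores must lie in [0, 1]")
    return cont_tox * (1.0 - prompt_tox)


# A non-toxic prompt (0.1) yielding a toxic continuation (0.8) is
# penalized almost fully; a toxic prompt (0.9) discounts the same output.
print(weighted_toxicity(0.1, 0.8))  # high weighted toxicity
print(weighted_toxicity(0.9, 0.8))  # low weighted toxicity
```

The product with the fuzzy complement is only one plausible instantiation; the paper's actual metric may differ.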
Pages: 6
Related Papers
50 records in total
  • [21] Images in Language Space: Exploring the Suitability of Large Language Models for Vision & Language Tasks
    Hakimov, Sherzod
    Schlangen, David
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023), 2023, : 14196 - 14210
  • [22] The Rise of Small Language Models
    Zhang, Qin
    Liu, Ziqi
    Pan, Shirui
    IEEE INTELLIGENT SYSTEMS, 2025, 40 (01) : 30 - 37
  • [23] Vision-Language Models for Vision Tasks: A Survey
    Zhang, Jingyi
    Huang, Jiaxing
    Jin, Sheng
    Lu, Shijian
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, 46 (08) : 5625 - 5644
  • [24] Sources of Hallucination by Large Language Models on Inference Tasks
    McKenna, Nick
    Li, Tianyi
    Cheng, Liang
    Hosseini, Mohammad Javad
    Johnson, Mark
    Steedman, Mark
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023, 2023, : 2758 - 2774
  • [25] Evaluating large language models in theory of mind tasks
    Kosinski, Michal
    PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2024, 121 (45)
  • [26] ReplanVLM: Replanning Robotic Tasks With Visual Language Models
    Mei, Aoran
    Zhu, Guo-Niu
    Zhang, Huaxiang
    Gan, Zhongxue
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2024, 9 (11): : 10201 - 10208
  • [27] Facilitating Autonomous Driving Tasks With Large Language Models
    Wu, Mengyao
    Yu, F. Richard
    Liu, Peter Xiaoping
    He, Ying
    IEEE INTELLIGENT SYSTEMS, 2025, 40 (01) : 45 - 52
  • [28] Beyond the hype: Unveiling the challenges of large language models in urology
    Kwong, Jethro C. C.
    Nguyen, David-Dan
    Khondker, Adree
    Li, Tiange
    CUAJ-CANADIAN UROLOGICAL ASSOCIATION JOURNAL, 2024, 18 (10): : 333 - 334
  • [29] Unveiling the power of language models in chemical research question answering
    Chen, Xiuying
    Wang, Tairan
    Guo, Taicheng
    Guo, Kehan
    Zhou, Juexiao
    Li, Haoyang
    Song, Zirui
    Gao, Xin
    Zhang, Xiangliang
    COMMUNICATIONS CHEMISTRY, 2025, 8 (01):
  • [30] The Earth is Flat? Unveiling Factual Errors in Large Language Models
    The Chinese University of Hong Kong, Hong Kong
    arXiv