Are Large Language Models Really Bias-Free? Jailbreak Prompts for Assessing Adversarial Robustness to Bias Elicitation

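Cited by: 0
Authors:
Institution: University of Calabria, Italy [1]
Source: arXiv | 1600
Keywords:
DOI: Not available
Chinese Library Classification number:
Subject classification number:
Abstract:
Artificial intelligence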
Related papers
50 in total
  • [21] A Wolf in Sheep's Clothing: Generalized Nested Jailbreak Prompts can Fool Large Language Models Easily
    National Key Laboratory for Novel Software Technology, Nanjing University, China
    Unknown
    arXiv
  • [22] Pipelines for Social Bias Testing of Large Language Models
    Nozza, Debora
    Bianchi, Federico
    Hovy, Dirk
    PROCEEDINGS OF WORKSHOP ON CHALLENGES & PERSPECTIVES IN CREATING LARGE LANGUAGE MODELS (BIGSCIENCE EPISODE #5), 2022, : 68 - 74
  • [23] A Causal View of Entity Bias in (Large) Language Models
    Wang, Fei
    Mo, Wenjie
    Wang, Yiwei
    Zhou, Wenxuan
    Chen, Muhao
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023), 2023, : 15173 - 15184
  • [24] Cultural bias and cultural alignment of large language models
    Tao, Yan
    Viberg, Olga
    Baker, Ryan S.
    Kizilcec, Rene F.
    PNAS NEXUS, 2024, 3 (09)
  • [25] Locating and Mitigating Gender Bias in Large Language Models
    Cai, Yuchen
    Cao, Ding
    Guo, Rongxi
    Wen, Yaqin
    Liu, Guiquan
    Chen, Enhong
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT IV, ICIC 2024, 2024, 14878 : 471 - 482
  • [26] Do Large Language Models Bias Human Evaluations?
    O'Leary, Daniel E.
    IEEE INTELLIGENT SYSTEMS, 2024, 39 (04) : 83 - 87
  • [27] Terahertz Focusing and Polarization Control in Large-Area Bias-Free Semiconductor Emitters
    Carthy, Joanna L.
    Gow, Paul C.
    Berry, Sam A.
    Mills, Ben
    Apostolopoulos, Vasilis
    Journal of Infrared, Millimeter, and Terahertz Waves, 2018, 39 : 223 - 235
  • [28] Terahertz Focusing and Polarization Control in Large-Area Bias-Free Semiconductor Emitters
    Carthy, Joanna L.
    Gow, Paul C.
    Berry, Sam A.
    Mills, Ben
    Apostolopoulos, Vasilis
    JOURNAL OF INFRARED MILLIMETER AND TERAHERTZ WAVES, 2018, 39 (03) : 223 - 235
  • [29] Learning Bias-Free Representation for Large-Scale Person Re-Identification
    Xu, Jiaming
    Zhu, En
    IEEE ACCESS, 2019, 7 : 143331 - 143346
  • [30] Statistical approach for bias-free identification of a parallel manipulator affected by large measurement noise
    Abdellatif, Houssem
    Heimann, Bodo
    Grotjahn, Martin
    2005 44TH IEEE CONFERENCE ON DECISION AND CONTROL & EUROPEAN CONTROL CONFERENCE, VOLS 1-8, 2005, : 3357 - 3362