Visual Adversarial Examples Jailbreak Aligned Large Language Models

Cited by: 0
Authors / Institution: Princeton University, United States [1]
Source: Proc. AAAI Conf. Artif. Intell., Vol. 38, No. 19, pp. 21527-21536
Keywords: Computational linguistics
DOI: not available
Related Papers (50 total)
  • [1] Visual Adversarial Examples Jailbreak Aligned Large Language Models
    Qi, Xiangyu
    Huang, Kaixuan
    Panda, Ashwinee
    Henderson, Peter
    Wang, Mengdi
    Mittal, Prateek
    Thirty-Eighth AAAI Conference on Artificial Intelligence, Vol. 38, No. 19, 2024: 21527-21536
  • [2] JailbreakLens: Visual Analysis of Jailbreak Attacks Against Large Language Models
    Feng, Yingchaojie
    Chen, Zhizhang
    Kang, Zhining
    Wang, Sijia
    Zhu, Minfeng
    Zhang, Wei
    Chen, Wei
    arXiv.
  • [3] Multilingual Jailbreak Challenges in Large Language Models
    Deng, Yue
    Zhang, Wenxuan
    Pan, Sinno Jialin
    Bing, Lidong
    arXiv, 2023.
  • [4] Jailbreak Attack for Large Language Models: A Survey
    Li N.
    Ding Y.
    Jiang H.
    Niu J.
    Yi P.
    Jisuanji Yanjiu yu Fazhan / Computer Research and Development, 2024, 61 (05): 1156-1181
  • [5] Generating Valid and Natural Adversarial Examples with Large Language Models
    Wang, Zimu
    Wang, Wei
    Chen, Qi
    Wang, Qiufeng
    Nguyen, Anh
    Proceedings of the 2024 27th International Conference on Computer Supported Cooperative Work in Design, CSCWD 2024, 2024: 1716-1721
  • [6] Faster-GCG: Efficient Discrete Optimization Jailbreak Attacks against Aligned Large Language Models
    Li, Xiao
    Li, Zhuhong
    Li, Qiongxiu
    Lee, Bingze
    Cui, Jinghao
    Hu, Xiaolin
    arXiv.
  • [7] Generating Natural Language Adversarial Examples on a Large Scale with Generative Models
    Ren, Yankun
    Lin, Jianbin
    Tang, Siliang
    Zhou, Jun
    Yang, Shuang
    Qi, Yuan
    Ren, Xiang
    ECAI 2020: 24th European Conference on Artificial Intelligence, 2020, 325: 2156-2163
  • [8] Are Large Language Models Really Bias-Free? Jailbreak Prompts for Assessing Adversarial Robustness to Bias Elicitation
    University of Calabria, Italy
    arXiv.
  • [9] Are Large Language Models Really Bias-Free? Jailbreak Prompts for Assessing Adversarial Robustness to Bias Elicitation
    Cantini, Riccardo
    Cosenza, Giada
    Orsino, Alessio
    Talia, Domenico
    Discovery Science, DS 2024, Part I, 2025, 15243: 52-68
  • [10] Jailbreak and Guard Aligned Language Models with Only Few In-Context Demonstrations
    Wei, Zeming
    Wang, Yifei
    Li, Ang
    Mo, Yichuan
    Wang, Yisen
    arXiv, 2023.