AutoDefense: Multi-Agent LLM Defense against Jailbreak Attacks

Cited by: 0
Authors
Zeng, Yifan [1]
Wu, Yiran [2]
Zhang, Xiao [3]
Wang, Huazheng [1]
Wu, Qingyun [2]
Affiliations
[1] Oregon State University, United States
[2] Pennsylvania State University, United States
[3] CISPA Helmholtz Center for Information Security, Germany
Source
arXiv
Keywords
Agent systems - Filtering mechanism - Language model - Large models - Model agents - Multi agent - Open-source - Performance - Pre-training
DOI
Not available
Related papers
50 in total
  • [31] Resilient Consensus Control for Linear Multi-agent System Against the False Data Injection Attacks
    Wang, Meirong
    Hu, Jianqiang
    Cao, Jinde
    INTERNATIONAL JOURNAL OF CONTROL AUTOMATION AND SYSTEMS, 2023, 21 (07) : 2112 - 2123
  • [32] Distributed Tracking Control of Nonlinear Multi-agent Systems Against False Data Injection Attacks
    Zhang, Yanhui
    Sun, Jian
    Wang, Gang
    Xu, Yong
    2022 41ST CHINESE CONTROL CONFERENCE (CCC), 2022 : 4837 - 4842
  • [33] LLM-Driven Social Influence for Cooperative Behavior in Multi-Agent Systems
    de Curto, J.
    de Zarza, I.
    IEEE ACCESS, 2025, 13 : 44330 - 44342
  • [34] Adaptive bipartite output containment control of heterogeneous multi-agent systems against FDI attacks
    Cheng, Jie
    Wu, Jie
    Zhan, Xisheng
    Han, Tao
    Wu, Bo
    ASIAN JOURNAL OF CONTROL, 2024, 26 (06) : 2991 - 3001
  • [35] Multi-Agent Guided Deep Reinforcement Learning Approach Against State Perturbed Adversarial Attacks
    Cerci, Cagri
    Temeltas, Hakan
    IEEE ACCESS, 2024, 12 : 156146 - 156159
  • [36] Model-based resilient control for a multi-agent system against Denial of Service attacks
    Amullen, Esther M.
    Shetty, Sachin
    Keel, Lee H.
    2016 WORLD AUTOMATION CONGRESS (WAC), 2016
  • [37] Fuzzy observer based adjustable containment control for multi-agent systems against DoS attacks
    Jiang, Mengyi
    Yang, Yonghui
    Liu, Xiaoping
    Wu, Libing
    Gao, Chuang
    NONLINEAR DYNAMICS, 2024, 112 (22) : 20063 - 20080
  • [38] Secure output synchronization of heterogeneous multi-agent systems against false data injection attacks
    Huo, Shicheng
    Huang, Dalin
    Zhang, Ya
    SCIENCE CHINA-INFORMATION SCIENCES, 2022, 65 (06)
  • [39] Resilient Consensus Control for Linear Multi-agent System Against the False Data Injection Attacks
    Meirong Wang
    Jianqiang Hu
    Jinde Cao
    International Journal of Control, Automation and Systems, 2023, 21 : 2112 - 2123
  • [40] Secure output synchronization of heterogeneous multi-agent systems against false data injection attacks
    Shicheng HUO
    Dalin HUANG
    Ya ZHANG
    Science China(Information Sciences), 2022, 65 (06) : 142 - 154