AutoDefense: Multi-Agent LLM Defense against Jailbreak Attacks

Cited by: 0
Authors
Zeng, Yifan [1]
Wu, Yiran [2]
Zhang, Xiao [3]
Wang, Huazheng [1]
Wu, Qingyun [2]
Affiliations
[1] Oregon State University, United States
[2] Pennsylvania State University, United States
[3] CISPA Helmholtz Center for Information Security, Germany
Source
arXiv
Keywords
Agent systems - Filtering mechanism - Language model - Large models - Model agents - Multi agent - Open-source - Performance - Pre-training
DOI
Not available
Related papers
50 in total
  • [21] Adaptive defense coordination for multi-agent systems
    Wells, D
    Pazandak, P
    Nodine, M
    Cassandra, A
    2004 IEEE 1ST SYMPOSIUM ON MULTI-AGENT SECURITY & SURVIVABILITY, 2004: 118 - 127
  • [22] A Multi-agent Genetic Algorithm for Improving the Robustness of Communities in Complex Networks against Attacks
    Wang, Shuai
    Liu, Jing
    2017 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2017: 17 - 22
  • [23] Multi-Agent Cooperative Pursuit-Defense Strategy Against One Single Attacker
    Deng, Ziquan
    Kong, Zhaodan
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2020, 5 (04) : 5772 - 5778
  • [24] An overview on multi-agent consensus under adversarial attacks
    Ishii, Hideaki
    Wang, Yuan
    Feng, Shuai
    ANNUAL REVIEWS IN CONTROL, 2022, 53 : 252 - 272
  • [25] Multi-agent systems with memories under DoS attacks
    Almeida, Ricardo
    Girejko, Ewa
    Machado, Luis
    Malinowska, Agnieszka B.
    Martins, Natalia
    2022 17TH INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION, ROBOTICS AND VISION (ICARCV), 2022: 771 - 777
  • [26] Distribute Consensus for Multi-agent Systems with Attacks and Delays
    Wu, Yiming
    He, Xiongxiong
    Ou, Xianhua
    2015 34TH CHINESE CONTROL CONFERENCE (CCC), 2015: 7429 - 7433
  • [27] Possible attacks on and countermeasures for secure multi-agent computation
    Endsuleit, R
    Wagner, A
    SAM '04: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON SECURITY AND MANAGEMENT, 2004: 221 - 227
  • [28] Advanced Attacks to Trusted Communities in Multi-Agent Systems
    Edenhofer, Sarah
    Kantert, Jan
    Klejnowski, Lukas
    Tomforde, Sven
    Haehner, Joerg
    Mueller-Schloer, Christian
    2014 IEEE EIGHTH INTERNATIONAL CONFERENCE ON SELF-ADAPTIVE AND SELF-ORGANIZING SYSTEMS WORKSHOPS (SASOW), 2014: 186 - 191
  • [29] Secure output synchronization of heterogeneous multi-agent systems against false data injection attacks
    Huo, Shicheng
    Huang, Dalin
    Zhang, Ya
    Science China Information Sciences, 2022, 65
  • [30] Time-varying formation control for nonlinear multi-agent systems against actuator attacks
    Chang, Zhenyu
    Xue, Hong
    Liang, Hongjing
    Zhang, Pengchao
    JOURNAL OF THE FRANKLIN INSTITUTE-ENGINEERING AND APPLIED MATHEMATICS, 2022, 359 (18): 11068 - 11088