Foot In The Door: Understanding Large Language Model Jailbreaking via Cognitive Psychology

Cited by: 0
Authors
National University of Defense Technology, China [1]
Unknown [2]
Institutions
Source
Keywords
Compilation and indexing terms; Copyright 2024 Elsevier Inc.
DOI
N/A
CLC number
Subject classification code
Abstract
'current - Black boxes - Cognitive psychology - Consistency theory - Decision-making mechanisms - Language model - Model security - Multisteps - Psychological explanation - Security protection
Related papers
50 in total
  • [41] Chinese Text Open Domain Tag Generation Method via Large Language Model
    He, Chunhui
    Ge, Bin
    Zhang, Chong
    2024 10TH INTERNATIONAL CONFERENCE ON BIG DATA AND INFORMATION ANALYTICS, BIGDIA 2024, 2024, : 183 - 188
  • [42] Explainable automated debugging via large language model-driven scientific debugging
    Kang, Sungmin
    Chen, Bei
    Yoo, Shin
    Lou, Jian-Guang
    EMPIRICAL SOFTWARE ENGINEERING, 2025, 30 (02)
  • [43] VulLibGen: Generating Names of Vulnerability-Affected Packages via a Large Language Model
    Chen, Tianyu
    Li, Lin
    Zhu, Liuchuan
    Li, Zongyang
    Liu, Xueqing
    Liang, Guangtai
    Wang, Qianxiang
    Xie, Tao
    arXiv, 2023,
  • [44] TableGPT: a novel table understanding method based on table recognition and large language model collaborative enhancement
    Ren, Yi
    Yu, Chenglong
    Li, Weibin
    Li, Wei
    Zhu, Zixuan
    Zhang, Tianyi
    Qin, Chenhao
    Ji, Wenbo
    Zhang, Jianjun
    APPLIED INTELLIGENCE, 2025, 55 (04)
  • [45] A Hierarchical Deep Video Understanding Method with Shot-Based Instance Search and Large Language Model
    Li, Ruizhe
    Guo, Jiahao
    Li, Mingxi
    Wu, Zhengqian
    Liang, Chao
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 9425 - 9429
  • [46] Shadows of wisdom: Classifying meta-cognitive and morally grounded narrative content via large language models
    Stavropoulos, Alexander
    Crone, Damien L.
    Grossmann, Igor
    BEHAVIOR RESEARCH METHODS, 2024, 56 (07) : 7632 - 7646
  • [47] Assessing breast cancer chemotherapy response in radiology and pathology reports via a Large Language Model
    Dodhia, Parth
    Meepagala, Shawn
    Moallem, Golanz
    Rubin, Daniel
    Bean, Gregory
    Rusu, Mirabela
    IMAGING INFORMATICS FOR HEALTHCARE, RESEARCH, AND APPLICATIONS, MEDICAL IMAGING 2024, 2024, 12931
  • [48] SelfCP: Compressing over-limit prompt via the frozen large language model itself
    Gao, Jun
    Cao, Ziqiang
    Li, Wenjie
    INFORMATION PROCESSING & MANAGEMENT, 2024, 61 (06)
  • [49] Generative AI Agents With Large Language Model for Satellite Networks via a Mixture of Experts Transmission
    Zhang, Ruichen
    Du, Hongyang
    Liu, Yinqiu
    Niyato, Dusit
    Kang, Jiawen
    Xiong, Zehui
    Jamalipour, Abbas
    Kim, Dong In
    IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, 2024, 42 (12) : 3581 - 3596
  • [50] A safety realignment framework via subspace-oriented model fusion for large language models
    Yi, Xin
    Zheng, Shunfan
    Wang, Linlin
    Wang, Xiaoling
    He, Liang
    KNOWLEDGE-BASED SYSTEMS, 2024, 306