Obfuscated Activations Bypass LLM Latent-Space Defenses

Cited by: 0
Authors
Bailey, Luke [1]
Serrano, Alex [2]
Sheshadri, Abhay [3]
Seleznyov, Mikhail [4]
Taylor, Jordan [5]
Jenner, Erik [6]
Hilton, Jacob [7]
Casper, Stephen [8]
Guestrin, Carlos [1, 9]
Emmons, Scott [6]
Affiliations
[1] Stanford University, United States
[2] Polytechnic University of Catalonia, Spain
[3] Georgia Institute of Technology, United States
[4] Skoltech, Russia
[5] University of Queensland, Australia
[6] UC Berkeley, United States
[7] Alignment Research Center, United States
[8] MIT CSAIL, United States
[9] Chan Zuckerberg Biohub, United States
Source
Keywords
Compilation and indexing terms; Copyright 2025 Elsevier Inc;
DOI
Not available
Chinese Library Classification number
Subject classification code
Abstract
Data obfuscation