50 entries in total
- [2] Defending Large Language Models Against Jailbreak Attacks via Layer-specific Editing. Findings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP 2024), 2024, pp. 5094–5109.
- [4] Adversarial Attacks on Large Language Models. Knowledge Science, Engineering and Management, Part IV (KSEM 2024), 2024, vol. 14887, pp. 85–96.
- [7] Fine-Pruning: Defending Against Backdooring Attacks on Deep Neural Networks. Research in Attacks, Intrusions, and Defenses (RAID 2018), 2018, vol. 11050, pp. 273–294.
- [8] Prompt Engineering: Unleashing the Power of Large Language Models to Defend Against Social Engineering Attacks. Iraqi Journal for Computer Science and Mathematics, 2024, 5(3): 404–416.
- [9] Stealing the Decoding Algorithms of Language Models. Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security (CCS 2023), 2023, pp. 1835–1849.
- [10] Data Poisoning Attacks against Autoregressive Models. Thirtieth AAAI Conference on Artificial Intelligence (AAAI 2016), 2016, pp. 1452–1458.