9 items in total
- [1] Defending Large Language Models Against Jailbreak Attacks via Layer-specific Editing. 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP 2024), Findings of EMNLP 2024, 2024, pp. 5094-5109.
- [3] Tender: Accelerating Large Language Models via Tensor Decomposition and Runtime Requantization. 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA 2024), 2024, pp. 1048-1062.
- [5] CodeAttack: Revealing Safety Generalization Challenges of Large Language Models via Code Completion. Findings of the Association for Computational Linguistics: ACL 2024, 2024, pp. 11437-11452.
- [6] Accelerating Sparse Autoencoder Training via Layer-Wise Transfer Learning in Large Language Models. BlackboxNLP 2024: 7th BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP, Proceedings of the Workshop, 2024, pp. 530-550.
- [9] Improving Diversity of Demographic Representation in Large Language Models via Collective-Critiques and Self-Voting. 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023), 2023, pp. 10383-10405.