Mitigating Exaggerated Safety in Large Language Models

Cited by: 0
Authors
Ray, Ruchira [1]
Bhalani, Ruchi [1]
Affiliation
[1] University of Texas at Austin, Department of Computer Science, United States
Source
arXiv
DOI
Not available
Related papers
50 in total
  • [41] Identifying Exaggerated Language
    Kong, Li
    Li, Chuanyi
    Ge, Jidong
    Luo, Bin
    Ng, Vincent
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020 : 7024 - 7034
  • [42] Large Language Models in der Wissenschaft [Large language models in science]
    Karl-Friedrich Kowalewski
    Severin Rodler
    Die Urologie, 2024, 63 (9) : 860 - 866
  • [43] Mitigating Hallucinations in Large Language Models via Semantic Enrichment of Prompts: Insights from BioBERT and Ontological Integration
    Penkov, Stanislav
    PROCEEDINGS OF THE SIXTH INTERNATIONAL CONFERENCE COMPUTATIONAL LINGUISTICS IN BULGARIA, CLIB 2024, 2024 : 272 - 276
  • [44] The Importance of Understanding Language in Large Language Models
    Youssef, Alaa
    Stein, Samantha
    Clapp, Justin
    Magnus, David
    AMERICAN JOURNAL OF BIOETHICS, 2023, 23 (10) : 6 - 7
  • [45] Dissociating language and thought in large language models
    Mahowald, Kyle
    Ivanova, Anna A.
    Blank, Idan A.
    Kanwisher, Nancy
    Tenenbaum, Joshua B.
    Fedorenko, Evelina
    TRENDS IN COGNITIVE SCIENCES, 2024, 28 (06) : 517 - 540
  • [46] On the creativity of large language models
    Franceschelli, Giorgio
    Musolesi, Mirco
    AI & SOCIETY, 2024
  • [47] Large Language Models in Cyberattacks
    S. V. Lebed
    D. E. Namiot
    E. V. Zubareva
    P. V. Khenkin
    A. A. Vorobeva
    D. A. Svichkar
    Doklady Mathematics, 2024, 110 (Suppl 2) : S510 - S520
  • [48] Large language models and psychiatry
    Orru, Graziella
    Melis, Giulia
    Sartori, Giuseppe
    INTERNATIONAL JOURNAL OF LAW AND PSYCHIATRY, 2025, 101
  • [49] Autoformalization with Large Language Models
    Wu, Yuhuai
    Jiang, Albert Q.
    Li, Wenda
    Rabe, Markus N.
    Staats, Charles
    Jamnik, Mateja
    Szegedy, Christian
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022
  • [50] Imitation and Large Language Models
    Boisseau, Eloise
    MINDS AND MACHINES, 2024, 34 (04)