Towards Safer Large Language Models (LLMs)

Cited by: 0
Authors
Lawrence, Carolin [1]
Bifulco, Roberto [1]
Gashteovski, Kiril [1]
Hung, Chia-Chien [1]
Ben Rim, Wiem [1]
Shaker, Ammar [1]
Oyamada, Masafumi [2]
Sadamasa, Kunihiko [2]
Enomoto, Masafumi [2]
Takeoka, Kunihiro [2]
Affiliations
[1] NEC Laboratories Europe, Germany
[2] Data Science Laboratories
Source
NEC Technical Journal | 2024 / Vol. 17 / No. 02
Keywords
Computational linguistics - Risk assessment;
DOI
Not available
Abstract
Large Language Models (LLMs) are revolutionizing our world. They have impressive textual capabilities that will fundamentally change how human users can interact with intelligent systems. Nonetheless, they still have a series of limitations that are important to keep in mind when working with LLMs. We explore how these limitations can be addressed from two different angles. First, we look at options that are already available, which include (1) assessing the risk of a use case, (2) prompting an LLM to deliver explanations and (3) encasing LLMs in a human-centred system design. Second, we look at technologies that we are currently developing, which will be able to (1) more accurately assess the quality of an LLM for a high-risk domain, (2) explain the generated LLM output by linking to the input and (3) fact check the generated LLM output against external trustworthy sources. © 2024 NEC Mediaproducts. All rights reserved.
Pages: 64 / 74
Related papers
50 in total
  • [31] Towards the regulation of Large Language Models (LLMs) and Generative AI use in the Brazilian Government: the case of a State Court of Accounts
    Alves, Karine
    Santos, Edney
    Silva, Matheus Fidelis
    Chaves, Ana Carolina
    Fernandes, Jose Andre
    Valenca, George
    Brito, Kellyton
    PROCEEDINGS OF THE 17TH INTERNATIONAL CONFERENCE ON THEORY AND PRACTICE OF ELECTRONIC GOVERNANCE, ICEGOV 2024, 2024, : 28 - 35
  • [32] Towards Trustworthy Large Language Models
    Koyejo, Sanmi
    Li, Bo
    PROCEEDINGS OF THE 17TH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING, WSDM 2024, 2024, : 1126 - 1127
  • [33] Large Language Models (LLMs) as Graphing Tools for Advanced Chemistry Education and Research
    Subasinghe, S. M. Supundrika
    Gersib, Simon G.
    Mankad, Neal P.
    JOURNAL OF CHEMICAL EDUCATION, 2025,
  • [34] Capabilities and limitations of AI Large Language Models (LLMs) for materials criticality research
    Ku, Anthony Y.
    Hool, Alessandra
    MINERAL ECONOMICS, 2024,
  • [35] Content Knowledge Identification with Multi-agent Large Language Models (LLMs)
    Yang, Kaiqi
    Chu, Yucheng
    Darwin, Taylor
    Han, Ahreum
    Li, Hang
    Wen, Hongzhi
    Copur-Gencturk, Yasemin
    Tang, Jiliang
    Liu, Hui
    ARTIFICIAL INTELLIGENCE IN EDUCATION, PT II, AIED 2024, 2024, 14830 : 284 - 292
  • [36] Large language models (LLMs) in radiology exams for medical students: Performance and consequences
    Gotta, Jennifer
    Hong, Quang Anh Le
    Koch, Vitali
    Gruenewald, Leon D.
    Geyer, Tobias
    Martin, Simon S.
    Scholtz, Jan-Erik
    Booz, Christian
    Dos Santos, Daniel Pinto
    Mahmoudi, Scherwin
    Eichler, Katrin
    Gruber-Rouh, Tatjana
    Hammerstingl, Renate
    Biciusca, Teodora
    Juergens, Lisa Joy
    Hoehne, Elena
    Mader, Christoph
    Vogl, Thomas J.
    Reschke, Philipp
    ROFO-FORTSCHRITTE AUF DEM GEBIET DER RONTGENSTRAHLEN UND DER BILDGEBENDEN VERFAHREN, 2024,
  • [37] The ethics of ChatGPT in medicine and healthcare: a systematic review on Large Language Models (LLMs)
    Haltaufderheide, Joschka
    Ranisch, Robert
    NPJ DIGITAL MEDICINE, 2024, 7 (01):
  • [38] LLMs, Turing tests and Chinese rooms: the prospects for meaning in large language models
    Borg, Emma
    INQUIRY-AN INTERDISCIPLINARY JOURNAL OF PHILOSOPHY, 2025,
  • [39] Adversarial attacks and defenses for large language models (LLMs): methods, frameworks & challenges
    Kumar, Pranjal
    INTERNATIONAL JOURNAL OF MULTIMEDIA INFORMATION RETRIEVAL, 2024, 13 (03)
  • [40] Game of LLMs: Discovering Structural Constructs in Activities using Large Language Models
    Hiremath, Shruthi K.
    Plotz, Thomas
    COMPANION OF THE 2024 ACM INTERNATIONAL JOINT CONFERENCE ON PERVASIVE AND UBIQUITOUS COMPUTING, UBICOMP COMPANION 2024, 2024, : 487 - 492