Operating Conversational Large Language Models (LLMs) in the Presence of Errors

Cited: 0
Authors
Gao, Zhen [1 ]
Deng, Jie [2 ]
Reviriego, Pedro [3 ]
Liu, Shanshan [4 ]
Pozo, Alejandro [3]
Lombardi, Fabrizio [5 ]
Affiliations
[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China
[2] Tianjin Univ, Sch Future Technol, Tianjin 300072, Peoples R China
[3] Univ Politecn Madrid, ETSI Telecomunicac, Madrid 28040, Spain
[4] Univ Elect Sci & Technol China, Sch Informat & Commun Engn, Chengdu 611731, Peoples R China
[5] Northeastern Univ, Dept Elect & Comp Engn, Boston, MA 02215 USA
Keywords
Quantization (signal); Benchmark testing; Transformers; Codes; Translation; Memory management; Logic gates; Integrated circuit modeling; Hardware; Computational modeling; Dependability; generative artificial intelligence; large language models; errors
DOI
10.1109/MNANO.2024.3513112
Chinese Library Classification
TB3 [Engineering Materials Science]
Subject Classification Code
0805; 080502
Abstract
Conversational Large Language Models (LLMs) have taken center stage in the artificial intelligence landscape. As they become pervasive, there is a need to evaluate their dependability, i.e., their performance when errors appear due to the underlying hardware implementation. In this paper we evaluate the dependability of a widely used conversational LLM, Mistral-7B. Error injection is conducted, and the Massive Multitask Language Understanding (MMLU) benchmark is used to evaluate the impact on performance. The drop in the percentage of correct answers due to errors is analyzed, and the results provide interesting insights: Mistral-7B has a large intrinsic tolerance to errors, even at high bit error rates. This opens the door to the use of nanotechnologies that trade errors for energy dissipation and complexity to further improve the LLM implementation. The error tolerance is also larger for 8-bit quantization than for 4-bit quantization, suggesting a trade-off between quantization optimizations that reduce memory requirements and error tolerance. In addition, we show the differing impact of errors on different types of weights, which is valuable information for selective protection designs.
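The abstract describes the methodology only at a high level. As a rough illustration of what bit-level error injection into quantized weights can look like, here is a minimal PyTorch sketch; it assumes int8 weight storage and a uniform, independent bit-flip model at a given bit error rate (BER), and the function name `inject_bit_flips` is an illustrative assumption, not the paper's exact procedure.

```python
import torch

def inject_bit_flips(weights: torch.Tensor, ber: float) -> torch.Tensor:
    """Return a copy of an int8-quantized weight tensor in which every
    stored bit is flipped independently with probability `ber` (assumed model)."""
    w = weights.clone().view(torch.uint8)    # reinterpret the same bits as unsigned
    for b in range(8):                       # one Bernoulli mask per bit position
        flip = torch.rand(w.shape) < ber     # True where bit `b` should be flipped
        w[flip] ^= (1 << b)                  # XOR toggles the selected bit
    return w.view(torch.int8)                # reinterpret back to signed storage

# Hypothetical usage: corrupt a stand-in weight matrix at BER = 1e-4; the
# corrupted model would then be evaluated on a benchmark such as MMLU.
q = torch.randint(-128, 128, (4096, 4096), dtype=torch.int8)
q_err = inject_bit_flips(q, ber=1e-4)
```

Under this assumed model, repeating the evaluation across BER values, quantization widths, and weight types (e.g., attention versus feed-forward matrices) would yield the kind of sensitivity analysis the abstract describes.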
Pages: 31-37
Page count: 7
Related Papers
50 records in total
  • [41] Gotta, Jennifer; Hong, Quang Anh Le; Koch, Vitali; Gruenewald, Leon D.; Geyer, Tobias; Martin, Simon S.; Scholtz, Jan-Erik; Booz, Christian; Dos Santos, Daniel Pinto; Mahmoudi, Scherwin; Eichler, Katrin; Gruber-Rouh, Tatjana; Hammerstingl, Renate; Biciusca, Teodora; Juergens, Lisa Joy; Hoehne, Elena; Mader, Christoph; Vogl, Thomas J.; Reschke, Philipp. Large language models (LLMs) in radiology exams for medical students: Performance and consequences. ROFO-FORTSCHRITTE AUF DEM GEBIET DER RONTGENSTRAHLEN UND DER BILDGEBENDEN VERFAHREN, 2024.
  • [42] Ku, Anthony Y.; Hool, Alessandra. Capabilities and limitations of AI Large Language Models (LLMs) for materials criticality research. MINERAL ECONOMICS, 2024.
  • [43] Nadel, Peter; Maloney, Delilah; Monahan, Kyle M. Enabling access to large-language models (LLMs) at scale for higher education. PRACTICE AND EXPERIENCE IN ADVANCED RESEARCH COMPUTING 2024, PEARC 2024, 2024.
  • [44] Haltaufderheide, Joschka; Ranisch, Robert. The ethics of ChatGPT in medicine and healthcare: a systematic review on Large Language Models (LLMs). NPJ DIGITAL MEDICINE, 2024, 7 (01).
  • [45] Hiremath, Shruthi K.; Plotz, Thomas. Game of LLMs: Discovering Structural Constructs in Activities using Large Language Models. COMPANION OF THE 2024 ACM INTERNATIONAL JOINT CONFERENCE ON PERVASIVE AND UBIQUITOUS COMPUTING, UBICOMP COMPANION 2024, 2024: 487-492.
  • [46] Borg, Emma. LLMs, Turing tests and Chinese rooms: the prospects for meaning in large language models. INQUIRY-AN INTERDISCIPLINARY JOURNAL OF PHILOSOPHY, 2025.
  • [47] Kumar, Pranjal. Adversarial attacks and defenses for large language models (LLMs): methods, frameworks & challenges. INTERNATIONAL JOURNAL OF MULTIMEDIA INFORMATION RETRIEVAL, 2024, 13 (03).
  • [48] Barek, Md Abdul; Rahman, Md Mostafizur; Akter, Mst Shapna; Riad, A. B. M. Kamrul Islam; Rahman, Md Abdur; Shahriar, Hossain; Rahman, Akond; Wu, Fan. Mitigating Insecure Outputs in Large Language Models (LLMs): A Practical Educational Module. 2024 IEEE 48TH ANNUAL COMPUTERS, SOFTWARE, AND APPLICATIONS CONFERENCE, COMPSAC 2024, 2024: 2424-2429.
  • [49] Ogunde, Fife. Legal large language models (LLMs): legal dynamos or “fancifully packaged ChatGPT”? DISCOVER ARTIFICIAL INTELLIGENCE, 5 (1).
  • [50] Krishna, Karthik; Bandili, Ramana. EchoSwift: An Inference Benchmarking and Configuration Discovery Tool for Large Language Models (LLMs). COMPANION OF THE 15TH ACM/SPEC INTERNATIONAL CONFERENCE ON PERFORMANCE ENGINEERING, ICPE COMPANION 2024, 2024: 158-162.