Hardware-oriented algorithms for softmax and layer normalization of large language models

Cited: 0
Authors
Li, Wenjie [1]
Lyu, Dongxu [1]
Wang, Gang [1]
Hu, Aokun [1]
Xu, Ningyi [1]
He, Guanghui [1,2,3]
Affiliations
[1] Shanghai Jiao Tong Univ, Sch Elect Informat & Elect Engn, Shanghai 200241, Peoples R China
[2] Shanghai Jiao Tong Univ, Dept Micro Nano Elect, Shanghai 200241, Peoples R China
[3] Shanghai Jiao Tong Univ, MoE Key Lab Artificial Intelligence, Shanghai 200241, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
large language model; softmax; layer normalization; hardware architecture; Transformer;
DOI
10.1007/s11432-024-4137-4
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
As large language models (LLMs) spark a new revolution in natural language processing (NLP), their hardware accelerators have garnered tremendous attention. However, softmax and layer normalization, the most common non-linear operations in LLMs, are frequently overlooked. This paper presents hardware-oriented algorithms for both the softmax and the layer normalization of LLMs. We propose an approximate approach to implementing the division in softmax and extend it to simultaneously compute the square root and perform the division in layer normalization; it replaces the original computation with multiplication and shifting. For softmax, we further approximate the exponential function by truncating its exponent and then reuse the subtraction already involved. For layer normalization, we additionally simplify the computation of the denominator by directly removing the term involving the square of the mean. Furthermore, hardware architectures are developed for the proposed softmax and layer normalization algorithms. They can work as plug-and-play units for LLM accelerators, requiring no fine-tuning and introducing negligible performance loss. Compared with state-of-the-art designs, the proposed softmax architecture saves up to 23.45% of area cost and 17.39% of power consumption, while the proposed layer normalization architecture saves up to 32.70% of area cost and 14.29% of power consumption.
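To make the softmax ideas concrete, below is a minimal Python sketch of the general flavor of such schemes: the exponential is rewritten in base 2 so that its truncated integer exponent becomes a pure shift, and the reciprocal of the denominator is obtained by leading-one detection with one multiply and one shift. The function name shift_softmax, the (2 - g) reciprocal correction, and the use of floating point in place of fixed point are illustrative assumptions, not the paper's exact scheme.

```python
import math

LOG2E = math.log2(math.e)  # constant for rewriting e^t as 2^(t * log2(e))

def shift_softmax(x):
    """Shift-and-multiply softmax sketch (illustrative, floating point)."""
    m = max(x)  # max subtraction: the reused subtraction, keeps u <= 0
    exps = []
    for xi in x:
        u = (xi - m) * LOG2E      # e^(xi - m) = 2^u
        k = math.floor(u)         # truncated integer exponent -> a shift
        f = u - k                 # fractional part in [0, 1)
        exps.append((2.0 ** k) * (2.0 ** f))  # 2^f: small LUT/poly in HW
    s = sum(exps)
    # Reciprocal via leading-one detection: s = 2^k * (1 + g), g in [0, 1),
    # and 1/s ~= (2 - g) * 2^(-(k + 1)): one multiply plus one shift.
    k = math.floor(math.log2(s))
    g = s / (2.0 ** k) - 1.0
    inv_s = (2.0 - g) * (2.0 ** (-(k + 1)))
    return [e * inv_s for e in exps]
```

The (2 - g) correction shown here is exact at power-of-two denominators and within roughly 12% in between; an actual hardware design would tighten it with a small lookup table or an extra correction term.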
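For layer normalization, the abstract describes two ideas: fusing the square root and the division into a single approximate reciprocal square root built from multiplication and shifting, and dropping the mean-square term from the variance. The sketch below combines both; approx_rsqrt, simplified_layernorm, the linear mantissa fit, the scalar gamma/beta, and the omitted epsilon are hedged assumptions for illustration, not the paper's design.

```python
import math

SQRT_HALF = 1.0 / math.sqrt(2.0)  # 2^(-1/2), folded in for odd exponents

def approx_rsqrt(v):
    """1/sqrt(v) via exponent halving (a shift) plus one multiply."""
    k = math.floor(math.log2(v))   # leading-one position: v = 2^k * (1 + f)
    f = v / (2.0 ** k) - 1.0       # mantissa fraction in [0, 1)
    corr = 1.0 - (1.0 - SQRT_HALF) * f   # linear fit of 1/sqrt(1 + f)
    r = 2.0 ** (-(k // 2))               # halve the exponent -> a shift
    if k % 2:                            # odd exponent: multiply by 2^(-1/2)
        r *= SQRT_HALF
    return r * corr

def simplified_layernorm(x, gamma, beta):
    """LayerNorm with the mean-square term removed from the variance."""
    n = len(x)
    mu = sum(x) / n
    mean_sq = sum(xi * xi for xi in x) / n  # E[x^2]; the mu^2 term is dropped
    inv_std = approx_rsqrt(mean_sq)         # fused square root + division
    return [gamma * (xi - mu) * inv_std + beta for xi in x]
```

Dropping the mu^2 term of Var = E[x^2] - mu^2 is reasonable when the squared mean is small relative to E[x^2]; the abstract reports that this simplification costs negligible performance for LLMs.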
Pages: 15
Related Papers
50 records in total
  • [31] Hardware Design and Verification with Large Language Models: A Scoping Review, Challenges, and Open Issues
    Abdollahi, Meisam
    Yeganli, Seyedeh Faegheh
    Baharloo, Mohammad
    Baniasadi, Amirali
    ELECTRONICS, 2025, 14 (01)
  • [32] Are Large Language Models All You Need for Task-Oriented Dialogue?
    Hudecek, Vojtech
    Dusek, Ondrej
    24TH MEETING OF THE SPECIAL INTEREST GROUP ON DISCOURSE AND DIALOGUE, SIGDIAL 2023, 2023: 216 - 228
  • [33] Enhancing Troubleshooting Task-Oriented Dialog Systems with Large Language Models
    Zhou, Jiahao
    Zhang, Qiang
    Zhang, Fengda
    Yuan, Caixia
    INTELLIGENT ROBOTICS AND APPLICATIONS, ICIRA 2024, PT VI, 2025, 15206 : 328 - 338
  • [34] Towards the Integration of Large Language Models in an Object-Oriented Programming Course
    Cipriano, Bruno Pereira
    PROCEEDINGS OF THE 2024 CONFERENCE INNOVATION AND TECHNOLOGY IN COMPUTER SCIENCE EDUCATION, VOL 2, ITICSE 2024, 2024: 832 - 833
  • [35] Thinking Enough? Evaluating Advanced Large Language Models' Reasoning Algorithms in HEOR
    Swami, S.
    Srivastava, T.
    VALUE IN HEALTH, 2024, 27 (12)
  • [36] Large language models facilitate the generation of electronic health record phenotyping algorithms
    Yan, Chao
    Ong, Henry H.
    Grabowska, Monika E.
    Krantz, Matthew S.
    Su, Wu-Chen
    Dickson, Alyson L.
    Peterson, Josh F.
    Feng, QiPing
    Roden, Dan M.
    Stein, C. Michael
    Kerchberger, V. Eric
    Malin, Bradley A.
    Wei, Wei-Qi
    JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2024, 31 (09) : 1994 - 2001
  • [37] When Large Language Models Meet Evolutionary Algorithms: Potential Enhancements and Challenges
    Wang, Chao
    Zhao, Jiaxuan
    Jiao, Licheng
    Li, Lingling
    Liu, Fang
    Yang, Shuyuan
    RESEARCH, 2025, 8
  • [38] Jailbreaking Pre-trained Large Language Models Towards Hardware Vulnerability Insertion Ability
    Wan, Gwok-Waa
    Wong, Sam-Zaak
    Wang, Xi
    PROCEEDINGS OF THE GREAT LAKES SYMPOSIUM ON VLSI 2024, GLSVLSI 2024, 2024: 579 - 582
  • [39] Integrating Large Language Models and Metaverse in Autonomous Racing: An Education-Oriented Perspective
    Li, Bai
    Xu, Tian'ao
    Li, Xinyuan
    Cui, Yaodong
    Bian, Xuepeng
    Teng, Siyu
    Ma, Siji
    Fan, Lili
    Tian, Yonglin
    Wang, Fei-Yue
    IEEE TRANSACTIONS ON INTELLIGENT VEHICLES, 2024, 9 (01): 59 - 64
  • [40] A Comprehensive Framework for Evaluating API-oriented Code Generation in Large Language Models
    Wu, Yixi
    He, Pengfei
    Wang, Zehao
    Wang, Shaowei
    Tian, Yuan
    Chen, Tse-Hsun
    arXiv