Towards Greener Yet Powerful Code Generation via Quantization: An Empirical Study

Cited by: 2
Authors
Wei, Xiaokai [1 ]
Gonugondla, Sujan Kumar [1 ]
Wang, Shiqi [1 ]
Ahmad, Wasi [1 ]
Ray, Baishakhi [1 ]
Qian, Haifeng [1 ]
Li, Xiaopeng [1 ]
Kumar, Varun [1 ]
Wang, Zijian [1 ]
Tian, Yuchen [1 ]
Sun, Qing [1 ]
Athiwaratkun, Ben [1 ]
Shang, Mingyue [1 ]
Ramanathan, Murali Krishna [1 ]
Bhatia, Parminder [1 ]
Xiang, Bing [1 ]
Affiliations
[1] AWS AI Labs, Palo Alto, CA 94303 USA
Keywords
Quantization; Code Generation; Large Language Models; Generative AI; Model Hosting;
DOI
10.1145/3611643.3616302
CLC classification
TP31 [Computer software];
Subject classification codes
081202 ; 0835 ;
Abstract
ML-powered code generation aims to help developers write code more productively by intelligently generating code blocks from natural language prompts. Recently, large pretrained deep learning models have pushed the boundary of code generation and achieved impressive performance. However, the huge number of model parameters poses a significant challenge to their adoption in a typical software development environment, where a developer might use a standard laptop or mid-size server to develop code. Such large models cost significant resources in terms of memory, latency, dollars, and carbon footprint. Model compression is a promising approach to address these challenges. We identify quantization as one of the most promising compression techniques for code generation, as it avoids expensive retraining costs. Because quantization represents model parameters with lower-bit integers (e.g., int8), both model size and runtime latency benefit. We empirically evaluate quantized models on code generation tasks across three dimensions: (i) resource usage and carbon footprint, (ii) accuracy, and (iii) robustness. Through systematic experiments, we find a code-aware quantization recipe that can run even a 6-billion-parameter model on a regular laptop without significant degradation in accuracy or robustness. We find that the recipe is readily applicable to the code summarization task as well.
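The abstract's core mechanism — representing float weights with lower-bit integers such as int8 — can be illustrated with a minimal sketch of symmetric per-tensor post-training quantization. This is an illustrative toy, not the paper's code-aware recipe; the helper names (`quantize_int8`, `dequantize`) are hypothetical.

```python
# Minimal sketch of symmetric int8 post-training quantization.
# Each float weight is mapped to an integer in [-128, 127] via a
# single per-tensor scale; dequantizing recovers an approximation.

def quantize_int8(weights):
    """Map float weights to int8 values plus a per-tensor scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs else 1.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values."""
    return [x * scale for x in q]

weights = [0.42, -1.27, 0.05, 0.9]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
# Rounding error per weight is bounded by scale / 2.
```

Storing `q` instead of `weights` cuts memory roughly 4x versus fp32, which is where the size and latency benefits the abstract mentions come from; the paper's contribution is showing how to do this for billion-parameter code models without hurting accuracy or robustness.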
Pages: 224 - 236
Page count: 13