Towards Greener Yet Powerful Code Generation via Quantization: An Empirical Study

Cited by: 2
Authors
Wei, Xiaokai [1 ]
Gonugondla, Sujan Kumar [1 ]
Wang, Shiqi [1 ]
Ahmad, Wasi [1 ]
Ray, Baishakhi [1 ]
Qian, Haifeng [1 ]
Li, Xiaopeng [1 ]
Kumar, Varun [1 ]
Wang, Zijian [1 ]
Tian, Yuchen [1 ]
Sun, Qing [1 ]
Athiwaratkun, Ben [1 ]
Shang, Mingyue [1 ]
Ramanathan, Murali Krishna [1 ]
Bhatia, Parminder [1 ]
Xiang, Bing [1 ]
Affiliations
[1] AWS AI Labs, Palo Alto, CA 94303 USA
Keywords
Quantization; Code Generation; Large Language Models; Generative AI; Model Hosting;
DOI
10.1145/3611643.3616302
CLC Classification
TP31 [Computer Software];
Subject Classification
081202; 0835;
Abstract
ML-powered code generation aims to assist developers in writing code more productively by intelligently generating code blocks based on natural language prompts. Recently, large pretrained deep learning models have pushed the boundary of code generation and achieved impressive performance. However, the huge number of model parameters poses a significant challenge to their adoption in a typical software development environment, where a developer might use a standard laptop or mid-size server to develop code. Such large models cost significant resources in terms of memory, latency, dollars, as well as carbon footprint. Model compression is a promising approach to address these challenges. We have identified quantization as one of the most promising compression techniques for code generation, as it avoids expensive retraining costs. Because quantization represents model parameters with lower-bit integers (e.g., int8), both the model size and the runtime latency benefit. We empirically evaluate quantized models on code generation tasks across different dimensions: (i) resource usage and carbon footprint, (ii) accuracy, and (iii) robustness. Through systematic experiments, we find a code-aware quantization recipe that can run even a 6-billion-parameter model on a regular laptop without significant accuracy or robustness degradation. We find that the recipe is readily applicable to the code summarization task as well.
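For concreteness (this record does not spell out the paper's code-aware recipe), the sketch below illustrates the general technique the abstract describes: post-training dynamic int8 quantization, which rewrites weights as lower-bit integers without any retraining. The toy module and its dimensions are hypothetical stand-ins for a transformer feed-forward block, and the PyTorch flow shown is a generic one, not the authors' exact setup.

```python
import io

import torch
import torch.nn as nn

# Hypothetical stand-in for one transformer feed-forward block;
# real code-generation models stack many such blocks.
model = nn.Sequential(
    nn.Linear(512, 2048),
    nn.GELU(),
    nn.Linear(2048, 512),
)

# Post-training dynamic quantization: nn.Linear weights are stored as
# int8, and activations are quantized on the fly at inference time.
# No retraining or calibration pass is required.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

def size_mb(module: nn.Module) -> float:
    """Serialize a module in memory and report its size in MB."""
    buf = io.BytesIO()
    torch.save(module.state_dict(), buf)
    return buf.getbuffer().nbytes / 1e6

# int8 weights take roughly 4x less space than fp32, and integer
# kernels typically reduce inference latency on CPUs.
print(f"fp32: {size_mb(model):.2f} MB -> int8: {size_mb(quantized):.2f} MB")
```

This is why the abstract pairs quantization with both the memory and the latency dimensions: only the stored precision of the weights changes, so the compression comes without the retraining cost of other compression techniques.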
Pages: 224 - 236
Number of pages: 13