Project-specific code summarization with in-context learning

Times Cited: 0
Authors
Yun, Shangbo [1 ]
Lin, Shuhuai [2 ]
Gu, Xiaodong [1 ]
Shen, Beijun [1 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Sch Software, Shanghai, Peoples R China
[2] Carnegie Mellon Univ, Dept Elect & Comp Engn, Mountain View, CA USA
Funding
National Key R&D Program of China
Keywords
Prompt generation; Project-specific code summarization; Large language model; In-context learning;
DOI
10.1016/j.jss.2024.112149
Chinese Library Classification (CLC)
TP31 [Computer software]
Discipline Codes
081202; 0835
Abstract
Automatically generating summaries for source code has emerged as a valuable task in software development. While state-of-the-art (SOTA) approaches have demonstrated significant efficacy in summarizing general code, they seldom address summarization for a specific project. Project-specific code summarization (PCS) poses special challenges due to the scarcity of training data and the unique styles of different projects. In this paper, we empirically analyze the performance of Large Language Models (LLMs) on PCS tasks. Our study reveals that using appropriate prompts is an effective way to elicit project-specific code summaries from LLMs. Based on these findings, we propose a novel project-specific code summarization approach called P-CodeSum. P-CodeSum gathers a repository-level pool of (code, summary) examples to characterize the project-specific features. It then trains a neural prompt selector on a high-quality dataset crafted by LLMs using the example pool. The prompt selector offers relevant, high-quality prompts for LLMs to generate project-specific summaries. We evaluate against a variety of baseline approaches on six PCS datasets. Experimental results show that P-CodeSum improves BLEU-4 by 5.9% (over RLPG) to 101.51% (over CodeBERT) compared with state-of-the-art approaches to project-specific code summarization.
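The abstract describes retrieving repository-level (code, summary) examples and assembling them into an in-context prompt. The sketch below illustrates that general idea only; it is not the paper's implementation. P-CodeSum uses a trained neural prompt selector, whereas this simplified stand-in ranks pool examples by Jaccard token overlap with the query code, then concatenates the top matches into a few-shot prompt. All function names here are hypothetical.

```python
import re

def tokenize(code):
    """Split code into a set of lowercase identifier-like tokens."""
    return set(re.findall(r"[A-Za-z_]\w*", code.lower()))

def select_examples(query_code, pool, k=2):
    """Pick the k (code, summary) pairs most similar to the query.
    Similarity is Jaccard overlap of token sets -- a cheap proxy for
    the neural prompt selector described in the paper."""
    q = tokenize(query_code)
    def sim(pair):
        c = tokenize(pair[0])
        union = q | c
        return len(q & c) / len(union) if union else 0.0
    return sorted(pool, key=sim, reverse=True)[:k]

def build_prompt(query_code, pool, k=2):
    """Assemble a few-shot prompt: retrieved examples, then the query."""
    parts = []
    for code, summary in select_examples(query_code, pool, k):
        parts.append(f"Code:\n{code}\nSummary: {summary}\n")
    parts.append(f"Code:\n{query_code}\nSummary:")
    return "\n".join(parts)
```

The resulting string would be sent to an LLM, which completes the final "Summary:" line in the style of the retrieved project-specific examples.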
Pages: 11