Project-specific code summarization with in-context learning

Times Cited: 0
Authors
Yun, Shangbo [1 ]
Lin, Shuhuai [2 ]
Gu, Xiaodong [1 ]
Shen, Beijun [1 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Sch Software, Shanghai, Peoples R China
[2] Carnegie Mellon Univ, Dept Elect & Comp Engn, Mountain View, CA USA
Funding
National Key R&D Program of China
Keywords
Prompt generation; Project-specific code summarization; Large language model; In-context learning;
DOI
10.1016/j.jss.2024.112149
Chinese Library Classification (CLC)
TP31 [Computer software]
Discipline Codes
081202; 0835
Abstract
Automatically generating summaries for source code has emerged as a valuable task in software development. While state-of-the-art (SOTA) approaches have demonstrated significant efficacy in summarizing general code, they seldom address summarization for a specific project. Project-specific code summarization (PCS) poses special challenges due to the scarcity of training data and the unique styles of different projects. In this paper, we empirically analyze the performance of Large Language Models (LLMs) on PCS tasks. Our study reveals that using appropriate prompts is an effective way to elicit project-specific code summaries from LLMs. Based on these findings, we propose a novel project-specific code summarization approach called P-CodeSum. P-CodeSum gathers a repository-level pool of (code, summary) examples to characterize the project-specific features. It then trains a neural prompt selector on a high-quality dataset crafted by LLMs using the example pool. The prompt selector offers relevant, high-quality prompts for LLMs to generate project-specific summaries. We evaluate against a variety of baseline approaches on six PCS datasets. Experimental results show that P-CodeSum improves BLEU-4 by 5.9% (over RLPG) to 101.51% (over CodeBERT) compared with state-of-the-art approaches to project-specific code summarization.
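The abstract describes retrieving repository-level (code, summary) examples and assembling them into an in-context prompt. The sketch below illustrates that general idea only; it is not the paper's implementation. P-CodeSum uses a trained neural prompt selector, whereas this simplified stand-in ranks pool examples by Jaccard token overlap with the query code, then concatenates the top matches into a few-shot prompt. All function names here are hypothetical.

```python
import re

def tokenize(code):
    """Split code into a set of lowercase identifier-like tokens."""
    return set(re.findall(r"[A-Za-z_]\w*", code.lower()))

def select_examples(query_code, pool, k=2):
    """Pick the k (code, summary) pairs most similar to the query.
    Similarity is Jaccard overlap of token sets -- a cheap proxy for
    the neural prompt selector described in the paper."""
    q = tokenize(query_code)
    def sim(pair):
        c = tokenize(pair[0])
        union = q | c
        return len(q & c) / len(union) if union else 0.0
    return sorted(pool, key=sim, reverse=True)[:k]

def build_prompt(query_code, pool, k=2):
    """Assemble a few-shot prompt: retrieved examples, then the query."""
    parts = []
    for code, summary in select_examples(query_code, pool, k):
        parts.append(f"Code:\n{code}\nSummary: {summary}\n")
    parts.append(f"Code:\n{query_code}\nSummary:")
    return "\n".join(parts)
```

The resulting string would be sent to an LLM, which completes the final "Summary:" line in the style of the retrieved project-specific examples.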
Pages: 11