Project-specific code summarization with in-context learning

Cited by: 0
Authors
Yun, Shangbo [1 ]
Lin, Shuhuai [2 ]
Gu, Xiaodong [1 ]
Shen, Beijun [1 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Sch Software, Shanghai, Peoples R China
[2] Carnegie Mellon Univ, Dept Elect & Comp Engn, Mountain View, CA USA
Funding
National Key Research and Development Program of China
Keywords
Prompt generation; Project-specific code summarization; Large language model; In-context learning;
DOI
10.1016/j.jss.2024.112149
Chinese Library Classification
TP31 [Computer Software]
Subject classification codes
081202; 0835
Abstract
Automatically generating summaries for source code has emerged as a valuable task in software development. While state-of-the-art (SOTA) approaches have demonstrated significant efficacy in summarizing general code, they seldom address code summarization for a specific project. Project-specific code summarization (PCS) poses special challenges due to the scarce availability of training data and the unique styles of different projects. In this paper, we empirically analyze the performance of Large Language Models (LLMs) on PCS tasks. Our study reveals that using appropriate prompts is an effective way to elicit project-specific code summaries from LLMs. Based on these findings, we propose a novel project-specific code summarization approach called P-CodeSum. P-CodeSum gathers a repository-level pool of (code, summary) examples to characterize the project-specific features. It then trains a neural prompt selector on a high-quality dataset crafted by LLMs using the example pool. The prompt selector offers relevant, high-quality prompts with which LLMs generate project-specific summaries. We evaluate against a variety of baseline approaches on six PCS datasets. Experimental results show that P-CodeSum improves BLEU-4 by 5.9% (over RLPG) to 101.51% (over CodeBERT) compared to state-of-the-art approaches in project-specific code summarization.
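To illustrate the in-context-learning setup the abstract describes, the following is a minimal Python sketch of assembling a few-shot prompt from a repository-level pool of (code, summary) pairs. The token-overlap ranking, function names, and prompt template here are illustrative assumptions only; P-CodeSum trains a neural prompt selector rather than using this lexical heuristic.

# Minimal sketch of few-shot prompt construction for project-specific code
# summarization. The example pool, similarity metric, and prompt template are
# assumptions for illustration; the paper's approach learns a prompt selector
# instead of the token-overlap ranking used here.
from typing import List, Tuple

def rank_examples(query_code: str,
                  pool: List[Tuple[str, str]],
                  k: int = 3) -> List[Tuple[str, str]]:
    """Pick the k (code, summary) pairs most lexically similar to the query."""
    query_tokens = set(query_code.split())
    def overlap(pair: Tuple[str, str]) -> int:
        return len(query_tokens & set(pair[0].split()))
    return sorted(pool, key=overlap, reverse=True)[:k]

def build_prompt(query_code: str, pool: List[Tuple[str, str]]) -> str:
    """Assemble an in-context-learning prompt from project-level examples."""
    parts = []
    for code, summary in rank_examples(query_code, pool):
        parts.append(f"Code:\n{code}\nSummary: {summary}\n")
    parts.append(f"Code:\n{query_code}\nSummary:")
    return "\n".join(parts)

# Usage: in practice the pool would be mined from the target repository's
# existing (function, docstring) pairs, and the prompt sent to an LLM.
pool = [("def add(a, b):\n    return a + b", "Add two numbers."),
        ("def sub(a, b):\n    return a - b", "Subtract b from a.")]
print(build_prompt("def mul(a, b):\n    return a * b", pool))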
Pages: 11