CoaCor: Code Annotation for Code Retrieval with Reinforcement Learning

Cited by: 68
Authors
Yao, Ziyu [1]
Peddamail, Jayavardhan Reddy [1]
Sun, Huan [1]
Affiliations
[1] Ohio State Univ, Columbus, OH 43210 USA
Keywords
Code Annotation; Code Retrieval; Reinforcement Learning;
DOI
10.1145/3308558.3313632
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
To accelerate software development, much research has been performed to help people understand and reuse the huge amount of available code resources. Two important tasks have been widely studied: code retrieval, which aims to retrieve code snippets relevant to a given natural language query from a code base, and code annotation, where the goal is to annotate a code snippet with a natural language description. Despite their advancement in recent years, the two tasks are mostly explored separately. In this work, we investigate a novel perspective of Code annotation for Code retrieval (hence called "CoaCor"), where a code annotation model is trained to generate a natural language annotation that can represent the semantic meaning of a given code snippet and can be leveraged by a code retrieval model to better distinguish relevant code snippets from others. To this end, we propose an effective framework based on reinforcement learning, which explicitly encourages the code annotation model to generate annotations that can be used for the retrieval task. Through extensive experiments, we show that code annotations generated by our framework are much more detailed and more useful for code retrieval, and they can further improve the performance of existing code retrieval models significantly.
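The idea of rewarding an annotation by how well it retrieves its own source snippet can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from this record: the bag-of-words cosine scorer stands in for a trained retrieval model, the reciprocal-rank reward is one plausible choice of retrieval signal, and the function names (cosine, reciprocal_rank_reward, reinforce_loss) are hypothetical.

```python
import math
from collections import Counter
from typing import Sequence


def cosine(a: str, b: str) -> float:
    """Bag-of-words cosine similarity (a stand-in for a learned retrieval model)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def reciprocal_rank_reward(annotation: str, source_idx: int,
                           code_base: Sequence[str]) -> float:
    """Use the annotation as a query and return 1 / rank of its source snippet."""
    scores = [cosine(annotation, code) for code in code_base]
    target = scores[source_idx]
    # Rank = 1 + number of other snippets scored strictly higher than the source.
    rank = 1 + sum(1 for i, s in enumerate(scores)
                   if i != source_idx and s > target)
    return 1.0 / rank


def reinforce_loss(log_prob: float, reward: float, baseline: float) -> float:
    """REINFORCE-style loss for one sampled annotation (assumed training signal)."""
    return -(reward - baseline) * log_prob


# Toy code base: each "snippet" is reduced to a few descriptive tokens.
code_base = [
    "open file read lines",
    "sort list of numbers descending",
    "connect to sql database",
]

detailed = "read lines from open file"   # specific annotation for snippet 0
vague = "do something to list"           # uninformative annotation for snippet 0

detailed_reward = reciprocal_rank_reward(detailed, 0, code_base)
vague_reward = reciprocal_rank_reward(vague, 0, code_base)
```

In this toy setting the detailed annotation ranks its source snippet first, while the vague one ranks it last, so a policy-gradient update weighted by this reward would favor the more specific wording.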
Pages: 2203-2214
Page count: 12
Related papers
50 in total
  • [1] AutoAnnotate: Reinforcement Learning based Code Annotation for High Level Synthesis
    Shahzad, Hafsah
    Sanaullah, Ahmed
    Arora, Sanjay
    Drepper, Uli
    Herbordt, Martin
    2024 25TH INTERNATIONAL SYMPOSIUM ON QUALITY ELECTRONIC DESIGN, ISQED 2024, 2024,
  • [2] Reinforcement Learning of Code Search Sessions
    Li, Wei
    Yan, Shuhan
    Shen, Beijun
    Chen, Yuting
    2019 26TH ASIA-PACIFIC SOFTWARE ENGINEERING CONFERENCE (APSEC), 2019, : 458 - 465
  • [3] Boosting Code Search with Structural Code Annotation
    Kong, Xianglong
    Chen, Hongyu
    Yu, Ming
    Zhang, Lixiang
    ELECTRONICS, 2022, 11 (19)
  • [4] Leveraging Code Generation to Improve Code Retrieval and Summarization via Dual Learning
    Ye, Wei
    Xie, Rui
    Zhang, Jinglei
    Hu, Tianxiang
    Wang, Xiaoyin
    Zhang, Shikun
    WEB CONFERENCE 2020: PROCEEDINGS OF THE WORLD WIDE WEB CONFERENCE (WWW 2020), 2020, : 2309 - 2319
  • [5] IRMA Code II: Unique Annotation of Medical Images for Access and Retrieval
    Piesch, Tim-Christian
    Mueller, Henning
    Kuhl, Christiane K.
    Deserno, Thomas M.
    QUALITY OF LIFE THROUGH QUALITY OF INFORMATION, 2012, 180 : 159 - 163
  • [6] Reinforcement Learning for Nested Polar Code Construction
    Huang, Lingchen
    Zhang, Huazi
    Li, Rong
    Ge, Yiqun
    Wang, Jun
    2019 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2019,
  • [7] Prevalence of Code Smells in Reinforcement Learning Projects
    Cardozo, Nicolas
    Dusparic, Ivana
    Cabrera, Christian
    2023 IEEE/ACM 2ND INTERNATIONAL CONFERENCE ON AI ENGINEERING - SOFTWARE ENGINEERING FOR AI, CAIN, 2023, : 37 - 42
  • [8] Learning to Code: Coded Caching via Deep Reinforcement Learning
    Naderializadeh, Navid
    Asghari, Seyed Mohammad
    CONFERENCE RECORD OF THE 2019 FIFTY-THIRD ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS & COMPUTERS, 2019, : 1774 - 1778
  • [9] Retrieval on Source Code: A Neural Code Search
    Sachdev, Saksham
    Li, Hongyu
    Luan, Sifei
    Kim, Seohyun
    Sen, Koushik
    Chandra, Satish
    MAPL'18: PROCEEDINGS OF THE 2ND ACM SIGPLAN INTERNATIONAL WORKSHOP ON MACHINE LEARNING AND PROGRAMMING LANGUAGES, 2018, : 31 - 41
  • [10] Automating Reinforcement Learning Architecture Design for Code Optimization
    Wang, Huanting
    Tang, Zhanyong
    Zhang, Cheng
    Zhao, Jiaqi
    Cummins, Chris
    Leather, Hugh
    Wang, Zheng
    CC'22: PROCEEDINGS OF THE 31ST ACM SIGPLAN INTERNATIONAL CONFERENCE ON COMPILER CONSTRUCTION, 2022, : 129 - 143