CoaCor: Code Annotation for Code Retrieval with Reinforcement Learning

Cited by: 68

Authors:

Yao, Ziyu [1]

Peddamail, Jayavardhan Reddy [1]

Sun, Huan [1]

Affiliations:

[1] Ohio State Univ, Columbus, OH 43210 USA

Source:

WEB CONFERENCE 2019: PROCEEDINGS OF THE WORLD WIDE WEB CONFERENCE (WWW 2019) | 2019

Keywords:

Code Annotation; Code Retrieval; Reinforcement Learning

DOI:

10.1145/3308558.3313632

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods]

Discipline classification code:

081202

Abstract:

To accelerate software development, much research has been performed to help people understand and reuse the huge amount of available code resources. Two important tasks have been widely studied: code retrieval, which aims to retrieve code snippets relevant to a given natural language query from a code base, and code annotation, where the goal is to annotate a code snippet with a natural language description. Despite their advancement in recent years, the two tasks are mostly explored separately. In this work, we investigate a novel perspective of Code annotation for Code retrieval (hence called "CoaCor"), where a code annotation model is trained to generate a natural language annotation that can represent the semantic meaning of a given code snippet and can be leveraged by a code retrieval model to better distinguish relevant code snippets from others. To this end, we propose an effective framework based on reinforcement learning, which explicitly encourages the code annotation model to generate annotations that can be used for the retrieval task. Through extensive experiments, we show that code annotations generated by our framework are much more detailed and more useful for code retrieval, and they can further improve the performance of existing code retrieval models significantly.
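
The idea of rewarding an annotation by how well it retrieves its own code snippet can be illustrated with a toy sketch. This is not the paper's implementation: the bag-of-words scorer, the reciprocal-rank reward, and the names `cosine`, `retrieval_reward`, and `code_base` are all assumptions made for illustration, and the abstract does not specify the actual retrieval model or reward.

```python
import math
from collections import Counter


def cosine(a: str, b: str) -> float:
    """Bag-of-words cosine similarity (a stand-in for a learned retrieval model)."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[t] * cb[t] for t in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieval_reward(annotation: str, target_idx: int, code_base: list[str]) -> float:
    """Hypothetical reward: reciprocal rank of the annotated snippet when the
    annotation is used as a query against the whole code base."""
    scores = [cosine(annotation, code) for code in code_base]
    target = scores[target_idx]
    # Rank = 1 + number of other snippets scored strictly higher (ties favor the target).
    rank = 1 + sum(1 for i, s in enumerate(scores) if i != target_idx and s > target)
    return 1.0 / rank


# Toy code base of whitespace-tokenized snippets (illustrative data only).
code_base = [
    "select name from users where id = 1",
    "open file read lines close",
    "sort list reverse order",
]

detailed_reward = retrieval_reward("select name of users by id", 0, code_base)
vague_reward = retrieval_reward("read sort", 0, code_base)
```

Under these assumptions, an annotation that shares vocabulary with its snippet ranks it first and earns a higher reward than a vague one. In a reinforcement learning setup, a scalar of this kind could serve as the return for a sampled annotation in a policy-gradient update.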


Pages: 2203-2214

Page count: 12

