CoaCor: Code Annotation for Code Retrieval with Reinforcement Learning

Cited by: 68
Authors
Yao, Ziyu [1]
Peddamail, Jayavardhan Reddy [1]
Sun, Huan [1]
Affiliations
[1] Ohio State Univ, Columbus, OH 43210 USA
Keywords
Code Annotation; Code Retrieval; Reinforcement Learning
DOI
10.1145/3308558.3313632
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
To accelerate software development, much research has been performed to help people understand and reuse the huge amount of available code resources. Two important tasks have been widely studied: code retrieval, which aims to retrieve code snippets relevant to a given natural language query from a code base, and code annotation, where the goal is to annotate a code snippet with a natural language description. Despite their advancement in recent years, the two tasks are mostly explored separately. In this work, we investigate a novel perspective of Code annotation for Code retrieval (hence called "CoaCor"), where a code annotation model is trained to generate a natural language annotation that can represent the semantic meaning of a given code snippet and can be leveraged by a code retrieval model to better distinguish relevant code snippets from others. To this end, we propose an effective framework based on reinforcement learning, which explicitly encourages the code annotation model to generate annotations that can be used for the retrieval task. Through extensive experiments, we show that code annotations generated by our framework are much more detailed and more useful for code retrieval, and they can further improve the performance of existing code retrieval models significantly.
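The idea of rewarding an annotation by how well it retrieves its own code snippet can be illustrated with a small sketch. The abstract does not specify the reward, so everything here is an assumption made for illustration: the reward is taken to be the reciprocal rank of the target snippet, and `token_overlap` is a hypothetical toy scorer standing in for a trained code retrieval model.

```python
from typing import Callable, Sequence


def token_overlap(annotation: str, snippet: str) -> float:
    """Toy stand-in for a retrieval model: Jaccard overlap of lowercase tokens."""
    a = set(annotation.lower().split())
    b = set(snippet.lower().split())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def reciprocal_rank_reward(
    annotation: str,
    target_index: int,
    snippets: Sequence[str],
    score: Callable[[str, str], float] = token_overlap,
) -> float:
    """Assumed reward: 1 / rank of the annotated snippet among all candidates."""
    scores = [score(annotation, s) for s in snippets]
    target = scores[target_index]
    # Ties count against the target, so an annotation that matches
    # nothing is ranked last instead of first.
    rank = 1 + sum(
        1 for i, s in enumerate(scores) if i != target_index and s >= target
    )
    return 1.0 / rank


snippets = [
    "select name from users where age > 30",
    "open file and read all lines",
    "sort list of numbers in descending order",
]

# Both annotations describe snippets[0]; only the detailed one separates it
# from the other candidates.
detailed_reward = reciprocal_rank_reward(
    "select name of users with age above 30", 0, snippets
)
generic_reward = reciprocal_rank_reward("get data", 0, snippets)
print(detailed_reward, generic_reward)
```

In a reinforcement learning setup of the kind the abstract describes, a scalar like this could serve as the signal that pushes the annotation model toward more specific, retrieval-friendly descriptions; how the actual framework computes and applies its reward is not stated in this record.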
Pages: 2203-2214
Number of pages: 12