Improving Automatic Source Code Summarization via Deep Reinforcement Learning

Citations: 236

Authors:

Wan, Yao ^{[1]}

Zhao, Zhou ^{[2]}

Yang, Min ^{[3]}

Xu, Guandong

Ying, Haochao ^{[1]}

Wu, Jian ^{[1]}

Yu, Philip S. ^{[4,5]}

Affiliations:

[1] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou, Peoples R China

[2] Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen, Peoples R China

[3] Univ Technol Sydney, Adv Analyt Inst, Sydney, NSW, Australia

[4] Univ Illinois, Chicago, IL USA

[5] Tsinghua Univ, Inst Data Sci, Beijing, Peoples R China

Source:

PROCEEDINGS OF THE 2018 33RD IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING (ASE '18) | 2018

Keywords:

Code summarization; comment generation; deep learning; reinforcement learning; GO;

DOI:

10.1145/3238147.3238206

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

Code summarization provides a high-level natural language description of the function performed by code, which benefits software maintenance, code categorization, and retrieval. To the best of our knowledge, most state-of-the-art approaches follow an encoder-decoder framework that encodes the code into a hidden space and then decodes it into natural language space. These approaches suffer from two major drawbacks: a) their encoders consider only the sequential content of code, ignoring the tree structure, which is also critical for code summarization; b) their decoders are typically trained to predict the next word by maximizing the likelihood of the next ground-truth word given the previous ground-truth word. At test time, however, the decoder is expected to generate the entire sequence from scratch. This discrepancy can cause an exposure bias issue, making the learned decoder suboptimal. In this paper, we incorporate the abstract syntax tree structure as well as the sequential content of code snippets into a deep reinforcement learning framework (i.e., an actor-critic network). The actor network provides the confidence of predicting the next word according to the current state. The critic network, in turn, evaluates the reward value of all possible extensions of the current state and can provide global guidance for exploration. We employ an advantage reward based on the BLEU metric to train both networks. Comprehensive experiments on a real-world dataset show the effectiveness of our proposed model compared with several state-of-the-art methods.
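
The advantage-based training signal described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy vocabulary, the fixed critic values, and the use of clipped unigram precision as a simplified stand-in for BLEU are all assumptions made for illustration. It shows only the general idea that the actor is pushed toward sampled words whose sequence-level reward exceeds the critic's estimate, while the critic is regressed toward the observed reward.

```python
import math
from collections import Counter


def unigram_precision(candidate, reference):
    """Clipped unigram precision: a simplified stand-in for BLEU (assumption)."""
    if not candidate:
        return 0.0
    ref_counts = Counter(reference)
    matched = sum(min(n, ref_counts[tok]) for tok, n in Counter(candidate).items())
    return matched / len(candidate)


def softmax(logits):
    """Turn raw actor scores into a probability distribution over words."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def actor_critic_losses(step_logits, sampled_ids, critic_values, reward):
    """Return (actor_loss, critic_loss) for one sampled summary.

    step_logits:   per-step actor scores over the vocabulary
    sampled_ids:   word id the actor sampled at each step
    critic_values: the critic's predicted reward at each step
    reward:        sequence-level reward of the sampled summary
    """
    actor_loss = 0.0
    critic_loss = 0.0
    for logits, word_id, value in zip(step_logits, sampled_ids, critic_values):
        advantage = reward - value  # how much better than the critic expected
        log_prob = math.log(softmax(logits)[word_id])
        actor_loss += -advantage * log_prob  # policy-gradient surrogate loss
        critic_loss += (value - reward) ** 2  # squared error against the reward
    return actor_loss, critic_loss


# Hypothetical toy data: a four-word vocabulary and a three-word reference.
vocab = ["return", "the", "sum", "list"]
reference = ["return", "the", "sum"]
sampled_ids = [0, 1, 3]  # the actor sampled "return the list"
sampled = [vocab[i] for i in sampled_ids]
reward = unigram_precision(sampled, reference)

step_logits = [[2.0, 0.1, 0.1, 0.1], [0.1, 2.0, 0.1, 0.1], [0.1, 0.1, 1.0, 1.0]]
critic_values = [0.5, 0.5, 0.5]  # made-up critic predictions
actor_loss, critic_loss = actor_critic_losses(
    step_logits, sampled_ids, critic_values, reward
)
```

In a real system both losses would be minimized by gradient descent over neural network parameters, and the critic would produce a separate estimate for each decoding state rather than the constant values used here.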


Pages: 397-407

Page count: 11

50 items in total

[1] Automatic Documentation Generation via Source Code Summarization
McBurney, Paul W.
[J]. 2015 IEEE/ACM 37TH IEEE INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, VOL 2, 2015, : 903 - 906
[2] Automatic Text Summarization Using Deep Reinforcement Learning and Beyond
Sun, Gang
Wang, Zhongxin
Zhao, Jia
[J]. INFORMATION TECHNOLOGY AND CONTROL, 2021, 50 (03): : 458 - 469
[3] A Survey of Automatic Source Code Summarization
Zhang, Chunyan
Wang, Junchao
Zhou, Qinglei
Xu, Ting
Tang, Ke
Gui, Hairen
Liu, Fudong
[J]. SYMMETRY-BASEL, 2022, 14 (03):
[4] Reinforcement-Learning-Guided Source Code Summarization Using Hierarchical Attention
Wang, Wenhua
Zhang, Yuqun
Sui, Yulei
Wan, Yao
Zhao, Zhou
Wu, Jian
Yu, Philip S.
Xu, Guandong
[J]. IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2022, 48 (01) : 102 - 119
[5] Improving Deep Reinforcement Learning via Transfer
Du, Yunshu
[J]. AAMAS '19: PROCEEDINGS OF THE 18TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS, 2019, : 2405 - 2407
[6] Learning to Code: Coded Caching via Deep Reinforcement Learning
Naderializadeh, Navid
Asghari, Seyed Mohammad
[J]. CONFERENCE RECORD OF THE 2019 FIFTY-THIRD ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS & COMPUTERS, 2019, : 1774 - 1778
[7] Interactive Query-Assisted Summarization via Deep Reinforcement Learning
Shapira, Ori
Pasunuru, Ramakanth
Bansal, Mohit
Dagan, Ido
Amsterdamer, Yael
[J]. NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022, : 2551 - 2568
[8] Automatic Bug Triaging via Deep Reinforcement Learning
Liu, Yong
Qi, Xuexin
Zhang, Jiali
Li, Hui
Ge, Xin
Ai, Jun
[J]. APPLIED SCIENCES-BASEL, 2022, 12 (07):
[9] Automatic source code summarization with graph attention networks
Zhou, Yu
Shen, Juanjuan
Zhang, Xiaoqing
Yang, Wenhua
Han, Tingting
Chen, Taolue
[J]. JOURNAL OF SYSTEMS AND SOFTWARE, 2022, 188
[10] Improving Automated Source Code Summarization via an Eye-Tracking Study of Programmers
Rodeghero, Paige
McMillan, Collin
McBurney, Paul W.
Bosch, Nigel
D'Mello, Sidney
[J]. 36TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2014), 2014, : 390 - 401
