Assemble Foundation Models for Automatic Code Summarization

Cited by: 12
Authors
Gu, Jian [1 ]
Salza, Pasquale [1 ]
Gall, Harald C. [1 ]
Affiliation
[1] Univ Zurich, Zurich, Switzerland
Keywords
transfer learning; adaptive scheme; Transformer; Gaussian noise; code summarization;
DOI
10.1109/SANER53432.2022.00112
CLC Number
TP31 [Computer Software]
Discipline Code
081202; 0835
Abstract
Automatic code summarization benefits software development and maintenance because it reduces the burden of manual documentation. Artificial intelligence is currently undergoing a paradigm shift: foundation models pretrained on massive data and finetuned for downstream tasks surpass specially customized models. This trend inspired us to reuse foundation models instead of learning from scratch. On this basis, we propose a flexible and robust neural approach to automatic code summarization. We assemble available foundation models, such as CodeBERT and GPT-2, into a single model named AdaMo. Moreover, we utilize Gaussian noise as a simulation of contextual information to optimize the latent representation. Furthermore, we introduce two adaptive schemes from the perspective of knowledge transfer, namely continuous pretraining and intermediate finetuning, and design intermediate-stage tasks for general sequence-to-sequence learning. Finally, we evaluate AdaMo on a benchmark dataset for code summarization by comparing it with state-of-the-art models.
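As a rough illustration of the assembly the abstract describes, the following is a minimal sketch that wires a pretrained encoder (CodeBERT) to a pretrained decoder (GPT-2) with Hugging Face's EncoderDecoderModel and perturbs the encoder's latent representation with Gaussian noise. The checkpoint names, the noise scale, and the placement of the noise are illustrative assumptions, not the authors' exact AdaMo configuration or training pipeline.

```python
# Minimal sketch (assumptions noted): assemble CodeBERT as encoder and GPT-2 as
# decoder into one sequence-to-sequence model, then add Gaussian noise to the
# encoder's latent representation as a stand-in for contextual information.
import torch
from transformers import EncoderDecoderModel, RobertaTokenizerFast

# Assemble the foundation models; the cross-attention weights are newly
# initialized and would still need finetuning on a code-summarization corpus.
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "microsoft/codebert-base",  # encoder checkpoint (assumed)
    "gpt2",                     # decoder checkpoint (assumed)
)
tokenizer = RobertaTokenizerFast.from_pretrained("microsoft/codebert-base")

def encode_with_noise(code_snippet: str, noise_std: float = 0.1) -> torch.Tensor:
    """Encode source code and perturb the latent representation with Gaussian
    noise (noise_std is a hypothetical scale, not taken from the paper)."""
    inputs = tokenizer(code_snippet, return_tensors="pt", truncation=True)
    hidden = model.encoder(**inputs).last_hidden_state
    return hidden + noise_std * torch.randn_like(hidden)

# Usage: noisy latent representation for one Java method.
latent = encode_with_noise("public int add(int a, int b) { return a + b; }")
print(latent.shape)  # (1, sequence_length, hidden_size)
```

The paper's adaptive schemes (continuous pretraining and intermediate finetuning) and its intermediate-stage sequence-to-sequence tasks are beyond this sketch; it only shows how existing foundation models can be reused as the building blocks of a single summarization model.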
Pages: 935 - 946
Page count: 12