Automatically Recommend Code Updates: Are We There Yet?

Cited by: 0

Authors:

Liu, Yue [1]

Tantithamthavorn, Chakkrit [1]

Liu, Yonghui [1]

Thongtanunam, Patanamon [2]

Li, Li [3]

Affiliations:

[1] Monash Univ, Melbourne, Australia

[2] Univ Melbourne, Melbourne, Australia

[3] Beihang Univ, Beijing, Peoples R China

Source:

ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY | 2024 / Vol. 33 / Issue 08

Funding:

Australian Research Council;

Keywords:

Code updates; neural machine translation;

DOI:

10.1145/3678167

Chinese Library Classification number:

TP31 [Computer Software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

In recent years, large pre-trained Language Models of Code (CodeLMs) have shown promising results on various software engineering tasks. One such task is automatic code update recommendation, which transforms outdated code snippets into their approved and revised counterparts. Although many CodeLM-based approaches have been proposed, claiming high accuracy, their effectiveness and reliability on real-world code update tasks remain questionable. In this article, we present the first extensive evaluation of state-of-the-art CodeLMs for automatically recommending code updates. We assess their performance on two diverse datasets of paired updated methods, considering factors such as temporal evolution, project specificity, method size, and update complexity. Our results reveal that while CodeLMs exhibit higher performance in settings that ignore temporal information, they struggle in more realistic time-wise scenarios and generalize poorly to new projects. Furthermore, CodeLM performance decreases significantly for larger methods and more complex updates. We also observe that many CodeLM-generated "updates" are actually null, especially in time-wise settings, and meaningful edits remain challenging. Our findings highlight the significant gap between the perceived and actual effectiveness of CodeLMs for real-world code update recommendation and emphasize the need for more research on improving their practicality, robustness, and generalizability.


Pages: 27
