Automatically Recommend Code Updates: Are We There Yet?

Cited by: 0
Authors
Liu, Yue [1]
Tantithamthavorn, Chakkrit [1]
Liu, Yonghui [1]
Thongtanunam, Patanamon [2]
Li, Li [3]
Affiliations
[1] Monash University, Melbourne, Australia
[2] University of Melbourne, Melbourne, Australia
[3] Beihang University, Beijing, People's Republic of China
Funding
Australian Research Council
Keywords
Code updates; neural machine translation
DOI
10.1145/3678167
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
In recent years, large pre-trained Language Models of Code (CodeLMs) have shown promising results on various software engineering tasks. One such task is automatic code update recommendation, which transforms outdated code snippets into their approved and revised counterparts. Although many CodeLM-based approaches have been proposed, claiming high accuracy, their effectiveness and reliability on real-world code update tasks remain questionable. In this article, we present the first extensive evaluation of state-of-the-art CodeLMs for automatically recommending code updates. We assess their performance on two diverse datasets of paired updated methods, considering factors such as temporal evolution, project specificity, method size, and update complexity. Our results reveal that while CodeLMs exhibit higher performance in settings that ignore temporal information, they struggle in more realistic time-wise scenarios and generalize poorly to new projects. Furthermore, CodeLM performance decreases significantly for larger methods and more complex updates. We also observe that many CodeLM-generated "updates" are actually null, especially in time-wise settings, and that meaningful edits remain challenging. Our findings highlight the significant gap between the perceived and actual effectiveness of CodeLMs for real-world code update recommendation and emphasize the need for more research on improving their practicality, robustness, and generalizability.
Pages: 27