Evaluating Explanations for Software Patches Generated by Large Language Models

Cited by: 0
Authors
Sobania, Dominik [1 ]
Geiger, Alina [1 ]
Callan, James [2 ]
Brownlee, Alexander [3 ]
Hanna, Carol [2 ]
Moussa, Rebecca [2 ]
Lopez, Mar Zamorano [2 ]
Petke, Justyna [2 ]
Sarro, Federica [2 ]
Affiliations
[1] Johannes Gutenberg Univ Mainz, Mainz, Germany
[2] UCL, London, England
[3] Univ Stirling, Stirling, Scotland
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
Large Language Models; Software Patches; AI Explainability; Program Repair; Genetic Improvement;
DOI
10.1007/978-3-031-48796-5_12
Chinese Library Classification
TP31 [Computer Software];
Discipline Code
081202; 0835;
Abstract
Large language models (LLMs) have recently been integrated into a variety of applications, including software engineering tasks. In this work, we study the use of LLMs to enhance the explainability of software patches. In particular, we evaluate the performance of GPT-3.5 in explaining patches generated by the search-based automated program repair system ARJA-e for 30 bugs from the popular Defects4J benchmark. We also investigate the performance achieved when explaining the corresponding patches written by software developers. We find that, on average, 84% of the LLM explanations for machine-generated patches were correct and 54% were complete for the studied categories in at least 1 out of 3 runs. Furthermore, we find that the LLM generates more accurate explanations for machine-generated patches than for human-written ones.
Pages: 147 - 152
Number of pages: 6
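
The evaluation setup described in the abstract, prompting GPT-3.5 to explain a software patch and repeating each query several times (the study uses 3 runs per patch), could look roughly like the minimal sketch below. The prompt wording, the example diff, the model name string, and the use of the openai Python client are assumptions for illustration only; they are not taken from the paper.

```python
# Illustrative sketch only: prompt wording, example diff, and client usage are
# assumptions, not the paper's actual experimental code.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

# Hypothetical unified diff standing in for an ARJA-e or developer patch.
PATCH = """\
--- a/src/main/java/org/example/Foo.java
+++ b/src/main/java/org/example/Foo.java
@@ -42,7 +42,7 @@
-        if (index > values.length) {
+        if (index >= values.length) {
             throw new IndexOutOfBoundsException();
         }
"""

def explain_patch(patch: str, runs: int = 3) -> list[str]:
    """Ask the model to explain the same patch several times (mirroring 3 runs)."""
    explanations = []
    for _ in range(runs):
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",  # stand-in for the GPT-3.5 model evaluated in the study
            messages=[
                {"role": "user",
                 "content": "Explain what the following software patch changes "
                            "and why it fixes the bug:\n\n" + patch},
            ],
        )
        explanations.append(response.choices[0].message.content)
    return explanations

if __name__ == "__main__":
    for i, text in enumerate(explain_patch(PATCH), start=1):
        print(f"--- Explanation (run {i}) ---\n{text}\n")
```

Each returned explanation would then be judged manually for correctness and completeness, as reported in the abstract.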