Enhanced automated code vulnerability repair using large language models

Cited by: 2
Authors
de-Fitero-Dominguez, David [1]
Garcia-Lopez, Eva [1]
Garcia-Cabot, Antonio [1]
Martinez-Herraiz, Jose-Javier [1]
Affiliations
[1] Univ Alcala, Dept Ciencias Computac, Alcala De Henares 28805, Spain
Keywords
Automated code repair; Deep learning; Large language models; Vulnerability repair; Mistral; Code llama;
DOI
10.1016/j.engappai.2024.109291
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
This research addresses the complex challenge of automated repair of code vulnerabilities, vital for enhancing digital security in an increasingly technology-driven world. The study introduces a novel and efficient format for the representation of code modification, using advanced Large Language Models (LLMs) such as Code Llama and Mistral. These models, fine-tuned on datasets featuring C/C++ code vulnerabilities, significantly improve the accuracy and adaptability of automated code repair techniques. A key finding is the enhanced repair accuracy of these models when compared to previous methods such as VulRepair, which underscores their practical utility and efficiency. The research also offers a critical assessment of current evaluation metrics, such as "Perfect Predictions", and their limitations in reflecting the true capabilities of automated repair models in real-world scenarios. Following this, it underscores the importance of using test datasets devoid of training samples, emphasizing the need for dataset integrity to enhance the effectiveness of LLMs in code repair tasks. The significance of this work is its contribution to digital security, setting new standards for automated code vulnerability repair and paving the way for future advancements in the fields of cybersecurity and artificial intelligence. The study not only highlights the potential of LLMs in enhancing code security but also fosters further exploration and research in these crucial areas.
Pages: 13
Related papers
50 in total
  • [31] Evaluating Impact of Conventional Code Analysis Against Large Language Models in API Vulnerability Detection
    Yildirim, Recep
    Aydin, Kerem
    Cetin, Orcun
    PROCEEDINGS OF THE 2024 EUROPEAN INTERDISCIPLINARY CYBERSECURITY CONFERENCE, EICC 2024, 2024, : 57 - 64
  • [32] Advanced Smart Contract Vulnerability Detection using Large Language Models
    Erfan, Fatemeh
    Yahyatabar, Mohammad
    Bellaiche, Martine
    Halabi, Talal
    2024 8TH CYBER SECURITY IN NETWORKING CONFERENCE, CSNET, 2024, : 289 - 296
  • [33] Training Language Models for Programming Feedback Using Automated Repair Tools
    Koutcheme, Charles
    ARTIFICIAL INTELLIGENCE IN EDUCATION, AIED 2023, 2023, 13916 : 830 - 835
  • [34] Automated Large Program Repair based on Big Code
    Hoang Van Thuy
    Phan Viet Anh
    Nguyen Xuan Hoai
    PROCEEDINGS OF THE NINTH INTERNATIONAL SYMPOSIUM ON INFORMATION AND COMMUNICATION TECHNOLOGY (SOICT 2018), 2018, : 375 - 381
  • [35] Finetuning Large Language Models for Vulnerability Detection
    Shestov, Aleksei
    Levichev, Rodion
    Mussabayev, Ravil
    Maslov, Evgeny
    Zadorozhny, Pavel
    Cheshkov, Anton
    Mussabayev, Rustam
    Toleu, Alymzhan
    Tolegen, Gulmira
    Krassovitskiy, Alexander
    IEEE ACCESS, 2025, 13 : 38889 - 38900
  • [36] Automatic Unit Test Code Generation Using Large Language Models
    Ocal, Akdeniz Kutay
    Keskinoz, Mehmet
    32ND IEEE SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE, SIU 2024, 2024,
  • [37] Multilingual Code Co-evolution using Large Language Models
    Zhang, Jiyang
    Nie, Pengyu
    Li, Junyi Jessy
    Gligoric, Milos
    ESEC/FSE 2023 - Proceedings of the 31st ACM Joint Meeting European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2023, : 695 - 707
  • [38] Enhancing Network Management Using Code Generated by Large Language Models
    Mani, Sathiya Kumaran
    Zhou, Yajie
    Hsieh, Kevin
    Segarra, Santiago
    Eberl, Trevor
    Azulai, Eliran
    Frizler, Ido
    Chandra, Ranveer
    Kandula, Srikanth
    PROCEEDINGS OF THE 22ND ACM WORKSHOP ON HOT TOPICS IN NETWORKS, HOTNETS 2023, 2023, : 196 - 204
  • [40] Copiloting the Copilots: Fusing Large Language Models with Completion Engines for Automated Program Repair
    Wei, Yuxiang
    Xia, Chunqiu Steven
    Zhang, Lingming
    PROCEEDINGS OF THE 31ST ACM JOINT MEETING EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING, ESEC/FSE 2023, 2023, : 172 - 184