Enhanced automated code vulnerability repair using large language models

Cited by: 2

Authors:

de-Fitero-Dominguez, David [1]

Garcia-Lopez, Eva [1]

Garcia-Cabot, Antonio [1]

Martinez-Herraiz, Jose-Javier [1]

Affiliations:

[1] Univ Alcala, Dept Ciencias Computac, Alcala De Henares 28805, Spain

Source:

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE | 2024 / Vol. 138

Keywords:

Automated code repair; Deep learning; Large language models; Vulnerability repair; Mistral; Code Llama

DOI:

10.1016/j.engappai.2024.109291

Chinese Library Classification number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

This research addresses the complex challenge of automated repair of code vulnerabilities, vital for enhancing digital security in an increasingly technology-driven world. The study introduces a novel and efficient format for representing code modifications, using advanced Large Language Models (LLMs) such as Code Llama and Mistral. These models, fine-tuned on datasets featuring C/C++ code vulnerabilities, significantly improve the accuracy and adaptability of automated code repair techniques. A key finding is the enhanced repair accuracy of these models compared with previous methods such as VulRepair, which underscores their practical utility and efficiency. The research also offers a critical assessment of current evaluation metrics, such as "Perfect Predictions", and their limitations in reflecting the true capabilities of automated repair models in real-world scenarios. It further underscores the importance of using test datasets devoid of training samples, emphasizing the need for dataset integrity to enhance the effectiveness of LLMs in code repair tasks. The significance of this work is its contribution to digital security, setting new standards for automated code vulnerability repair and paving the way for future advancements in the fields of cybersecurity and artificial intelligence. The study not only highlights the potential of LLMs in enhancing code security but also encourages further exploration and research in these crucial areas.


Pages: 13

