VCCFinder: Finding Potential Vulnerabilities in Open-Source Projects to Assist Code Audits

Cited by: 155
Authors
Perl, Henning [1]
Dechand, Sergej [2]
Smith, Matthew [1,2]
Arp, Daniel [3]
Yamaguchi, Fabian [3]
Rieck, Konrad [3]
Fahl, Sascha [4]
Acar, Yasemin [4]
Affiliations
[1] Fraunhofer FKIE, Wachtberg, Germany
[2] Univ Bonn, Bonn, Germany
[3] Univ Gottingen, Gottingen, Germany
[4] Saarland Univ, Saarbrucken, Germany
Keywords
Vulnerabilities; Static Analysis; Machine Learning;
DOI
10.1145/2810103.2813604
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Despite the security community's best effort, the number of serious vulnerabilities discovered in software is increasing rapidly. In theory, security audits should find and remove the vulnerabilities before the code ever gets deployed. However, due to the enormous amount of code being produced, as well as the lack of manpower and expertise, not all code is sufficiently audited. Thus, many vulnerabilities slip into production systems. A best-practice approach is to use a code metric analysis tool, such as Flawfinder, to flag potentially dangerous code so that it can receive special attention. However, because these tools have a very high false-positive rate, the manual effort needed to find vulnerabilities remains overwhelming. In this paper, we present a new method of finding potentially dangerous code in code repositories with a significantly lower false-positive rate than comparable systems. We combine code-metric analysis with metadata gathered from code repositories to help code review teams prioritize their work. The paper makes three contributions. First, we conducted the first large-scale mapping of CVEs to GitHub commits in order to create a vulnerable commit database. Second, based on this database, we trained an SVM classifier to flag suspicious commits. Compared to Flawfinder, our approach reduces the number of false alarms by over 99% at the same level of recall. Finally, we present a thorough quantitative and qualitative analysis of our approach and discuss lessons learned from the results. We will share the database as a benchmark for future research and will also provide our analysis tool as a web service.
Pages: 426-437
Page count: 12