BRAN: Reduce Vulnerability Search Space in Large Open Source Repositories by Learning Bug Symptoms

Cited by: 3
Authors
Meng, Dongyu [1 ]
Guerriero, Michele [2 ]
Machiry, Aravind [3 ]
Aghakhani, Hojjat [1 ]
Bose, Priyanka [1 ]
Continella, Andrea [4 ]
Kruegel, Christopher [1 ]
Vigna, Giovanni [1 ]
Affiliations
[1] UC Santa Barbara, Santa Barbara, CA 93106 USA
[2] Politecn Milan, Milan, Italy
[3] Purdue Univ, W Lafayette, IN 47907 USA
[4] Univ Twente, Enschede, Netherlands
Keywords
Static Analysis; Vulnerabilities; Machine Learning; Code Churn; Software; Metrics; Complexity; Accurate
DOI
10.1145/3433210.3453115
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Software is continually increasing in size and complexity. Vulnerability discovery would therefore benefit from techniques that identify potentially vulnerable regions within large code bases, since narrowing the search space eases vulnerability detection. Previous work has explored the use of conventional code-quality and complexity metrics to highlight suspicious sections of source code. More recently, researchers have also proposed reducing the vulnerability search space by studying code properties with neural networks. However, previous work has generally failed to leverage the rich metadata available for long-running, large code repositories. In this paper, we present an approach, named Bran, that reduces the vulnerability search space by combining conventional code metrics with fine-grained repository metadata. Bran locates code sections that are more likely to contain vulnerabilities in large code bases, potentially improving the efficiency of both manual and automatic code audits. In our experiments on four large code bases, Bran successfully highlights potentially vulnerable functions, outperforming several baselines, including state-of-the-art vulnerability prediction tools. We also assess Bran's effectiveness in assisting automated testing tools: we use Bran to guide syzkaller, a well-known kernel fuzzer, in fuzzing a recent version of the Linux kernel. The guided fuzzer identifies 26 bugs (10 of which are zero-day flaws), including arbitrary writes and reads.
Pages: 731-743
Page count: 13