BRAN: Reduce Vulnerability Search Space in Large Open Source Repositories by Learning Bug Symptoms

Cited by: 3
Authors
Meng, Dongyu [1]
Guerriero, Michele [2]
Machiry, Aravind [3]
Aghakhani, Hojjat [1]
Bose, Priyanka [1]
Continella, Andrea [4]
Kruegel, Christopher [1]
Vigna, Giovanni [1]
Affiliations
[1] UC Santa Barbara, Santa Barbara, CA 93106 USA
[2] Politecnico di Milano, Milan, Italy
[3] Purdue University, West Lafayette, IN 47907 USA
[4] University of Twente, Enschede, Netherlands
Keywords
Static Analysis; Vulnerabilities; Machine Learning; Code Churn; Software; Metrics; Complexity; Accurate
DOI
10.1145/3433210.3453115
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Software is continually increasing in size and complexity, so vulnerability discovery would benefit from techniques that identify potentially vulnerable regions within large code bases, since this eases vulnerability detection by reducing the search space. Previous work has explored the use of conventional code-quality and complexity metrics to highlight suspicious sections of source code. More recently, researchers have also proposed reducing the vulnerability search space by studying code properties with neural networks. However, previous work has generally failed to leverage the rich metadata that is available for long-running, large code repositories. In this paper, we present an approach, named Bran, that reduces the vulnerability search space by combining conventional code metrics with fine-grained repository metadata. Bran locates code sections that are more likely to contain vulnerabilities in large code bases, potentially improving the efficiency of both manual and automatic code audits. In our experiments on four large code bases, Bran successfully highlights potentially vulnerable functions, outperforming several baselines, including state-of-the-art vulnerability prediction tools. We also assess Bran's effectiveness in assisting automated testing tools. We use Bran to guide syzkaller, a known kernel fuzzer, in fuzzing a recent version of the Linux kernel. The guided fuzzer identifies 26 bugs (10 of which are zero-day flaws), including arbitrary writes and reads.
Pages: 731-743
Page count: 13