BRAN: Reduce Vulnerability Search Space in Large Open Source Repositories by Learning Bug Symptoms

Cited by: 3

Authors:

Meng, Dongyu [1]

Guerriero, Michele [2]

Machiry, Aravind [3]

Aghakhani, Hojjat [1]

Bose, Priyanka [1]

Continella, Andrea [4]

Kruegel, Christopher [1]

Vigna, Giovanni [1]

Affiliations:

[1] University of California, Santa Barbara, Santa Barbara, CA 93106 USA

[2] Politecnico di Milano, Milan, Italy

[3] Purdue University, West Lafayette, IN 47907 USA

[4] University of Twente, Enschede, Netherlands

Source:

ASIA CCS'21: PROCEEDINGS OF THE 2021 ACM ASIA CONFERENCE ON COMPUTER AND COMMUNICATIONS SECURITY | 2021

Keywords:

Static Analysis; Vulnerabilities; Machine Learning; Code Churn; Software; Metrics; Complexity; Accurate

DOI:

10.1145/3433210.3453115

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Software is continually increasing in size and complexity, and therefore, vulnerability discovery would benefit from techniques that identify potentially vulnerable regions within large code bases, as this eases vulnerability detection by reducing the search space. Previous work has explored the use of conventional code-quality and complexity metrics in highlighting suspicious sections of (source) code. Recently, researchers also proposed to reduce the vulnerability search space by studying code properties with neural networks. However, previous work generally failed to leverage the rich metadata that is available for long-running, large code repositories. In this paper, we present an approach, named Bran, to reduce the vulnerability search space by combining conventional code metrics with fine-grained repository metadata. Bran locates code sections that are more likely to contain vulnerabilities in large code bases, potentially improving the efficiency of both manual and automatic code audits. In our experiments on four large code bases, Bran successfully highlights potentially vulnerable functions, outperforming several baselines, including state-of-the-art vulnerability prediction tools. We also assess Bran's effectiveness in assisting automated testing tools. We use Bran to guide syzkaller, a known kernel fuzzer, in fuzzing a recent version of the Linux kernel. The guided fuzzer identifies 26 bugs (10 are zero-day flaws), including arbitrary writes and reads.


Pages: 731-743

Page count: 13
