The Secret Life of Software Vulnerabilities: A Large-Scale Empirical Study

Cited by: 21

Authors:

Iannone, Emanuele [1]

Guadagni, Roberta [1]

Ferrucci, Filomena [1]

De Lucia, Andrea [1]

Palomba, Fabio [1]

Affiliations:

[1] Univ Salerno, Software Engn SeSa Lab, I-84084 Fisciano, Italy

Source:

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING | 2023 / Vol. 49 / Issue 01

Funding:

Swiss National Science Foundation;

Keywords:

Software vulnerabilities; mining software repositories; empirical software engineering; PREDICTION MODELS; CLASSIFICATION; ACCURACY; SMELL;

DOI:

10.1109/TSE.2022.3140868

Chinese Library Classification:

TP31 [Computer Software];

Discipline classification codes:

081202 ; 0835 ;

Abstract:

Software vulnerabilities are weaknesses in source code that can potentially be exploited to cause loss or harm. While researchers have devised a number of methods to deal with vulnerabilities, there is still a noticeable lack of knowledge about their software engineering life cycle, for example how vulnerabilities are introduced and removed by developers. This information can be exploited to design more effective methods for vulnerability prevention and detection, as well as to understand the granularity at which these methods should aim. To investigate the life cycle of known software vulnerabilities, we focus on how, when, and under which circumstances the contributions to the introduction of vulnerabilities in software projects are made, as well as how long vulnerabilities remain and how they are removed. We consider 3,663 vulnerabilities with public patches from the National Vulnerability Database, pertaining to 1,096 open-source software projects on GitHub, and define an eight-step process involving both automated parts (e.g., using a procedure based on the SZZ algorithm to find the vulnerability-contributing commits) and manual analyses (e.g., how vulnerabilities were fixed). The investigated vulnerabilities can be classified into 144 categories, take on average at least 4 contributing commits before being introduced, and half of them remain unfixed for more than one year. Most of the contributions are made by developers with a high workload, often during maintenance activities, and the vulnerabilities are removed mostly by adding new source code that implements further checks on inputs. We conclude by distilling practical implications on how vulnerability detectors should work to assist developers in identifying these issues in a timely manner.


Pages: 44-63

Page count: 20

