The Secret Life of Software Vulnerabilities: A Large-Scale Empirical Study

Cited by: 21
Authors
Iannone, Emanuele [1]
Guadagni, Roberta [1]
Ferrucci, Filomena [1]
De Lucia, Andrea [1]
Palomba, Fabio [1]
Affiliations
[1] Univ Salerno, Software Engn SeSa Lab, I-84084 Fisciano, Italy
Funding
Swiss National Science Foundation
Keywords
Software vulnerabilities; mining software repositories; empirical software engineering; prediction models; classification; accuracy; smell
DOI
10.1109/TSE.2022.3140868
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Software vulnerabilities are weaknesses in source code that can potentially be exploited to cause loss or harm. While researchers have devised a number of methods to deal with vulnerabilities, there is still a noticeable lack of knowledge about their software engineering life cycle, for example how vulnerabilities are introduced and removed by developers. This information can be exploited to design more effective methods for vulnerability prevention and detection, as well as to understand the granularity at which these methods should aim. To investigate the life cycle of known software vulnerabilities, we focus on how, when, and under which circumstances the contributions to the introduction of vulnerabilities in software projects are made, as well as how long they remain and how they are removed. We consider 3,663 vulnerabilities with public patches from the National Vulnerability Database (pertaining to 1,096 open-source software projects on GitHub) and define an eight-step process involving both automated parts (e.g., using a procedure based on the SZZ algorithm to find the vulnerability-contributing commits) and manual analyses (e.g., how vulnerabilities were fixed). The investigated vulnerabilities can be classified into 144 categories, take on average at least 4 contributing commits before being introduced, and half of them remain unfixed for more than one year. Most of the contributions are made by developers with a high workload, often during maintenance activities, and the vulnerabilities are removed mostly by adding new source code that implements further checks on inputs. We conclude by distilling practical implications on how vulnerability detectors should work to assist developers in identifying these issues in a timely manner.
Pages: 44-63
Page count: 20