Large-Scale Empirical Study of Important Features Indicative of Discovered Vulnerabilities to Assess Application Security

Cited by: 21
Authors
Zhang, Mengyuan [1, 2]
de Carnavalet, Xavier de Carne [1]
Wang, Lingyu [1]
Ragab, Ahmed [3, 4]
Affiliations
[1] Concordia Univ, Concordia Inst Informat Syst Engn, Montreal, PQ H3G 1M8, Canada
[2] Ericsson Res, Montreal, PQ H4S 0B6, Canada
[3] Ecole Polytech Montreal, Math & Ind Engn Dept, Montreal, PQ H3C 3A7, Canada
[4] Menoufia Univ, Fac Elect Engn, Dept Ind Elect & Control Engn, Menoufia 32952, Egypt
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
Software vulnerability analysis; vulnerability discovery model; software security; machine learning; COMPLEXITY;
DOI
10.1109/TIFS.2019.2895963
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202 ;
Abstract
Existing research on vulnerability discovery models shows that the existence of vulnerabilities inside an application may be linked to certain features, e.g., size or complexity, of that application. However, the applicability of such features to demonstrate the relative security between two applications is not well studied, which may depend on multiple factors in a complex way. In this paper, we perform the first large-scale empirical study of the correlation between various features of applications and the abundance of vulnerabilities. Unlike existing work, which typically focuses on one particular application, with limited success, we focus on the more realistic issue of assessing the relative security level among different applications. To the best of our knowledge, this is the most comprehensive study of 780 real-world applications involving 6498 vulnerabilities. We apply seven feature selection methods to nine feature subsets selected among 34 collected features, which are then fed into six types of machine learning models, producing 523 estimations. The predictive power of important features is evaluated using four different performance measures. The results indicate that the complexity of applications is not the only factor in vulnerability discovery and that human-related factors contribute to explaining the number of discovered vulnerabilities in an application.
Pages: 2315-2330
Number of pages: 16