Large-Scale Empirical Study of Important Features Indicative of Discovered Vulnerabilities to Assess Application Security

Cited by: 21

Authors:

Zhang, Mengyuan ^{[1,2]}

de Carnavalet, Xavier de Carne ^{[1]}

Wang, Lingyu ^{[1]}

Ragab, Ahmed ^{[3,4]}

Affiliations:

[1] Concordia Univ, Concordia Inst Informat Syst Engn, Montreal, PQ H3G 1M8, Canada

[2] Ericsson Res, Montreal, PQ H4S 0B6, Canada

[3] Ecole Polytech Montreal, Math & Ind Engn Dept, Montreal, PQ H3C 3A7, Canada

[4] Menoufia Univ, Fac Elect Engn, Dept Ind Elect & Control Engn, Menoufia 32952, Egypt

Source:

IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY | 2019 / Vol. 14 / No. 09

Funding:

Natural Sciences and Engineering Research Council of Canada;

Keywords:

Software vulnerability analysis; vulnerability discovery model; software security; machine learning; COMPLEXITY;

DOI:

10.1109/TIFS.2019.2895963

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods];

Discipline classification code:

081202;

Abstract:

Existing research on vulnerability discovery models shows that the existence of vulnerabilities inside an application may be linked to certain features, e.g., size or complexity, of that application. However, the applicability of such features to demonstrate the relative security between two applications is not well studied, which may depend on multiple factors in a complex way. In this paper, we perform the first large-scale empirical study of the correlation between various features of applications and the abundance of vulnerabilities. Unlike existing work, which typically focuses on one particular application, resulting in limited successes, we focus on the more realistic issue of assessing the relative security level among different applications. To the best of our knowledge, this is the most comprehensive study of 780 real-world applications involving 6498 vulnerabilities. We apply seven feature selection methods to nine feature subsets selected among 34 collected features, which are then fed into six types of machine learning models, producing 523 estimations. The predictive power of important features is evaluated using four different performance measures. This paper reflects that the complexity of applications is not the only factor in vulnerability discovery and the human-related factors contribute to explaining the number of discovered vulnerabilities in an application.

Pages: 2315-2330

Page count: 16

