Toward Large-Scale Vulnerability Discovery using Machine Learning

Cited by: 146

Authors:

Grieco, Gustavo [1]

Grinblat, Guillermo Luis [1]

Uzal, Lucas [1]

Rawat, Sanjay [2,4]

Feist, Josselin [3]

Mounier, Laurent [3]

Affiliations:

[1] CIFASIS CONICET, Rosario, Santa Fe, Argentina

[2] Vrije Univ Amsterdam, Syst Secur Grp, Amsterdam, Netherlands

[3] Univ Grenoble Alps, VERIMAG, Grenoble, France

[4] IIIT Hyderabad, Hyderabad, Telangana, India

Source:

CODASPY'16: PROCEEDINGS OF THE SIXTH ACM CONFERENCE ON DATA AND APPLICATION SECURITY AND PRIVACY | 2016

Keywords:

DOI:

10.1145/2857705.2857720

Chinese Library Classification code:

TP [Automation Technology, Computer Technology]

Subject classification code:

0812

Abstract:

With the sustained growth of software complexity, finding security vulnerabilities in operating systems has become a necessity. Operating systems now ship with thousands of binary executables. Unfortunately, methodologies and tools for testing programs at OS scale within a limited time budget are still missing. In this paper we present an approach that uses lightweight static and dynamic features, together with machine learning techniques, to predict whether a test case is likely to contain a software vulnerability. To show the effectiveness of our approach, we set up a large experiment to detect easily exploitable memory corruptions using 1039 Debian programs obtained from the Debian bug tracker, collected 138,308 unique execution traces, and statically explored 76,083 different subsequences of function calls. We managed to predict with reasonable accuracy which programs contained dangerous memory corruptions. We also developed and implemented VDiscover, a tool that uses state-of-the-art machine learning techniques to predict vulnerabilities in test cases. The tool will be released as open source to encourage research on vulnerability discovery at a large scale, together with VDiscovery, a public dataset that collects the raw analyzed data.


Pages: 85-96

Page count: 12
