Toward Large-Scale Vulnerability Discovery using Machine Learning

Cited by: 146
Authors
Grieco, Gustavo [1]
Grinblat, Guillermo Luis [1]
Uzal, Lucas [1]
Rawat, Sanjay [2, 4]
Feist, Josselin [3]
Mounier, Laurent [3]
Affiliations
[1] CIFASIS CONICET, Rosario, Santa Fe, Argentina
[2] Vrije Univ Amsterdam, Syst Secur Grp, Amsterdam, Netherlands
[3] Univ Grenoble Alps, VERIMAG, Grenoble, France
[4] IIIT Hyderabad, Hyderabad, Telangana, India
DOI
10.1145/2857705.2857720
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
With the sustained growth of software complexity, finding security vulnerabilities in operating systems has become a necessity. Operating systems now ship with thousands of binary executables. Unfortunately, methodologies and tools for testing programs at OS scale within a limited time budget are still missing. In this paper we present an approach that uses lightweight static and dynamic features, together with machine learning techniques, to predict whether a test case is likely to contain a software vulnerability. To show the effectiveness of our approach, we set up a large experiment to detect easily exploitable memory corruptions using 1,039 Debian programs obtained from the Debian bug tracker, collected 138,308 unique execution traces, and statically explored 76,083 different subsequences of function calls. We managed to predict with reasonable accuracy which programs contained dangerous memory corruptions. We also developed and implemented VDiscover, a tool that uses state-of-the-art machine learning techniques to predict vulnerabilities in test cases. The tool will be released as open source to encourage research on vulnerability discovery at a large scale, together with VDiscovery, a public dataset that collects the raw analyzed data.
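The abstract mentions features built from subsequences of function calls. As a rough illustration only, and not the authors' implementation, the sketch below counts contiguous call subsequences (n-grams) in a single trace. The helper names `call_ngrams` and `bag_of_ngrams`, the choice of contiguous n-grams, and the sample trace are all assumptions made for this example.

```python
from collections import Counter
from typing import List, Tuple


def call_ngrams(trace: List[str], n: int) -> List[Tuple[str, ...]]:
    """Return every contiguous length-n subsequence of a call trace."""
    return [tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)]


def bag_of_ngrams(trace: List[str], max_n: int = 3) -> Counter:
    """Count all contiguous subsequences of length 1..max_n."""
    counts: Counter = Counter()
    for n in range(1, max_n + 1):
        counts.update(call_ngrams(trace, n))
    return counts


# Hypothetical trace of C library calls observed for one test case.
trace = ["fopen", "fread", "strcpy", "fclose"]
features = bag_of_ngrams(trace, max_n=2)
```

A count vector like `features` could then be fed to any standard classifier; which model and which feature encoding the paper actually uses is not stated in this record.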
Pages: 85-96
Number of pages: 12