Toward Large-Scale Vulnerability Discovery using Machine Learning

Cited by: 146
Authors
Grieco, Gustavo [1]
Grinblat, Guillermo Luis [1]
Uzal, Lucas [1]
Rawat, Sanjay [2, 4]
Feist, Josselin [3]
Mounier, Laurent [3]
Affiliations
[1] CIFASIS CONICET, Rosario, Santa Fe, Argentina
[2] Vrije Univ Amsterdam, Syst Secur Grp, Amsterdam, Netherlands
[3] Univ Grenoble Alps, VERIMAG, Grenoble, France
[4] IIIT Hyderabad, Hyderabad, Telangana, India
DOI
10.1145/2857705.2857720
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
With the sustained growth of software complexity, finding security vulnerabilities in operating systems has become a necessity. Nowadays, operating systems are shipped with thousands of binary executables. Unfortunately, methodologies and tools for OS-scale program testing within a limited time budget are still missing. In this paper we present an approach that uses lightweight static and dynamic features to predict, using machine learning techniques, whether a test case is likely to contain a software vulnerability. To show the effectiveness of our approach, we set up a large experiment to detect easily exploitable memory corruptions using 1039 Debian programs obtained from the Debian bug tracker, collected 138,308 unique execution traces, and statically explored 76,083 different subsequences of function calls. We managed to predict with reasonable accuracy which programs contained dangerous memory corruptions. We also developed and implemented VDiscover, a tool that uses state-of-the-art machine learning techniques to predict vulnerabilities in test cases. The tool will be released as open source to encourage research on vulnerability discovery at a large scale, together with VDiscovery, a public dataset that collects the raw analyzed data.
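As a rough illustration of the idea in the abstract (predicting a vulnerability label from subsequences of function calls), the following minimal sketch counts call bigrams in a trace and feeds them to a small multinomial naive Bayes classifier. Everything here is an assumption made for illustration: the traces, the labels, the function names, and the choice of naive Bayes are hypothetical stand-ins and are not taken from the paper or from VDiscover.

```python
from collections import Counter
import math


def ngram_counts(calls, n=2):
    """Count contiguous length-n subsequences of a function-call trace."""
    return Counter(tuple(calls[i:i + n]) for i in range(len(calls) - n + 1))


def train(traces, labels, n=2):
    """Fit a multinomial naive Bayes model over call n-grams.

    Assumes both labels (0 = not flagged, 1 = flagged) occur at least once.
    """
    counts = {0: Counter(), 1: Counter()}
    docs = Counter(labels)
    for calls, y in zip(traces, labels):
        counts[y].update(ngram_counts(calls, n))
    vocab = set(counts[0]) | set(counts[1])
    return {"counts": counts, "docs": docs, "vocab": vocab, "n": n}


def predict(model, calls):
    """Return the label with the higher Laplace-smoothed log score."""
    total_docs = sum(model["docs"].values())
    vocab_size = len(model["vocab"])
    scores = {}
    for y in (0, 1):
        total = sum(model["counts"][y].values())
        score = math.log(model["docs"][y] / total_docs)
        for gram, k in ngram_counts(calls, model["n"]).items():
            if gram in model["vocab"]:  # n-grams unseen in training are ignored
                score += k * math.log((model["counts"][y][gram] + 1) / (total + vocab_size))
        scores[y] = score
    return max(scores, key=scores.get)


# Hypothetical toy traces; 1 marks a trace labeled as a memory corruption.
traces = [
    ["fopen", "fread", "strcpy", "free"],
    ["malloc", "strcpy", "free", "free"],
    ["fopen", "fread", "fclose"],
    ["malloc", "memset", "free"],
]
labels = [1, 1, 0, 0]
model = train(traces, labels)
print(predict(model, ["fread", "strcpy", "free"]))
```

A real experiment at the scale described above would need held-out evaluation and handling of class imbalance, which this sketch omits.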
Pages: 85-96
Page count: 12
Related papers
50 in total
  • [21] Chimera: Large-Scale Classification using Machine Learning, Rules, and Crowdsourcing
    Sun, Chong
    Rampalli, Narasimhan
    Yang, Frank
    Doan, Anhai
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2014, 7 (13): 1529-1540
  • [22] Deep learning large-scale drug discovery and repurposing
    Yu, Min
    Li, Weiming
    Yu, Yunru
    Zhao, Yu
    Xiao, Lizhi
    Lauschke, Volker M.
    Cheng, Yiyu
    Zhang, Xingcai
    Wang, Yi
    NATURE COMPUTATIONAL SCIENCE, 2024, 4 (08): 600-614
  • [23] Open Challenges in Developing Generalizable Large-Scale Machine-Learning Models for Catalyst Discovery
    Kolluru, Adeesh
    Shuaibi, Muhammed
    Palizhati, Aini
    Shoghi, Nima
    Das, Abhishek
    Wood, Brandon
    Zitnick, C. Lawrence
    Kitchin, John R.
    Ulissi, Zachary W.
    ACS CATALYSIS, 2022, 12 (14): 8572-8581
  • [24] Efficient Large-Scale Machine Learning Techniques for Rapid Motif Discovery in Energy Data Streams
    Lykothanasi, K. K.
    Sioutas, S.
    Tsichlas, K.
    ARTIFICIAL INTELLIGENCE APPLICATIONS AND INNOVATIONS, AIAI 2022, PART I, 2022, 646: 331-342
  • [25] Toward a Large-Scale Characterization of the Learning Chain Reaction
    Samsonovich, Alexei V.
    COGNITION IN FLUX, 2010: 2308-2313
  • [26] Ageing Analysis of Embedded SRAM on a Large-Scale Testbed Using Machine Learning
    Lanzieri, Leandro
    Kietzmann, Peter
    Fey, Goerschwin
    Schlarb, Holger
    Schmidt, Thomas C.
    2023 26TH EUROMICRO CONFERENCE ON DIGITAL SYSTEM DESIGN, DSD 2023, 2023: 335-342
  • [27] A GENERIC TRUST FRAMEWORK FOR LARGE-SCALE OPEN SYSTEMS USING MACHINE LEARNING
    Liu, Xin
    Tredan, Gilles
    Datta, Anwitaman
    COMPUTATIONAL INTELLIGENCE, 2014, 30 (04): 700-721
  • [28] Large-Scale Machine Learning for Business Sector Prediction
    Angenent, Mitch N.
    Barata, Antonio Pereira
    Takes, Frank W.
    PROCEEDINGS OF THE 35TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING (SAC'20), 2020: 1143-1146
  • [29] Compressed Linear Algebra for Large-Scale Machine Learning
    Elgohary, Ahmed
    Boehm, Matthias
    Haas, Peter J.
    Reiss, Frederick R.
    Reinwald, Berthold
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2016, 9 (12): 960-971
  • [30] Angel: a new large-scale machine learning system
    Jiang, Jie
    Yu, Lele
    Jiang, Jiawei
    Liu, Yuhong
    Cui, Bin
    NATIONAL SCIENCE REVIEW, 2018, 5 (02): 216-236