Grammar-based whitebox fuzzing

Cited by: 83
Authors
Godefroid, Patrice [1]
Kiezun, Adam [2]
Levin, Michael Y. [3]
Affiliations
[1] Microsoft Research, Redmond, WA, USA
[2] MIT, Computer Science and Artificial Intelligence Laboratory, Cambridge, MA 02139, USA
[3] Microsoft Center for Software Excellence, Redmond, WA, USA
Keywords
verification; algorithms; reliability; software testing; automatic test generation; grammars; program verification
DOI
10.1145/1379022.1375607
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
Whitebox fuzzing is a form of automatic dynamic test generation, based on symbolic execution and constraint solving, designed for security testing of large applications. Unfortunately, the current effectiveness of whitebox fuzzing is limited when testing applications with highly-structured inputs, such as compilers and interpreters. These applications process their inputs in stages, such as lexing, parsing and evaluation. Due to the enormous number of control paths in early processing stages, whitebox fuzzing rarely reaches parts of the application beyond those first stages. In this paper, we study how to enhance whitebox fuzzing of complex structured-input applications with a grammar-based specification of their valid inputs. We present a novel dynamic test generation algorithm where symbolic execution directly generates grammar-based constraints whose satisfiability is checked using a custom grammar-based constraint solver. We have implemented this algorithm and evaluated it on a large security-critical application, the JavaScript interpreter of Internet Explorer 7 (IE7). Results of our experiments show that grammar-based whitebox fuzzing explores deeper program paths and avoids dead-ends due to non-parsable inputs. Compared to regular whitebox fuzzing, grammar-based whitebox fuzzing increased coverage of the code generation module of the IE7 JavaScript interpreter from 53% to 81% while using three times fewer tests.
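To make the idea of a grammar-based constraint concrete, the following is a minimal sketch, not the solver described in the paper. It assumes a hypothetical toy expression grammar (`GRAMMAR`), represents a path constraint as a mapping from token positions to required tokens, and checks satisfiability by bounded brute-force enumeration of the grammar's sentences. The names `sentences` and `solve` and the `max_len` bound are illustrative assumptions.

```python
from collections import deque

# Hypothetical toy grammar over tokens (NOT the JavaScript grammar used in the paper).
# Keys are nonterminals; every other symbol is a terminal token.
GRAMMAR = {
    "S": [["E"]],
    "E": [["num"], ["(", "E", ")"], ["E", "+", "E"]],
}


def sentences(grammar, start, max_len):
    """Yield token lists derivable from `start` with at most `max_len` tokens.

    Breadth-first search over leftmost derivations. The grammar has no empty
    productions, so a sentential form never shrinks and any form longer than
    `max_len` can be pruned safely.
    """
    first = (start,)
    queue = deque([first])
    seen = {first}
    while queue:
        form = queue.popleft()
        # Position of the leftmost nonterminal, or None if the form is a sentence.
        idx = next((i for i, sym in enumerate(form) if sym in grammar), None)
        if idx is None:
            yield list(form)
            continue
        for rhs in grammar[form[idx]]:
            new = form[:idx] + tuple(rhs) + form[idx + 1:]
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)


def solve(grammar, start, constraints, max_len=7):
    """Return a sentence whose token at each index i equals constraints[i], else None.

    `constraints` stands in for a token-level path constraint produced by
    symbolic execution (an assumption made for this sketch).
    """
    for sent in sentences(grammar, start, max_len):
        if all(i < len(sent) and sent[i] == tok for i, tok in constraints.items()):
            return sent
    return None
```

For example, `solve(GRAMMAR, "S", {0: "(", 2: ")"})` finds a parsable input that meets the constraint, while `solve(GRAMMAR, "S", {0: "+"})` returns `None` because no sentence of the toy grammar starts with `+`. Rejecting such constraints before a test is generated mirrors the abstract's point about avoiding dead-ends due to non-parsable inputs.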
Pages: 206-215
Page count: 10