Fuzzing JavaScript engines with a syntax-aware neural program model

Cited by: 0
Authors
Xu, Haoran [1]
Wang, Yongjun [1]
Jiang, Zhiyuan [1]
Fan, Shuhui [1]
Fu, Shaojing [1]
Xie, Peidai [1]
Affiliations
[1] Natl Univ Def Technol, Coll Comp, Changsha, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Fuzzing; JavaScript engines; Language model; Neural network; Grammar; Vocabulary
DOI
10.1016/j.cose.2024.103947
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Neural network language modeling has become a remarkable approach in the generation of test cases for fuzzing JavaScript engines. Fuzzers built upon neural language models offer several advantages. They obviate the need for manually developing code generation rules, enable the extraction of patterns from high-quality seed sets, and exhibit commendable portability. Nevertheless, existing works confront challenges in three key aspects: diminished language modeling performance attributable to extensive vocabularies, potential semantic errors within generated test cases, and the limitation of black-box fuzzing, which fails to leverage the internal feedback from the target engine. This paper proposes an innovative neural model-based grey-box fuzzing approach for JavaScript engines. We incorporate the context-free grammar of JavaScript into the neural language model to mitigate the challenges associated with extensive vocabularies, thereby enhancing the model's performance. Furthermore, to enhance the semantic validity of the generated test cases, we introduce semantic constraints into the mutation process. Notably, this work pioneers the integration of grey-box testing into a fuzzer built upon a neural language model, thereby enhancing the exploration of deep paths. Our prototype, PMFuzz, surpasses NNLM-based counterparts in both language modeling performance and test case generation capabilities. PMFuzz demonstrates a high level of competitiveness in exploring the software state space when compared to traditional coverage-guided grey-box fuzzers. In our evaluation, PMFuzz successfully identified 20 new defects within mainstream JS engines. Eight of them have been confirmed and fixed. Moreover, upon applying our method to C compilers, PMFuzz has revealed 11 new defects.
Pages: 14