Fuzzing JavaScript engines with a syntax-aware neural program model

Cited by: 0
Authors
Xu, Haoran [1]
Wang, Yongjun [1]
Jiang, Zhiyuan [1]
Fan, Shuhui [1]
Fu, Shaojing [1]
Xie, Peidai [1]
Affiliations
[1] Natl Univ Def Technol, Coll Comp, Changsha, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Fuzzing; JavaScript engines; Language model; Neural network; Grammar; Vocabulary;
DOI
10.1016/j.cose.2024.103947
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Neural network language modeling has become a prominent approach to generating test cases for fuzzing JavaScript engines. Fuzzers built upon neural language models offer several advantages: they obviate the need for manually developing code generation rules, enable the extraction of patterns from high-quality seed sets, and exhibit commendable portability. Nevertheless, existing works confront challenges in three key aspects: diminished language modeling performance attributable to extensive vocabularies, potential semantic errors within generated test cases, and the limitation of black-box fuzzing, which fails to leverage internal feedback from the target engine. This paper proposes a neural model-based grey-box fuzzing approach for JavaScript engines. We incorporate the context-free grammar of JavaScript into the neural language model to mitigate the challenges associated with extensive vocabularies, thereby enhancing the model's performance. Furthermore, to enhance the semantic validity of the generated test cases, we introduce semantic constraints into the mutation process. Notably, this work pioneers the integration of grey-box testing into a fuzzer built upon a neural language model, thereby enhancing the exploration of deep paths. Our prototype, PMFuzz, surpasses NNLM-based counterparts in both language modeling performance and test case generation capabilities. PMFuzz is highly competitive with traditional coverage-guided grey-box fuzzers in exploring the software state space. In our evaluation, PMFuzz identified 20 new defects in mainstream JS engines, eight of which have been confirmed and fixed. Moreover, when our method was applied to C compilers, PMFuzz revealed 11 new defects.
Pages: 14
Related papers
27 in total
  • [1] Evaluating seed selection for fuzzing JavaScript engines
    Wen, Ming
    Wang, Yongcong
    Xia, Yifan
    Jin, Hai
    [J]. EMPIRICAL SOFTWARE ENGINEERING, 2023, 28 (06)
  • [2] Fuzzing JavaScript Engines with Aspect-preserving Mutation
    Park, Soyeon
    Xu, Wen
    Yun, Insu
    Jang, Daehee
    Kim, Taesoo
    [J]. 2020 IEEE SYMPOSIUM ON SECURITY AND PRIVACY (SP 2020), 2020, : 1628 - 1642
  • [3] SoFi: Reflection-Augmented Fuzzing for JavaScript Engines
    He, Xiaoyu
    Xie, Xiaofei
    Li, Yuekang
    Sun, Jianwen
    Li, Feng
    Zou, Wei
    Liu, Yang
    Yu, Lei
    Zhou, Jianhua
    Shi, Wenchang
    Huo, Wei
    [J]. CCS '21: PROCEEDINGS OF THE 2021 ACM SIGSAC CONFERENCE ON COMPUTER AND COMMUNICATIONS SECURITY, 2021, : 2229 - 2242
  • [4] Syntax-Aware Neural Semantic Role Labeling
    Xia, Qingrong
    Li, Zhenghua
    Zhang, Min
    Zhang, Meishan
    Fu, Guohong
    Wang, Rui
    Si, Luo
    [J]. THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 7305 - 7313
  • [5] Syntax-aware entity representations for neural relation extraction
    He, Zhengqiu
    Chen, Wenliang
    Li, Zhenghua
    Zhang, Wei
    Shao, Hao
    Zhang, Min
    [J]. ARTIFICIAL INTELLIGENCE, 2019, 275 : 602 - 617
  • [6] Syntax-aware Transformer Encoder for Neural Machine Translation
    Duan, Sufeng
    Zhao, Hai
    Zhou, Junru
    Wang, Rui
    [J]. PROCEEDINGS OF THE 2019 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP), 2019, : 396 - 401
  • [7] Syntax-aware Neural Semantic Role Labeling with Supertags
    Kasai, Jungo
    Friedman, Dan
    Frank, Robert
    Radev, Dragomir
    Rambow, Owen
    [J]. 2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, 2019, : 701 - 709
  • [8] Syntax-Aware Data Augmentation for Neural Machine Translation
    Duan, Sufeng
    Zhao, Hai
    Zhang, Dongdong
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, 31 : 2988 - 2999
  • [9] A lightweight and high-precision approach for bulky JavaScript engines fuzzing
    Zhou, Lianpei
    Xiao, Xi
    Hu, Guangwu
    Li, Hao
    Wu, Xiangbo
    Zhou, Tao
    [J]. 2023 IEEE 22ND INTERNATIONAL CONFERENCE ON TRUST, SECURITY AND PRIVACY IN COMPUTING AND COMMUNICATIONS, TRUSTCOM, BIGDATASE, CSE, EUC, ISCI 2023, 2024, : 982 - 989
  • [10] Automated Conformance Testing for JavaScript Engines via Deep Compiler Fuzzing
    Ye, Guixin
    Tang, Zhanyong
    Tan, Shin Hwei
    Huang, Songfang
    Fang, Dingyi
    Sun, Xiaoyang
    Bian, Lizhong
    Wang, Haibo
    Wang, Zheng
    [J]. PROCEEDINGS OF THE 42ND ACM SIGPLAN INTERNATIONAL CONFERENCE ON PROGRAMMING LANGUAGE DESIGN AND IMPLEMENTATION (PLDI '21), 2021, : 435 - 450