MSFuzz: Augmenting Protocol Fuzzing with Message Syntax Comprehension via Large Language Models

Cited by: 1
Authors
Cheng, Mingjie [1 ,2 ]
Zhu, Kailong [1 ,2 ]
Chen, Yuanchao [1 ,2 ]
Yang, Guozheng [1 ,2 ]
Lu, Yuliang [1 ,2 ]
Lu, Canju [1 ,2 ]
Affiliations
[1] Natl Univ Def Technol, Coll Elect Engn, Hefei 230037, Peoples R China
[2] Anhui Prov Key Lab Cyberspace Secur Situat Awarene, Hefei 230037, Peoples R China
Keywords
fuzzing; syntax-aware; protocol implementations; large language models; fuzzer
DOI
10.3390/electronics13132632
CLC classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Network protocol implementations, as integral components of information communication, are critically important for security. Owing to its efficiency and automation, fuzzing has become a popular method for detecting security flaws in protocol implementations. However, existing protocol-fuzzing techniques face a critical problem: generating high-quality inputs. To address this problem, we propose MSFuzz, a protocol-fuzzing method with message syntax comprehension. The core observation behind MSFuzz is that the source code of a protocol implementation contains detailed and comprehensive knowledge of the message syntax. Specifically, we leveraged the code-understanding capabilities of large language models to extract the message syntax from the source code and construct message syntax trees. Using these syntax trees, we then expanded the initial seed corpus and designed a novel syntax-aware mutation strategy to guide the fuzzing. To evaluate MSFuzz, we compared it with the state-of-the-art (SOTA) protocol fuzzers AFLNET and CHATAFL. Experimental results showed that, compared with AFLNET and CHATAFL, MSFuzz achieved average improvements of 22.53% and 10.04% in the number of states, 60.62% and 19.52% in the number of state transitions, and 29.30% and 23.13% in branch coverage. MSFuzz also discovered more vulnerabilities than the SOTA fuzzers.
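The syntax-aware mutation the abstract describes can be illustrated with a minimal sketch. The field names, the flat tree layout, and the RTSP-like request line below are illustrative assumptions, not MSFuzz's actual data structures; the paper derives its syntax trees from source code via an LLM, which is not reproduced here. The idea shown is only that mutation respects per-field syntax (enums stay within legal values, constants are untouched), so mutated messages remain well-formed.

```python
import random

# Hypothetical message syntax tree for an RTSP-like request line.
# Field names and types are assumptions made for this sketch.
SYNTAX_TREE = {
    "method":  {"type": "enum",   "values": ["DESCRIBE", "SETUP", "PLAY"]},
    "uri":     {"type": "string", "max_len": 64},
    "version": {"type": "const",  "value": "RTSP/1.0"},
}

def mutate_field(spec, rng):
    """Mutate one field according to its syntax node, keeping it legal."""
    if spec["type"] == "enum":
        return rng.choice(spec["values"])       # swap to another legal value
    if spec["type"] == "string":
        length = rng.randint(1, spec["max_len"])
        return "".join(rng.choice("abc/") for _ in range(length))
    return spec["value"]                        # constants are never mutated

def syntax_aware_mutate(seed_fields, rng=None):
    """Mutate one mutable field of a seed; field order and structure are preserved."""
    rng = rng or random.Random()
    mutable = [name for name, spec in SYNTAX_TREE.items() if spec["type"] != "const"]
    target = rng.choice(mutable)
    fields = dict(seed_fields)
    fields[target] = mutate_field(SYNTAX_TREE[target], rng)
    # Reserialize in the order dictated by the syntax tree.
    return " ".join(fields[name] for name in SYNTAX_TREE)

seed = {"method": "DESCRIBE", "uri": "rtsp://host/stream", "version": "RTSP/1.0"}
print(syntax_aware_mutate(seed, random.Random(0)))
```

Unlike byte-level mutation, every output here still parses as `<method> <uri> <version>`, which is what lets a syntax-aware fuzzer drive the server into deeper protocol states instead of tripping early parse errors.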
Pages: 19