A Semantics Aware Approach to Automated Reverse Engineering Unknown Protocols

Cited by: 0
Authors
Wang, Yipeng [1 ,3 ]
Yun, Xiaochun [4 ]
Shafiq, M. Zubair [2 ]
Wang, Liyan [2 ]
Liu, Alex X. [2 ]
Zhang, Zhibin [1 ]
Yao, Danfeng [5 ]
Zhang, Yongzheng [6 ]
Guo, Li [6 ]
Affiliations
[1] Chinese Acad Sci, Inst Comp Technol, Beijing, Peoples R China
[2] Michigan State Univ, Dept Comp Sci & Engn, E Lansing, MI 48824 USA
[3] Chinese Acad Sci, Grad Sch, Beijing, Peoples R China
[4] Coordinat Ctr China, Natl Comp Network Emergency Response Tech Team, Beijing, Peoples R China
[5] Virginia Tech, Dept Comp Sci, Blacksburg, VA 24061 USA
[6] Chinese Acad Sci, Inst Informat Engn, Beijing, Peoples R China
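Source
2012 20TH IEEE INTERNATIONAL CONFERENCE ON NETWORK PROTOCOLS (ICNP) | 2012
Funding
National Natural Science Foundation of China;
Keywords
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract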
Extracting the protocol message format specifications of unknown applications from network traces is important for a variety of applications such as application protocol parsing, vulnerability discovery, and system integration. In this paper, we propose ProDecoder, a network-trace-based protocol message format inference system that exploits the semantics of protocol messages without requiring the executable code of application protocols. ProDecoder is based on the key insight that the n-grams of protocol traces exhibit a highly skewed frequency distribution that can be leveraged for accurate protocol message format inference. In ProDecoder, we first discover the latent relationships among n-grams, grouping protocol messages with the same semantics, and then infer message formats by keyword-based clustering and cluster sequence alignment. We implemented and evaluated ProDecoder to infer message format specifications of SMB (a binary protocol) and SMTP (a textual protocol). Our experimental results show that ProDecoder accurately parses and infers the SMB protocol with 100% precision and recall. For SMTP, ProDecoder achieves approximately 95% precision and recall.
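The key insight in the abstract, that n-grams in protocol traces have a highly skewed frequency distribution, can be illustrated with a minimal sketch. This is not ProDecoder's implementation: the function names `byte_ngrams` and `ngram_frequencies`, the choice of n = 4, and the toy SMTP-like payloads are all assumptions made for illustration. The sketch only counts byte n-grams across messages so that frequently recurring candidates (likely protocol keywords) stand out from rare ones (likely user data).

```python
from collections import Counter


def byte_ngrams(message: bytes, n: int):
    """Yield every contiguous n-byte window of a message (hypothetical helper)."""
    for i in range(len(message) - n + 1):
        yield message[i:i + n]


def ngram_frequencies(messages, n=4):
    """Count how often each byte n-gram occurs across all messages."""
    counts = Counter()
    for msg in messages:
        counts.update(byte_ngrams(msg, n))
    return counts


# Toy SMTP-like payloads, invented for illustration only.
toy_trace = [
    b"MAIL FROM:<a@example.com>",
    b"MAIL FROM:<b@example.org>",
    b"RCPT TO:<c@example.net>",
]

counts = ngram_frequencies(toy_trace, n=4)

# n-grams taken from command keywords recur across messages,
# while n-grams taken from addresses tend to appear only once.
for gram, freq in counts.most_common(5):
    print(gram, freq)
```

On a real trace, the same counting step would be followed by the grouping, keyword-based clustering, and sequence alignment stages named in the abstract, none of which this sketch attempts.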
Pages: 10
Related papers
50 in total
  • [1] AUTOMATED REVERSE ENGINEERING OF CAN PROTOCOLS
    Weiss, N.
    Pozzobon, E.
    Mottok, J.
    Matousek, V
    NEURAL NETWORK WORLD, 2021, 31 (04) : 279 - 295
  • [2] A Semantics-Aware Approach to Automated Claim Verification
    Figueras, Blanca Calvo
    Cuadros, Montse
    Agerri, Rodrigo
    PROCEEDINGS OF THE FIFTH FACT EXTRACTION AND VERIFICATION WORKSHOP (FEVER 2022), 2022, : 37 - 48
  • [3] Toward Automated Field Semantics Inference for Binary Protocol Reverse Engineering
    Zhan, Mengqi
    Li, Yang
    Li, Bo
    Zhang, Jinchao
    Li, Chuanrong
    Wang, Weiping
    IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2024, 19 : 764 - 776
  • [4] A Semantics-Aware Approach to the Automated Network Protocol Identification
    Yun, Xiaochun
    Wang, Yipeng
    Zhang, Yongzheng
    Zhou, Yu
    IEEE-ACM TRANSACTIONS ON NETWORKING, 2016, 24 (01) : 583 - 595
  • [5] Network Message Field Type Clustering for Reverse Engineering of Unknown Binary Protocols
    Kleber, Stephan
    Kargl, Frank
    Stute, Ilan
    Hollick, Matthias
    52ND ANNUAL IEEE/IFIP INTERNATIONAL CONFERENCE ON DEPENDABLE SYSTEMS AND NETWORKS WORKSHOP VOLUME (DSN-W 2022), 2022, : 80 - 87
  • [6] A formal automated approach for reverse engineering programs with pointers
    Gannod, GC
    Cheng, BHC
    AUTOMATED SOFTWARE ENGINEERING, 12TH IEEE INTERNATIONAL CONFERENCE, PROCEEDINGS, 1997, : 219 - 226
  • [7] A Data-driven Approach for Reverse Engineering Electric Power Protocols
    Ouyang Liu
    Bin Zheng
    Wei Sun
    Feipeng Luo
    Zhonghe Hong
    Xiaowei Wang
    Bo Li
    Journal of Signal Processing Systems, 2021, 93 : 769 - 777
  • [8] A Data-driven Approach for Reverse Engineering Electric Power Protocols
    Liu, Ouyang
    Zheng, Bin
    Sun, Wei
    Luo, Feipeng
    Hong, Zhonghe
    Wang, Xiaowei
    Li, Bo
    JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2021, 93 (07): : 769 - 777
  • [9] An automated approach for supporting software reuse via reverse engineering
    Gannod, GC
    Chen, YH
    Cheng, BHC
    13TH IEEE INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING, PROCEEDINGS, 1998, : 94 - 103
  • [10] Automated analysis of scientific and engineering semantics
    Stewart, MEM
    9TH INTERNATIONAL WORKSHOP ON PROGRAM COMPREHENSION, PROCEEDINGS, 2001, : 113 - 114