Recovery of high-level intermediate representations of algorithms from binary code

Authors
Bugerya, Alexander Borisovich [1]
Kulagin, Ivan Ivanovich [2]
Padaryan, Vartan Andronikovich [2, 3]
Solovev, Mikhail Aleksandrovich [2, 3]
Tikhonov, Andrei Yur'evich [2]
Affiliations
[1] Russian Acad Sci, Keldysh Inst Appl Math, Moscow, Russia
[2] Russian Acad Sci, Ivannikov Inst Syst Programming, Moscow, Russia
[3] Lomonosov Moscow State Univ, Moscow, Russia
Keywords
flowcharts; intermediate representation; binary code analysis; data flow analysis
DOI
10.1109/IVMEM.2019.00015
Chinese Library Classification number
TP301 [Theory, methods]
Subject classification code
081202
Abstract
One task in the security analysis of binary code is detecting undocumented features in software. This task is hard to automate and requires the participation of a cybersecurity expert. The way the algorithm under analysis is represented largely determines both the effort the analysis takes and the quality of its results. Existing intermediate representations and languages are designed for software that either performs optimizing transformations or analyzes binary code. Such representations and languages are unsuitable for manual data flow analysis. This paper proposes a high-level, hierarchical, flowchart-based representation of a program's algorithm, together with an algorithm for constructing it. The representation is based on a hypergraph and supports both automatic and manual data flow analysis at different levels of detail. The hypergraph nodes represent functions. Each node contains a set of other nodes called fragments; a fragment is a linear sequence of instructions that contains no call or ret instructions. Edges represent data flows between nodes and correspond to memory buffers and registers. In the future, this representation can be used to implement automatic analysis algorithms. The paper also proposes an approach to improving the quality of the resulting algorithm representation by grouping single data flows into one flow that connects logical algorithm modules.
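
The structure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' construction algorithm, which the abstract does not spell out. All names here (Fragment, FunctionNode, DataFlow, split_into_fragments, group_flows) are hypothetical, instructions are assumed to be plain assembly-like strings, and grouping flows by their pair of endpoint nodes is an assumed stand-in for the paper's grouping by logical algorithm modules.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Fragment:
    """Linear run of instructions with no call or ret inside it."""
    instructions: list


@dataclass
class FunctionNode:
    """Node for one function; it contains its fragments as nested nodes."""
    name: str
    fragments: list = field(default_factory=list)


@dataclass(frozen=True)
class DataFlow:
    """Edge: data passed from src to dst through a register or memory buffer."""
    src: str
    dst: str
    carrier: str  # hypothetical label, e.g. a register name or buffer id


def split_into_fragments(instructions):
    """Cut an instruction list at every call/ret.

    The call/ret instruction itself is left out of the fragments, since a
    fragment must not contain either of them.
    """
    fragments, current = [], []
    for ins in instructions:
        words = ins.split()
        if not words:
            continue  # skip blank lines
        if words[0].lower() in ("call", "ret"):
            if current:
                fragments.append(Fragment(current))
            current = []
        else:
            current.append(ins)
    if current:
        fragments.append(Fragment(current))
    return fragments


def group_flows(flows):
    """Merge single flows between the same pair of nodes into one flow.

    Assumption: the endpoint pair is used as the grouping key.
    """
    grouped = defaultdict(set)
    for flow in flows:
        grouped[(flow.src, flow.dst)].add(flow.carrier)
    return {pair: sorted(carriers) for pair, carriers in grouped.items()}
```

For example, splitting `["mov eax, 1", "push eax", "call parse", "mov ebx, eax", "ret"]` yields two fragments, one on each side of the call, and two flows from `main` to `parse` carried by `eax` and a buffer collapse into a single grouped flow that lists both carriers.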
Pages: 57-63
Number of pages: 7