Accelerating Regular Expression Matching Over Compressed HTTP

Authors
Becchi, Michela [1]
Bremler-Barr, Anat [2]
Hay, David [3]
Kochba, Omer [2]
Koral, Yaron [3]
Affiliations
[1] Univ Missouri, Columbia, MO 65211 USA
[2] Interdisciplinary Ctr Herzliya, Herzliyya, Israel
[3] Hebrew Univ Jerusalem, IL-91905 Jerusalem, Israel
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
This paper focuses on regular expression matching over compressed traffic. The need for such matching arises from two independent trends. First, the volume and share of compressed HTTP traffic are constantly increasing. Second, due to their superior expressibility, current Deep Packet Inspection engines use regular expressions more and more frequently. We present an algorithmic framework to accelerate such matching, taking advantage of information gathered when the traffic was initially compressed. HTTP compression is typically performed through the GZIP protocol, which uses back-references to repeated strings. Our algorithm is based on calculating (for every byte) the minimum number of (previous) bytes that can be part of a future regular expression match. When inspecting a back-reference, only these bytes should be taken into account, thus enabling one to skip repeated strings almost entirely without missing a match. We show that our generic framework works with either NFA-based or DFA-based implementations and gains performance boosts of more than 70%. Moreover, it can be readily adapted to most existing regular expression matching algorithms, which are usually based on NFA, DFA, or combinations of the two. Finally, we discuss other applications in which calculating the number of relevant bytes becomes handy, even when the traffic is not compressed.
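To make the notion of a back-reference concrete, the following is a minimal sketch of how an LZ77-style token stream (the kind of stream DEFLATE, and therefore GZIP, encodes) expands into plain bytes. It is not the authors' algorithm and does not parse the real GZIP bit format: the token representation and the function name `resolve_back_references` are assumptions made purely for illustration.

```python
from typing import List, Tuple, Union

# Hypothetical token model (an assumption for this sketch, not the GZIP bit format):
# a token is either a literal byte value or a (distance, length) back-reference.
Token = Union[int, Tuple[int, int]]


def resolve_back_references(tokens: List[Token]) -> bytes:
    """Expand literals and (distance, length) back-references into plain bytes."""
    out = bytearray()
    for tok in tokens:
        if isinstance(tok, int):
            # Literal byte: emit it unchanged.
            out.append(tok)
        else:
            distance, length = tok
            if distance <= 0 or distance > len(out):
                raise ValueError("back-reference points outside the decoded data")
            start = len(out) - distance
            # Copy one byte at a time: the referenced span may overlap
            # the bytes that this same back-reference is producing.
            for i in range(length):
                out.append(out[start + i])
    return bytes(out)


if __name__ == "__main__":
    # Three literals followed by a back-reference that repeats them twice.
    tokens: List[Token] = list(b"abc") + [(3, 6)]
    print(resolve_back_references(tokens))
```

A matcher that simply scans the expanded output inspects every copied byte a second time. The framework described in the abstract aims to avoid most of that repeated work by tracking, per byte, how many preceding bytes can still contribute to a future match.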
Pages: 9