Efficient POSIX submatch extraction on nondeterministic finite automata

Cited by: 3
Authors
Borsotti, Angelo [1]
Trofimovich, Ulya [2]
Affiliations
[1] Polytech Univ Milan, Dept Elect Informat & Bioengn, Milan, Italy
[2] Belarusian State Univ, Dept Discrete Math & Algorithm, Minsk, Belarus
Source
SOFTWARE-PRACTICE & EXPERIENCE | 2021 / Volume 51 / Issue 02
Keywords
finite-state automata; parsing; POSIX; regular expressions; submatch extraction;
DOI
10.1002/spe.2881
Chinese Library Classification
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
In this paper we study the performance of POSIX submatch extraction algorithms based on nondeterministic finite automata (NFA). We propose an algorithm that combines Laurikari tagged NFA and extended Okui-Suzuki disambiguation. The algorithm works in worst-case $O(n m^2 t)$ time and $O(m^2)$ space (including preprocessing), where $n$ is the length of the input, $m$ is the size of the regular expression with bounded repetition expanded, and $t$ is the number of capturing groups and subexpressions that contain them. On real-world benchmarks our algorithm performs close to the $O(n m t)$ complexity of leftmost-greedy matching, although on artificial benchmarks it can be significantly slower. We propose a lazy version of the algorithm that runs much faster, but requires $O(n m^2)$ space. We show that the Kuklewicz algorithm is slower in practice, and the backward matching algorithm proposed by Cox is incorrect.
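The abstract contrasts POSIX submatch extraction with leftmost-greedy matching. The sketch below only illustrates how the two disambiguation policies can assign different submatches to the same input; it is not the algorithm from the paper. It relies on Python's `re` module, which follows backtracking leftmost-greedy semantics, and on a brute-force helper `posix_like` (a hypothetical name) that approximates the POSIX rule for a top-level concatenation of exactly three groups by preferring the longest match for earlier groups. The example pattern and the helper names are assumptions introduced for illustration.

```python
import re
from itertools import combinations_with_replacement

# Hypothetical example: three capturing groups in sequence.
PARTS = [r"a|ab", r"c|bcd", r"d*"]
TEXT = "abcd"


def leftmost_greedy(parts, text):
    """Submatches as reported by Python's backtracking matcher."""
    pattern = "".join(f"({p})" for p in parts)
    m = re.fullmatch(pattern, text)
    return m.groups() if m else None


def posix_like(parts, text):
    """Brute-force approximation of POSIX disambiguation for three groups.

    Enumerates every split of the text into three pieces, keeps the splits
    where each piece fully matches its group, and prefers the split whose
    earlier groups are longest. This is a simplification for illustration,
    not a general POSIX matcher.
    """
    best = None
    n = len(text)
    for i, j in combinations_with_replacement(range(n + 1), 2):
        pieces = (text[:i], text[i:j], text[j:])
        if all(re.fullmatch(p, s) for p, s in zip(parts, pieces)):
            key = (i, j - i)  # lengths of the first and second group
            if best is None or key > best[0]:
                best = (key, pieces)
    return best[1] if best else None


if __name__ == "__main__":
    print("leftmost-greedy:", leftmost_greedy(PARTS, TEXT))
    print("POSIX-like:     ", posix_like(PARTS, TEXT))
```

Both policies match the whole input `abcd`, but they disagree on where the group boundaries fall: the backtracking matcher commits to the first alternative `a` and lets the second group absorb `bcd`, while the POSIX-style preference gives the first group its longest possible match `ab`.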
Pages: 159-192
Page count: 34