Efficient POSIX submatch extraction on nondeterministic finite automata

Cited by: 3

Authors:

Borsotti, Angelo [1]

Trofimovich, Ulya [2]

Affiliations:

[1] Polytech Univ Milan, Dept Elect Informat & Bioengn, Milan, Italy

[2] Belarusian State Univ, Dept Discrete Math & Algorithm, Minsk, Belarus

Source:

SOFTWARE-PRACTICE & EXPERIENCE | 2021 / Vol. 51 / Issue 02

Keywords:

finite-state automata; parsing; POSIX; regular expressions; submatch extraction;

DOI:

10.1002/spe.2881

Chinese Library Classification number:

TP31 [Computer Software];

Discipline classification code:

081202; 0835;

Abstract:

In this paper we study the performance of POSIX submatch extraction algorithms based on nondeterministic finite automata (NFA). We propose an algorithm that combines Laurikari tagged NFA and extended Okui-Suzuki disambiguation. The algorithm works in worst-case O(n m^2 t) time and O(m^2) space (including preprocessing), where n is the length of input, m is the size of the regular expression with bounded repetition expanded, and t is the number of capturing groups and subexpressions that contain them. On real-world benchmarks our algorithm performs close to the O(n m t) complexity of leftmost-greedy matching, although on artificial benchmarks it can be significantly slower. We propose a lazy version of the algorithm that runs much faster, but requires O(n m^2) space. We show that the Kuklewicz algorithm is slower in practice, and the backward matching algorithm proposed by Cox is incorrect.
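The distinction between leftmost-greedy and POSIX submatch extraction that the abstract relies on can be illustrated with a small sketch. This is not the algorithm from the paper: the helper names `greedy_groups` and `posix_like_groups` are hypothetical, the second one is an exponential brute force, and it only handles a regular expression that is a plain concatenation of top-level groups. It assumes that, for such an expression, the POSIX choice among full-length parses is the one whose group lengths are lexicographically largest from left to right.

```python
import re

def greedy_groups(pattern, text):
    """Submatches as reported by Python's re module (leftmost-greedy, Perl-style)."""
    m = re.fullmatch(pattern, text)
    return m.groups() if m else None

def posix_like_groups(parts, text):
    """Brute-force illustration for a concatenation of top-level groups.

    parts: list of sub-patterns, one per capturing group (hypothetical input format).
    Enumerates every way to split `text` among the groups and keeps the split
    whose group lengths are lexicographically largest, left to right.
    """
    best = None

    def walk(i, pos, acc):
        nonlocal best
        if i == len(parts):
            if pos == len(text):  # only parses that consume the whole input
                key = tuple(len(s) for s in acc)
                if best is None or key > best[0]:
                    best = (key, tuple(acc))
            return
        for end in range(pos, len(text) + 1):
            piece = text[pos:end]
            if re.fullmatch(parts[i], piece):
                walk(i + 1, end, acc + [piece])

    walk(0, 0, [])
    return best[1] if best else None

print(greedy_groups(r"(a|ab)(c|bcd)(d*)", "abcd"))          # ('a', 'bcd', '')
print(posix_like_groups(["a|ab", "c|bcd", "d*"], "abcd"))   # ('ab', 'c', 'd')
```

Both calls match the whole input `abcd`, but they disagree on where the group boundaries fall. The brute force explores every split, which is the kind of cost that an NFA-based approach such as the one described in the abstract is meant to avoid.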

Pages: 159-192

Page count: 34

