Evaluating Linear XPath Expressions by Pattern-Matching Automata

Cited by: 0
Authors
Silvasti, Panu [1]
Sippu, Seppo [2]
Soisalon-Soininen, Eljas [1]
Affiliations
[1] Helsinki Univ Technol, FIN-02150 Espoo, Finland
[2] Univ Helsinki, FIN-00014 Helsinki, Finland
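Funding
Academy of Finland
Keywords
filtering of streams of XML documents; linear XPath expressions
DOI
Not available
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract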
We consider the problem of efficiently evaluating a large number of XPath expressions, especially in the case when they define subscriber profiles for filtering of XML documents. For each document in an XML document stream, the task is to determine those profiles that match the document. In this article we present a new general method for filtering with profiles expressed by linear XPath expressions with child operators (/), descendant operators (//), and wildcards (*). This new filtering algorithm is based on a backtracking deterministic finite automaton derived from the classic Aho-Corasick pattern-matching automaton. This automaton has a size linear in the sum of the sizes of the XPath filters, and the worst-case time bound of the algorithm is much less than the time bound of the simulation of linear-size nondeterministic automata. Our new algorithm has a predecessor that can handle child and descendant operators but not wildcards, and that has been shown to be extremely efficient when a document-type definition (DTD) has been used to prune out all the wildcards and most of the descendant operators. In some cases, however, such as when the DTD is highly recursive, it may not be possible to prune out all wildcards without producing too large a set of filters. It is then important to have the full generality of an evaluation algorithm, as presented in this article, that can also handle wildcards.
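To make the filtering task concrete, the following is a minimal sketch of what it means for a linear XPath profile with /, //, and * to match a document. It is not the backtracking automaton described in the abstract. It is a naive per-profile state-set simulation, the kind of nondeterministic baseline the abstract compares against. The function names (`parse_linear_xpath`, `matches_path`, `element_paths`, `matching_profiles`) are hypothetical and chosen only for this illustration, and a profile is assumed to match a document when it selects at least one element.

```python
import xml.etree.ElementTree as ET


def parse_linear_xpath(expr):
    """Split a linear XPath such as '/a//b/*' into (axis, name_test) steps."""
    steps = []
    i = 0
    while i < len(expr):
        if expr.startswith("//", i):
            axis, i = "descendant", i + 2
        elif expr.startswith("/", i):
            axis, i = "child", i + 1
        else:
            raise ValueError(f"expected '/' or '//' at position {i} in {expr!r}")
        j = i
        while j < len(expr) and expr[j] != "/":
            j += 1
        if j == i:
            raise ValueError(f"missing name test at position {i} in {expr!r}")
        steps.append((axis, expr[i:j]))
        i = j
    return steps


def matches_path(steps, path):
    """Return True if the steps select the last element of a root-to-node path."""
    states = {0}  # each state is the number of steps matched so far
    for name in path:
        next_states = set()
        for s in states:
            if s == len(steps):
                continue  # all steps already used; deeper elements cannot match
            axis, test = steps[s]
            if test == "*" or test == name:
                next_states.add(s + 1)  # this element satisfies step s
            if axis == "descendant":
                next_states.add(s)  # '//' may skip over this element
        states = next_states
    return len(steps) in states


def element_paths(element, prefix=()):
    """Yield the tuple of tag names from the root to every element."""
    path = prefix + (element.tag,)
    yield path
    for child in element:
        yield from element_paths(child, path)


def matching_profiles(profiles, xml_text):
    """Return the set of profiles that select at least one element."""
    paths = list(element_paths(ET.fromstring(xml_text)))
    matched = set()
    for profile in profiles:
        steps = parse_linear_xpath(profile)
        if any(matches_path(steps, path) for path in paths):
            matched.add(profile)
    return matched


if __name__ == "__main__":
    document = "<a><b><c/></b><d/></a>"
    profiles = ["/a/b/c", "//c", "/a/*", "/a/c", "//b//d", "/*/*/c"]
    print(sorted(matching_profiles(profiles, document)))
```

This sketch evaluates every profile separately against every element path, so its cost grows with the number of profiles times the document size. The abstract's point is that a single automaton built over all filters avoids that per-profile repetition.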
Pages: 833-851
Number of pages: 19