Regular expression pattern matching for XML

Cited by: 18
Authors
Hosoya, H [1]
Pierce, B [1]
Affiliation
[1] Univ Penn, Dept Comp & Informat Sci, Philadelphia, PA 19104 USA
DOI
10.1145/373243.360209
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
We propose regular expression pattern matching as a core feature for programming languages for manipulating XML (and similar tree-structured data formats). We extend conventional pattern-matching facilities with regular expression operators such as repetition (*), alternation (|), etc., that can match arbitrarily long sequences of subtrees, allowing a compact pattern to extract data from the middle of a complex sequence. We show how to check standard notions of exhaustiveness and redundancy for these patterns. Regular expression patterns are intended to be used in languages whose type systems are also based on the regular expression types. To avoid excessive type annotations, we develop a type inference scheme that propagates type constraints to pattern variables from the surrounding context. The type inference algorithm translates types and patterns into regular tree automata and then works in terms of standard closure operations (union, intersection, and difference) on tree automata. The main technical challenge is dealing with the interaction of repetition and alternation patterns with the first-match policy, which gives rise to subtleties concerning both the termination and the precision of the analysis. We address these issues by introducing a data structure representing closure operations lazily.
Pages: 67-80
Page count: 14
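
Illustration
To make the abstract's notions of repetition, alternation, and the first-match policy concrete, here is a minimal sketch in Python. It is not the paper's algorithm: the paper works with regular tree automata, whereas this is a small backtracking interpreter. It assumes that repetition is greedy and that alternation prefers its leftmost option. All names (`Node`, `Elem`, `Any`, `Seq`, `Alt`, `Star`, `Bind`, `matches`, `first_match`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    """A labelled tree node; `children` is a tuple of Node."""
    label: str
    children: tuple = ()


# Pattern constructors (hypothetical names, for illustration only).
@dataclass(frozen=True)
class Any:            # matches any single node
    pass

@dataclass(frozen=True)
class Elem:           # matches one node with this label
    label: str
    body: Optional[object] = None   # pattern for the children; None = any content

@dataclass(frozen=True)
class Seq:            # concatenation: p1, p2, ...
    parts: tuple

@dataclass(frozen=True)
class Alt:            # alternation: p1 | p2 | ...
    options: tuple

@dataclass(frozen=True)
class Star:           # repetition: p*
    inner: object

@dataclass(frozen=True)
class Bind:           # bind the matched subsequence to a variable
    name: str
    inner: object


def matches(pat, seq, i, env):
    """Yield (end_index, bindings) for matches of `pat` starting at seq[i],
    in priority order: leftmost alternative first, longest repetition first."""
    if isinstance(pat, Any):
        if i < len(seq):
            yield i + 1, env
    elif isinstance(pat, Elem):
        if i < len(seq) and seq[i].label == pat.label:
            if pat.body is None:
                yield i + 1, env
            else:
                kids = seq[i].children
                for j, e in matches(pat.body, kids, 0, env):
                    if j == len(kids):      # body must cover all children
                        yield i + 1, e
    elif isinstance(pat, Seq):
        if not pat.parts:
            yield i, env
        else:
            rest = Seq(pat.parts[1:])
            for j, e in matches(pat.parts[0], seq, i, env):
                yield from matches(rest, seq, j, e)
    elif isinstance(pat, Alt):
        for option in pat.options:          # earlier options take priority
            yield from matches(option, seq, i, env)
    elif isinstance(pat, Star):
        for j, e in matches(pat.inner, seq, i, env):
            if j > i:                       # skip empty iterations to ensure termination
                yield from matches(pat, seq, j, e)
        yield i, env                        # zero further iterations, tried last
    elif isinstance(pat, Bind):
        for j, e in matches(pat.inner, seq, i, env):
            yield j, {**e, pat.name: seq[i:j]}
    else:
        raise TypeError(f"unknown pattern: {pat!r}")


def first_match(pat, seq):
    """Return the bindings of the highest-priority match of the whole sequence, or None."""
    for j, env in matches(pat, seq, 0, {}):
        if j == len(seq):
            return env
    return None


# Extract data from the middle of a sequence: name, tel*, then anything.
entry = (Node("name"), Node("tel"), Node("tel"), Node("email"))
pattern = Seq((Elem("name"), Bind("tels", Star(Elem("tel"))), Bind("rest", Star(Any()))))
env = first_match(pattern, entry)
print([n.label for n in env["tels"]], [n.label for n in env["rest"]])
```

In this sketch both `Elem("tel")` and `Any()` can match a `tel` node, so a pattern such as `Alt((Bind("t", Elem("tel")), Bind("other", Any())))` is ambiguous. The assumed first-match policy resolves the ambiguity by binding `t` and leaving `other` unbound.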