Efficient filtering of XML documents with XPath expressions

Cited by: 86
Authors
Chan, CY [1]
Felber, P [1]
Garofalakis, M [1]
Rastogi, R [1]
Affiliation
[1] Bell Labs, Lucent Technol, Murray Hill, NJ 07974 USA
Source
VLDB JOURNAL | 2002 / Vol. 11 / Issue 04
Keywords
data dissemination; document filtering; index structure; XML; XPath;
DOI
10.1007/s00778-002-0077-6
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;
Abstract
The publish/subscribe paradigm is a popular model for allowing publishers (i.e., data generators) to selectively disseminate data to a large number of widely dispersed subscribers (i.e., data consumers) who have registered their interest in specific information items. Early publish/subscribe systems have typically relied on simple subscription mechanisms, such as keyword or "bag of words" matching, or simple comparison predicates on attribute values. The emergence of XML as a standard for information exchange on the Internet has led to an increased interest in using more expressive subscription mechanisms (e.g., based on XPath expressions) that exploit both the structure and the content of published XML documents. Given the increased complexity of these new data-filtering mechanisms, the problem of effectively identifying the subscription profiles that match an incoming XML document poses a difficult and important research challenge. In this paper, we propose a novel index structure, termed XTrie, that supports the efficient filtering of XML documents based on XPath expressions. Our XTrie index structure offers several novel features that, we believe, make it especially attractive for large-scale publish/subscribe systems. First, XTrie is designed to support effective filtering based on complex XPath expressions (as opposed to simple, single-path specifications). Second, our XTrie structure and algorithms are designed to support both ordered and unordered matching of XML data. Third, by indexing on sequences of elements organized in a trie structure and using a sophisticated matching algorithm, XTrie is able to both reduce the number of unnecessary index probes and avoid redundant matchings, thereby providing extremely efficient filtering. Our experimental results over a wide range of XML document and XPath expression workloads demonstrate that our XTrie index structure outperforms earlier approaches by wide margins.
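To make the general idea of trie-based subscription filtering concrete, the following is a minimal sketch and not the XTrie structure from the paper. It assumes subscriptions are restricted to absolute, child-axis-only paths such as `/catalog/book`, with no predicates, wildcards, or descendant steps, and it indexes them in a plain trie keyed on element names. The names `TrieNode`, `build_index`, and `match` are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET


class TrieNode:
    """One step of a path; hypothetical helper, not part of XTrie."""

    def __init__(self):
        self.children = {}        # element name -> TrieNode
        self.subscribers = set()  # ids of subscriptions that end here


def build_index(subscriptions):
    """Build a trie from {subscription_id: '/a/b/c'} (child axis only)."""
    root = TrieNode()
    for sub_id, path in subscriptions.items():
        node = root
        for step in path.strip("/").split("/"):
            node = node.children.setdefault(step, TrieNode())
        node.subscribers.add(sub_id)
    return root


def match(index, xml_text):
    """Return the ids of all subscriptions matched by the document."""
    matched = set()

    def walk(elem, node):
        child = node.children.get(elem.tag)
        if child is None:
            return  # no subscription shares this prefix, so skip the subtree
        matched.update(child.subscribers)
        for sub_elem in elem:
            walk(sub_elem, child)

    walk(ET.fromstring(xml_text), index)
    return matched


subscriptions = {
    "s1": "/catalog/book/title",
    "s2": "/catalog/cd",
    "s3": "/catalog/book",
}
index = build_index(subscriptions)
document = "<catalog><book><title>X</title></book></catalog>"
result = match(index, document)
```

Because subscriptions that share a prefix share trie nodes, one traversal of the document checks all of them at once, and a subtree is skipped as soon as no subscription can match it. Supporting tree-shaped XPath expressions and ordered or unordered matching, as the abstract describes, requires considerably more machinery than this sketch.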
Pages: 354 - 379
Page count: 26
Related papers
50 in total
  • [31] Efficient fragmentation of large XML documents
    Bonifati, Angela
    Cuzzocrea, Alfredo
    Database and Expert Systems Applications, Proceedings, 2007, 4653 : 539 - 550
  • [32] Value-based predicate filtering of XML documents
    Kwon, Joonho
    Rao, Praveen
    Moon, Bongki
    Lee, Sukho
    DATA & KNOWLEDGE ENGINEERING, 2008, 67 (01) : 51 - 73
  • [33] ENERGY EFFICIENT XPATH QUERY PROCESSING ON WIRELESS XML STREAMING DATA
    Prabhavathy, P.
    Bose, S.
    Kannan, A.
    COMPUTING AND INFORMATICS, 2015, 34 (06) : 1289 - 1308
  • [34] XPath fragments on XML in columns
    Pokorny, Jaroslav
    INTERNATIONAL JOURNAL OF WEB INFORMATION SYSTEMS, 2013, 9 (04) : 317 - 329
  • [35] A hybird method for efficient indexing of XML documents
    Sun, W
    Liu, DX
    DEEC 2005: INTERNATIONAL WORKSHOP ON DATA ENGINEERING ISSUES IN E-COMMERCE, PROCEEDINGS, 2005, : 139 - 143
  • [36] Efficient integrity checking over XML documents
    Braga, Daniele
    Campi, Alessandro
    Martinenghi, Davide
    CURRENT TRENDS IN DATABASE TECHNOLOGY - EDBT 2006, 2006, 4254 : 206 - 219
  • [37] On efficient matching of streaming XML documents and queries
    Lakshmanan, LVS
    Parthasarathy, S
    ADVANCES IN DATABASE TECHNOLOGY - EDBT 2002, 2002, 2287 : 142 - 160
  • [38] Efficient XQuery over Encrypted XML Documents
    Rauf, Azhar
    Ali, Waqas
    Ahmed, Maher
    Khusro, Shah
    Ali, Shaukat
    10TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE & EDUCATION (ICCSE 2015), 2015, : 159 - 162
  • [39] Collective signature for efficient authentication of XML documents
    Ray, I
    Kim, E
    SECURITY AND PROTECTION IN INFORMATION PROCESSING SYSTEMS, 2004, 147 : 411 - 424
  • [40] An Efficient Duplicate Detection System for XML Documents
    Lwin, Thandar
    Nyunt, Thi Thi Soe
    2010 SECOND INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING AND APPLICATIONS: ICCEA 2010, PROCEEDINGS, VOL 2, 2010, : 178 - 182