Efficient filtering of XML documents with XPath expressions

C.-Y. Chan; P. Felber; M. Garofalakis; R. Rastogi

doi:10.1007/s00778-002-0077-6

Publisher: Springer Journals
Copyright: Copyright © 2002 by Springer-Verlag Berlin Heidelberg
Subject: Computer Science; Database Management
ISSN: 1066-8888
eISSN: 0949-877X
DOI: 10.1007/s00778-002-0077-6

Abstract

The publish/subscribe paradigm is a popular model for allowing publishers (i.e., data generators) to selectively disseminate data to a large number of widely dispersed subscribers (i.e., data consumers) who have registered their interest in specific information items. Early publish/subscribe systems have typically relied on simple subscription mechanisms, such as keyword or "bag of words" matching, or simple comparison predicates on attribute values. The emergence of XML as a standard for information exchange on the Internet has led to increased interest in more expressive subscription mechanisms (e.g., based on XPath expressions) that exploit both the structure and the content of published XML documents. Given the increased complexity of these new data-filtering mechanisms, efficiently identifying the subscription profiles that match an incoming XML document poses a difficult and important research challenge. In this paper, we propose a novel index structure, termed XTrie, that supports the efficient filtering of XML documents based on XPath expressions. Our XTrie index structure offers several novel features that, we believe, make it especially attractive for large-scale publish/subscribe systems. First, XTrie is designed to support effective filtering based on complex XPath expressions (as opposed to simple, single-path specifications). Second, our XTrie structure and algorithms are designed to support both ordered and unordered matching of XML data. Third, by indexing on sequences of elements organized in a trie structure and using a sophisticated matching algorithm, XTrie is able to both reduce the number of unnecessary index probes and avoid redundant matchings, thereby providing extremely efficient filtering. Our experimental results over a wide range of XML document and XPath expression workloads demonstrate that our XTrie index structure outperforms earlier approaches by wide margins.
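
To make the filtering problem concrete, the following is a minimal sketch of a naive baseline, not the XTrie index itself: it evaluates every subscription against the incoming document one by one. It assumes Python's standard `xml.etree.ElementTree`, which supports only a limited subset of XPath; the function name `matching_subscriptions`, the sample document, and the subscription ids are hypothetical and chosen purely for illustration.

```python
import xml.etree.ElementTree as ET


def matching_subscriptions(xml_text, subscriptions):
    """Return the sorted ids of subscriptions that select at least one node.

    Naive baseline: each path expression is evaluated independently, so the
    cost grows linearly with the number of subscriptions. This is the kind
    of per-subscription work an index over the expressions aims to avoid.
    """
    root = ET.fromstring(xml_text)
    matched = []
    for sub_id, path in subscriptions.items():
        # ElementTree evaluates paths relative to the root element.
        if root.find(path) is not None:
            matched.append(sub_id)
    return sorted(matched)


# Hypothetical incoming document and subscription profiles.
doc = "<catalog><book><title>XML</title><price>30</price></book></catalog>"
subs = {
    "s1": "./book/title",         # simple single-path expression
    "s2": ".//price",             # descendant axis
    "s3": "./book[author]",       # structural predicate that does not hold
    "s4": "./book[price]/title",  # path with a structural predicate
}

result = matching_subscriptions(doc, subs)
print(result)
```

With many subscriptions that share common element sequences, this loop repeats much of the same navigation for each expression, which is the redundancy the abstract says an index over the subscriptions is meant to reduce.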

Journal

The VLDB Journal – Springer Journals

Published: Dec 1, 2002
