Efficiently loading and processing XML streams

Ming Li, Murali Mani, Elke A. Rundensteiner

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Scopus citations

Abstract

XML stream applications bring the novel challenge of efficiently processing queries over token-based input streams that can only be accessed sequentially. Our Raindrop project is the first to accommodate token-based stream processing within an algebraic framework in which tokens and tuples are modeled uniformly. In this paper, we illustrate how the stream loading model of our system performs XML navigation over the input stream on the fly while concurrently constructing a minimized, lightweight XML tree representation, called a navigation-free data instance. These captured XML fragments are minimized in terms of buffer consumption. Based on the compact representation of navigation-free data instances, we propose techniques for the subsequent algebraic query evaluation, in particular effective strategies for supporting multi-mode query operators and alternative data output semantics. The proposed stream loading model requires a much smaller buffer footprint than alternative solutions in the literature such as Y-Filter. In addition, the proposed algebra-based evaluation techniques handle data recursion over XML streams effectively, avoiding the overhead of structural join operators. Our stream loading and query evaluation techniques have been implemented as part of the Raindrop system, and we report experimental results obtained with it.
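The general idea of capturing only the needed fragments while scanning a token stream can be illustrated with a minimal sketch. This is not the Raindrop implementation and does not reproduce its algebra or its navigation-free data instances; it only assumes Python's standard-library `xml.etree.ElementTree.iterparse`, and the function name `extract_fragments` and the sample document are hypothetical. Nested occurrences of the target tag (data recursion) are returned as part of the outermost match.

```python
import io
import xml.etree.ElementTree as ET


def extract_fragments(stream, target_tag):
    """Yield serialized subtrees rooted at `target_tag` (hypothetical helper).

    Content outside a matching subtree is cleared as soon as its end tag
    is seen, so it does not accumulate in the buffer.
    """
    depth_inside = 0  # > 0 while the scan is inside an outermost match
    for event, elem in ET.iterparse(stream, events=("start", "end")):
        if event == "start":
            # Count every element opened at or below an outermost match.
            if elem.tag == target_tag or depth_inside > 0:
                depth_inside += 1
        elif depth_inside > 0:
            depth_inside -= 1
            if depth_inside == 0:
                # The outermost match just closed: emit it, then free it.
                elem.tail = None
                yield ET.tostring(elem, encoding="unicode")
                elem.clear()
        else:
            # Element is not part of any match: drop its content.
            elem.clear()


# Hypothetical sample input with a recursive <a> element.
SAMPLE = b"<root><a><b>1</b></a><skip>x</skip><a><b>2</b><a><b>3</b></a></a></root>"

if __name__ == "__main__":
    for fragment in extract_fragments(io.BytesIO(SAMPLE), "a"):
        print(fragment)
```

Cleared elements stay attached to their parent as empty nodes, so this sketch reduces buffer use but does not minimize it in the sense claimed for the paper's loading model.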

Original language: English (US)
Title of host publication: Proceedings of IDEAS'08
Subtitle of host publication: International Database Engineering and Applications Symposium
Pages: 59-67
Number of pages: 9
State: Published - 2008
Externally published: Yes
Event: International Database Engineering and Applications Symposium, IDEAS'08 - Coimbra, Portugal
Duration: Sep 10 2008 to Sep 12 2008

Publication series

Name: ACM International Conference Proceeding Series
Volume: 299

Conference

Conference: International Database Engineering and Applications Symposium, IDEAS'08
Country/Territory: Portugal
City: Coimbra
Period: 9/10/08 to 9/12/08

Keywords

  • XML
  • XQuery
  • query algebra
  • query processing
  • stream

ASJC Scopus subject areas

  • Software
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Computer Networks and Communications
