TY - GEN
T1 - Processing recursive XQuery over XML streams
T2 - 22nd International Conference on Data Engineering Workshops, ICDEW 2006
AU - Wei, Mingzhu
AU - Li, Ming
AU - Rundensteiner, Elke A.
AU - Mani, Murali
N1 - Funding Information:
Prof. Rundensteiner is a well-known expert in databases and information systems, having spent over 20 years of her career focusing on the development of scalable data management technology in support of advanced applications including manufacturing and automation, human genome, and digital libraries. Her current research interests include scalable stream data management, XML and web data management, data integration and migration, data warehousing for distributed systems, and large-scale visual information exploration. She has over 280 publications in these and related areas. Her research has been funded by government agencies including NSF and NIH, and by industry partners such as IBM, Verizon Labs, GTE, NEC, and others. She has been the recipient of numerous honors and awards, including the NSF Young Investigator Grant, the Sigma Xi Outstanding Senior Faculty Researcher Award, and the WPI Trustees’ Award for outstanding research and creative scholarship. She serves on numerous program committees of prestigious conferences in the database field and is an editor of several journals, including Associate Editor of the IEEE Transactions on Data and Knowledge Engineering Journal.
Publisher Copyright:
© 2006 IEEE.
PY - 2006
Y1 - 2006
N2 - XML stream applications bring the challenge of efficiently processing queries on sequentially accessible token-based data. For efficient processing of queries, we need to ensure that memory usage stays low. This in turn requires that we avoid holding data in the query buffer by outputting it at the earliest possible time. In this paper, we propose a new class of stream algebra operators for efficient recursive XQuery stream processing. In particular, we propose two strategies for implementing structural joins: (a) the just-in-time structural join strategy efficiently processes joins as long as the input XML substreams are non-recursive, and (b) the recursive structural join strategy supports structural joins over recursive XML substreams, at the added cost of tuple-level ID comparisons. Both structural join strategies are complemented by an automata-driven invocation mechanism that triggers the execution of the join at the first possible moment upon recognizing the end of the targeted input stream subelement. Further, we design the structural join operator itself to be context-aware: at run time, the operator can switch from the efficient just-in-time join strategy for elements that are recognized to be non-recursive to the more powerful ID-based structural join strategy for elements that are identified to be recursive. In addition, depending on whether the query is recursive, we generate the plan with cheaper operators whenever possible. We incorporate the proposed techniques into the Raindrop stream engine. We also report on experimental studies conducted using ToXgene, which show that our techniques bring significant performance improvements.
UR - http://www.scopus.com/inward/record.url?scp=84990998517&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84990998517&partnerID=8YFLogxK
U2 - 10.1109/ICDEW.2006.119
DO - 10.1109/ICDEW.2006.119
M3 - Conference contribution
AN - SCOPUS:84990998517
T3 - ICDEW 2006 - Proceedings of the 22nd International Conference on Data Engineering Workshops
BT - ICDEW 2006 - Proceedings of the 22nd International Conference on Data Engineering Workshops
A2 - Zhou, Xiaofang
A2 - Barga, Roger S.
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 3 April 2006 through 7 April 2006
ER -