Skew handling techniques in sort-merge join

Wei Li, Dengfeng Gao, Richard Thomas Snodgrass

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Joins are among the most frequently executed operations. Several fast join algorithms have been developed and extensively studied; these can be categorized as sort-merge, hash-based, and index-based algorithms. While all three types of algorithms exhibit excellent performance over most data, ameliorating the performance degradation in the presence of skew has been investigated only for hash-based algorithms. However, for sort-merge join, even a small amount of skew present in realistic data can result in a significant performance hit on a commercial DBMS. This paper examines the negative ramifications of skew in sort-merge join and proposes several refinements that deal effectively with data skew. Experiments show that some of these algorithms also impose virtually no penalty in the absence of data skew and are thus suitable for replacing existing sort-merge implementations. We also show how sort-merge band join performance is significantly enhanced with these refinements.

Original languageEnglish (US)
Title of host publicationProceedings of the 2002 ACM SIGMOD International Conference on Management of Data, SIGMOD 2002
PublisherAssociation for Computing Machinery, Inc
Pages169-180
Number of pages12
ISBN (Electronic)1581134975, 9781581134971
DOIs
StatePublished - Jun 3 2002
Externally publishedYes
Event2002 ACM SIGMOD International Conference on Management of Data, SIGMOD 2002 - Madison, United States
Duration: Jun 3 2002Jun 6 2002

Publication series

NameProceedings of the 2002 ACM SIGMOD International Conference on Management of Data, SIGMOD 2002

Conference

Conference2002 ACM SIGMOD International Conference on Management of Data, SIGMOD 2002
Country/TerritoryUnited States
CityMadison
Period6/3/026/6/02

ASJC Scopus subject areas

  • Information Systems
  • Software

Fingerprint

Dive into the research topics of 'Skew handling techniques in sort-merge join'. Together they form a unique fingerprint.

Cite this