Abstract
The query optimization phase within a database management system (DBMS) ostensibly finds the fastest query execution plan from a potentially large set of enumerated plans, all of which correctly compute the specified query. Occasionally the cost-based optimizer selects a slower plan, for a variety of reasons. We introduce the notion of empirical suboptimality of a query plan chosen by the DBMS, indicated by the existence of a query plan that performs more efficiently than the chosen plan, for the same query. From an engineering perspective, it is of critical importance to understand the prevalence of suboptimality and its causal factors. We examined the plans for thousands of queries run on four DBMSes, resulting in over a million query executions. We previously observed that the construct of empirical suboptimality prevalence positively correlated with the number of operators in the DBMS. An implication is that as operators are added to a DBMS, the prevalence of slower queries will grow. Through a novel experiment that examines the plans on the query/cardinality combinations, we present evidence for a previously unknown upper bound on the number of operators a DBMS may be able to support before performance suffers. We show that this upper bound may have already been reached.
Original language | English (US) |
---|---|
Pages (from-to) | 181-200 |
Number of pages | 20 |
Journal | VLDB Journal |
Volume | 31 |
Issue number | 1 |
DOIs | |
State | Published - Jan 2022 |
Externally published | Yes |
Keywords
- Query algebraic operator
- Query optimization
- Query suboptimality
ASJC Scopus subject areas
- Information Systems
- Hardware and Architecture