ABSTRACT

In this chapter we discuss some important real-world scenarios that are not adequately addressed by the physical design problem as defined in Part II. We then explain how to generalize both the problem formulation and corresponding techniques to address these limitations. Consider, as a motivating example, the following query:

SELECT a, b, c, d, e FROM R WHERE a = 10

and suppose that a single tuple from R satisfies a=10. If the space budget allows it, a covering index IC = R(a, b, c, d, e) would be the best alternative for the query, requiring a single input/output (I/O) to locate the qualifying row and all the required columns. Now consider a narrow single-column index IN = R(a). In this case, we would require two I/Os to answer the query (one to locate the record id (RID) of the qualifying tuple from the secondary index IN and another to fetch the relevant tuple from the primary index). In absolute terms, IC results in a better execution plan than IN . However, the execution plan that uses IN is only slightly less efficient than the one that uses IC (especially compared with the simple alternative that performs a sequential scan over table R), and at the same time it looks simpler. If updates on columns b, c, d, or e are possible, it might make sense to penalize wide indexes such as IC from appearing in the final configuration. However, current techniques cannot explicitly model this requirement without resorting to artificial changes. For instance, we could simulate this behavior by introducing UPDATE statements in the workload. This mechanism, however, is not general enough to capture other important scenarios that we discuss below. In any case, the previous example does not lend itself to a new “golden rule” of tuning. There are situations for which the covering index is the superior alternative (e.g., there could be no updates on table R by design). In fact, an application that repeatedly and almost exclusively executes the aforementioned query can result in a 50% improvement when using the covering index IC instead of the narrow alternative IN .