ABSTRACT

The Reference pipe takes its name from the notion that the foci of all of the activity in this pipe are the Reference tables of each of the dimensions in the star model (Figure 10.1). The individual transactional purposes might include resolving natural keys to their respective entries in the Definition tables, building Hierarchy entries to relate two Definition entries, or defining alias natural keys for previously defined or existing keys. All of these transaction types share the need to receive, format, look up, and possibly create Reference entries for natural keys that occur in the source systems. This Reference pipe handles all of these situations, adding considerable additional value to the processing as it occurs. Specifically, the processing covered in the Reference pipe includes

◾ Transforming source-only intake data into source-to-target staged data through a join to the Metadata dimension, including codeset translation of nonstandard source values

◾ Formatting each of up to 10 natural key source values into the standard single-valued source natural key known as the Reference Composite

◾ Resolving a Master ID for the dimension against each Reference Composite, either because the Master ID already exists in the dimension or it is created for each distinct Reference Composite not found in the dimension

◾ Inserting new dimension data for newly created Master IDs, including the standard Group, Bridge, and Hierarchy entries required of all new Definition entries in each dimension

◾ Inserting new alias entries in the dimension Reference tables ◾ Inserting new fiat hierarchy entries in the dimension Hierarchy tables ◾ Inserting new dimension Hierarchy entries for any natural hierarchies that have been impacted by the creation of new Reference entries in the dimensions

The processing in this Reference pipe will very likely be much easier to build and test than you might be expecting. The dimension design pattern covered

back in Chapter 5 allows for each component of the dimension model to be implemented independently of each of the other components. To take advantage of this, it is very important that you are able to let go of the relational or transactional paradigm under which you’ve probably implemented databases in the past. That paradigm is exemplified by the notion of foreign keys existing between tables, with your database management system enforcing referential integrity among the tables as data are inserted. In this design pattern, we have very few actual foreign key possibilities among the six tables in the dimension.