ABSTRACT

By understanding the data pipeline and organising the data and processes correctly, it is possible to deliver a data project iteratively. Many organisations struggle with this: their flagship data warehouse projects settle into three-monthly release cycles in which much of the effort is tied up in regression testing and handling side effects. This does not meet the needs of dynamic organisations, which must make decisions rapidly and effectively. Executives typically want management information in hours or days; waiting three months can be the difference between success and failure. It is therefore critical that pipeline delivery is broken down effectively and that processes that need to iterate quickly are placed close to the change agent. It is equally important to break dependencies that do not need to exist. The key to doing this is to put in place the right building blocks, each performing the right function.