ABSTRACT

The exponential growth of data in recent years, together with our ability to process, store and analyse it, has created a whole new world of opportunities. Data is an asset with a characteristic life cycle in which each phase is associated with a specific set of questions that must be addressed if the data is to be exploited properly. The basic structure of the scientific method is a process that starts with a guess, the question, which must be answered objectively through direct observation or with scientific tools. Computers are unable to answer most types of questions, including those that require empirical evidence or mathematical reasoning, as well as those related to love, family, democracy, ethics, aesthetics and faith. Uber's exponential growth called for the centralisation of all data in an analytical data warehouse. The increasingly large datasets handled on a daily basis introduced functional bottlenecks into this first Hadoop architecture.