ABSTRACT

Big data technologies originate in database systems and distributed systems, as well as in the data mining and machine learning algorithms that process vast amounts of data to extract knowledge. This chapter addresses the issues involved in distributed databases, in which a database is stored on more than one computer. Distributed computing studies the models, architectures, and algorithms used to build and manage distributed systems. System architectural styles describe the physical organization of components and processes over a distributed infrastructure, whereas software architectural styles describe the logical arrangement of software components. Software architectural styles are helpful because they provide an intuitive view of the whole system regardless of its physical deployment. Middleware is connectivity software designed to help manage the complexity and heterogeneity inherent in distributed systems: it builds a bridge between different systems, thereby enabling communication and the transfer of data.