ABSTRACT

This chapter introduces big data systems, which are characterized by the four Vs: volume, variety, velocity, and veracity. It describes the characteristic features of such systems, including big data architecture and NoSQL data management. The chapter presents the characteristics and examples of NoSQL databases, namely column, key-value, document, and graph databases, and gives a brief overview of the graph databases OrientDB and Neo4j. It then explores the challenges of big data computing. A further important characteristic of big data computing systems is the inherent scalability of the underlying hardware and software architecture. A variety of system architectures have been implemented for big data and large-scale data analysis applications. These include parallel and distributed relational database management systems, which have been available to run on shared-nothing clusters of processing nodes for more than two decades. Examples are the database systems from Teradata, Netezza, Vertica, Oracle Exadata, and others, which provide high-performance parallel database platforms.