ABSTRACT

This chapter examines the audit advantages of Big Data and the methodologies for analyzing it. Big Data is a term given to large data sets containing a variety of data types. Due to the sheer volume of data involved, conventional two-dimensional database management systems were unable to handle Big Data acceptably. This led to the development of online analytical processing (OLAP). Other technologies for data structuring are rapidly entering the marketplace, including NoSQL databases, analytic RDBMSs and Hadoop. In Hive, the basic unit of data is a table, much as in a relational database: a two-dimensional "spreadsheet" structure with rows containing records. Due to the sheer volumes involved, as well as the velocity and unstructured nature of some of the data, statistical sampling may be both essential for auditors and difficult to apply.