ABSTRACT

Successfully navigating the data-driven economy requires an understanding of the technologies and methods used to gain insights from Big Data. This book aims to help data science practitioners manage the transition to Big Data.  
Building on familiar content from applied econometrics and business analytics, this book introduces the reader to the basic concepts of Big Data Analytics. The focus is on how to productively apply econometric and machine learning techniques to large, complex data sets, as well as on the steps that precede the analysis (data storage, data import, data preparation). The book combines conceptual and theoretical material with practical applications in R and SQL. The reader will thus acquire the skills to analyse large data sets, both locally and in the cloud. Various code examples and tutorials, focused on empirical economic and business research, illustrate practical techniques for handling and analysing Big Data.  

Key Features: 
 
- Includes many code examples in R and SQL, with R/SQL scripts freely provided online.  
- Makes extensive use of real data sets from empirical economic research and business analytics, with data files freely provided online.  
- Leads students and practitioners to think critically about where the bottlenecks are in practical data analysis tasks with large data sets, and how to address them.  
 

The book is a valuable resource for data science practitioners, graduate students, and researchers who aim to gain insights from Big Data in the context of research questions in business, economics, and the social sciences.

Part I: Setting the Scene: Analyzing Big Data (24 pages)

- Chapter 1: What is Big in “Big Data”? (2 pages)
- Chapter 2: Approaches to Analyzing Big Data (6 pages)
- Chapter 3: The Two Domains of Big Data Analytics (12 pages)

Part II: Platform: Software and Computing Resources (88 pages)

- Chapter 4: Software: Programming with (Big) Data (28 pages)
- Chapter 5: Hardware: Computing Resources (22 pages)
- Chapter 6: Distributed Systems (16 pages)
- Chapter 7: Cloud Computing (16 pages)

Part III: Components of Big Data Analytics (104 pages)

- Chapter 8: Data Collection and Data Storage (38 pages)
- Chapter 9: Big Data Cleaning and Transformation (18 pages)
- Chapter 10: Descriptive Statistics and Aggregation (10 pages)
- Chapter 11: (Big) Data Visualization (32 pages)

Part IV: Application: Topics in Big Data Econometrics (64 pages)

- Chapter 12: Bottlenecks in Everyday Data Analytics Tasks (20 pages)
- Chapter 13: Econometrics with GPUs (10 pages)
- Chapter 15: Large-scale Text Analysis with sparklyr (18 pages)