ABSTRACT

Modern Data Science with R is a comprehensive data science textbook for undergraduates that incorporates statistical and computational thinking to solve real-world problems with data. Rather than focus exclusively on case studies or programming syntax, this book illustrates how statistical programming in the state-of-the-art R/RStudio computing environment can be leveraged to extract meaningful information from a variety of data in the service of addressing compelling statistical questions.

Contemporary data science requires a tight integration of knowledge from statistics, computer science, mathematics, and a domain of application. This book will help readers with some background in statistics and modest prior experience with coding develop and practice the appropriate skills to tackle complex data science projects. The book features a number of exercises and has a flexible organization conducive to teaching a variety of semester courses.

part |146 pages

Introduction to Data Science

chapter 1|8 pages

Prologue: Why data science?

chapter 2|24 pages

Data visualization

chapter 3|30 pages

A grammar for graphics

chapter 4|28 pages

Data wrangling

chapter 5|40 pages

Tidy data and iteration

chapter 6|16 pages

Professional ethics

part |94 pages

Statistics and Modeling

chapter 7|22 pages

Statistical foundations

chapter 9|16 pages

Unsupervised learning

chapter 10|20 pages

Simulation

part |174 pages

Topics in Data Science

chapter 11|18 pages

Interactive data graphics

chapter 12|40 pages

Database querying using SQL

chapter 13|16 pages

Database administration

chapter 14|38 pages

Working with spatial data

chapter 15|22 pages

Text as data

chapter 16|24 pages

Network science

chapter 17|14 pages

Epilogue: Towards “big data”