ABSTRACT

Data scientists are interested in converting abundant data into actionable information, which always seems to be scarce. Computer science is more than just programming; it is the creation of appropriate abstractions to express computational structures and the development of algorithms that operate on those abstractions. Among the tasks data scientists need to perform are data cleaning, combining data from multiple sources, and reshaping data into a form suitable as input to data-summarization operations for visualization and modeling. Over the centuries, as data sets grew larger, machines were introduced to speed up tabulation. Data science is based on the idea that the statistical and computational styles of thinking support each other. The goals of data scientists and statisticians are the same: both want to extract meaningful information from data. For data scientists in all application domains, creativity, domain knowledge, and technical ability are essential. The chapter also presents an overview of the key concepts discussed in this book.