ABSTRACT

This chapter gives a brief introduction to data science, big data, and data analytics. It introduces the five V’s of big data: volume, velocity, variety, veracity, and value. The chapter also discusses the data analytics life cycle, which serves as a reference when solving big data and data science problems. Its six phases are data discovery, data preparation, model planning, model building, communicating results, and operationalization. While following these phases, the data science team performs various operations to gain more insight and knowledge from the data. Alongside general information about the models, the chapter discusses the three phases of ETL: extract, transform, and load. Finally, the chapter focuses on how data science tools can be used to deal with huge amounts of data; R, Tableau, and Statistical Analysis System (SAS) are the tools used in the model planning phase.