ABSTRACT

What is data science? The simplest definition is that it is the study of data. Real-world data is raw, and data science uses tools and techniques to extract meaningful information from it. It draws on fields such as statistics, mathematics, computer engineering, machine learning, data mining, and artificial intelligence to analyze large amounts of data. Many applications, such as online systems and payment portals, now capture and store this data automatically. Organizations are overwhelmed by the volume of data and want to draw inferences from it to improve business and productivity and to give users a better experience. Data science helps reveal gaps and uncover new patterns in domains ranging from health, medicine, and finance to e-commerce. Data science is the broader term: it covers multiple challenges, such as capturing, cleaning, and transforming data in order to finally make inferences from it. Data mining, by contrast, is mainly about extracting knowledge and previously unknown patterns from large amounts of data, which is why it is also called the “knowledge discovery process”. Machine learning is an automated technique that uses complex algorithms to process data and produce a trained model; in other words, it is a technique for training a model on given data and using it to make predictions. Artificial intelligence goes one step further and uses machine learning algorithms to build intelligent systems that can work on their own. These techniques have made data processing faster and much more efficient. Because of the range of expertise this field requires, data science is showing strong growth.