ABSTRACT

E-science is an interdisciplinary scientific paradigm that uses large-scale IT infrastructure to process big data. Data science is therefore becoming an essential element of modern interdisciplinary research: it facilitates collaborative scientific discovery and spans the whole life cycle of data analysis. A person who combines skills from one or more disciplines to build efficient, distributed, and scalable algorithms is called a data scientist. A data scientist is typically a domain expert who must be able to understand the nature of the data and find suitable algorithms for specific problems. A data scientist creates insights, while a data engineer creates things. The data engineer's role centers on working in the data center and handling storage devices for gathering and managing data. Content-based analytics focuses on the data that users post on social media platforms. Such data, which is often voluminous, unstructured, noisy, and dynamic, can be subjected to text analytics, audio analytics, video analytics, and so on.