ABSTRACT

Data gathered from IoT sensors are highly unreliable because of missing values, corrupted values, redundant data, noisy data, and outliers. Traditional methods for data storage, access, and processing cannot handle such complex data efficiently, and they are also limited in scalability, performance, availability, robustness, and energy efficiency. Moreover, naively applying machine learning algorithms to locally stored data adversely affects the global correlation of IoT things and robust decision-making for IoT applications. IoT applications also need context-aware, cognitive, and predictive capabilities to discover useful information from the data gathered. Data analytics adds further complexity, since it may be real-time, offline, virtual, memory-level, business intelligence, or intensive analytics. Scalable data algorithms and mechanisms are therefore needed to produce efficient, real-time results. To address these issues, this chapter analyzes solution approaches from three perspectives: data storage and access, data processing, and data analysis. The data storage and access mechanisms cover big data, distributed storage systems, NoSQL databases, data integration, and data virtualization for the IoT ecosystem. The chapter then considers how to handle the different types of data gathered from IoT systems, such as heterogeneous data, dynamic data, and weak semantic data. Next, data processing mechanisms, including data quality and knowledge discovery tools applied to the gathered IoT data, are analyzed. Finally, data analysis mechanisms such as predictive analysis, stream analytics, and machine learning algorithms are explored.