ABSTRACT

In information technology, Big Data is a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data-processing applications [74]. The trend toward larger data sets is due to the additional information derivable from analysis of a single large set of related data, as compared with separate smaller sets with the same total amount of data [26,51,68]. Scientists regularly encounter limitations due to large data sets in many areas, including meteorology, genetics, complex physics simulations, and environmental research [9,35]. Automated, wireless data gathering from large environmental sensor networks has increased the quantity of sensor data available for analysis and sensor informatics. Next-generation environmental monitoring, natural resource management, and agricultural decision support systems are becoming heavily dependent on very large-scale deployments of multiple sensor networks and on the massive-scale accumulation, harmonization, web-based integration, and interpretation of Big Data. As the amount of available data has grown, so has its complexity; hence, regular maintenance of large-scale sensor networks is becoming a difficult challenge. Uncertainty factors in environmental monitoring processes are more evident than before, owing to the technological transparency achieved by the most recent advanced communication technologies [47-49]. Other challenges include capture, storage, search, sharing, analysis, and visualization. Data availability from a particular environmental sensor web is often very limited, and data quality is consequently very poor. This practical limitation could be due to the difficult geographical location of the sensor node or sensor station, extreme environmental conditions, communication network failure, or technical failure of the sensor node. Data uncertainty from a sensor network makes the network unreliable and inefficient.
This inefficiency leads to the failure of natural resource management systems such as agricultural water resource management, weather forecasting, crop management (including irrigation scheduling), and natural resource-based crop business model systems. The ultimate challenge in environmental forecasting and decision support systems is to overcome data uncertainty and make the derived output more accurate. It is evident that there is a need to capture and integrate environmental knowledge from various independent sources, including sensor networks, individual sensory systems, large-scale environmental simulation models, and historical environmental data [13,33,36,41,56,58] for each of the independent sources. A single data source is not sufficient to produce an efficient decision support system. There is therefore an urgent requirement for on-demand complementary knowledge integration, in which different sources of environmental sensor data can be used to complement each other automatically [1,17,37,38].