ABSTRACT

There is a rich history of collecting environmental data, and recently there has been an explosion in the quantity and complexity of data related to the physical and natural world around us, from monitoring, satellite remote sensing, climate modeling, social media and citizen science contributions. It is estimated that data scientists spend around 50–80% of their time cleaning and manipulating data. This chapter shows how R can be used to process data so that it is in a suitable form for analysis, visualization and modeling. Data is increasingly available from Internet sources, including GitHub repositories. In RStudio, datasets can be imported with the ‘Import Dataset’ option in the ‘Environment’ window.