ABSTRACT

Let us begin with a small thought experiment. We should first recognise that it is highly unlikely that any data set is 100 per cent accurate. Suppose, then, that in the example landscape used in previous chapters (e.g. Figure 2.5) each layer is 90 per cent (0.9) accurate. If we were to combine the geology and the land cover in an overlay, we would end up with a map that is [0.9 AND 0.9] accurate, which in probability terms (treating the errors in each layer as independent) would be 0.9 × 0.9 = 0.81, or 81 per cent correct. Add another layer to the overlay and the result might theoretically be only 73 per cent correct; by the time we have used seven different layers in the analysis, our output product might be less than 50 per cent correct! What then if this final map were used as input data for an environmental simulation? Of course, things are unlikely to be quite this bad in practice and, besides, plenty of errors (often unnoticed) were made in using traditional paper maps. Nevertheless, a good understanding of data quality issues is key to the informed use of GIS and environmental modelling. This chapter focuses on issues of spatial data quality, as these pose special problems in addition to those encountered in non-spatial data. As we saw in Chapter 3, spatial data quality is a fundamental concern of GI science. Whilst considerable research is ongoing in this area, there is already a sizeable literature. For greater detail than is provided here, the reader can refer to Goodchild and Gopal (1989), Burrough and Frank (1996), and Burrough and McDonnell (1998) for GIS perspectives, Heuvelink (1998) for a GIS and environmental modelling perspective, Li et al. (2000) for a process model perspective and Elith et al. (2002) for an ecological perspective.
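
The arithmetic of the thought experiment can be sketched in a few lines of Python. This is only an illustration: the function name `overlay_accuracy` is hypothetical, and the calculation assumes that errors in each layer are independent, so that the accuracies simply multiply. Real overlay error rarely behaves this neatly.

```python
def overlay_accuracy(layer_accuracies):
    """Return the theoretical accuracy of an overlay of several layers.

    Assumption (for illustration only): errors in the layers are
    independent, so the combined accuracy is the product of the
    individual layer accuracies.
    """
    combined = 1.0
    for accuracy in layer_accuracies:
        if not 0.0 <= accuracy <= 1.0:
            raise ValueError("each accuracy must lie between 0 and 1")
        combined *= accuracy
    return combined


# Overlaying one to seven layers, each 90 per cent accurate.
for n_layers in range(1, 8):
    combined = overlay_accuracy([0.9] * n_layers)
    print(f"{n_layers} layer(s): {combined * 100:.0f} per cent correct")
```

With two layers the function returns 0.81, with three roughly 0.73, and with seven the value drops below 0.5, matching the figures quoted in the thought experiment.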