ABSTRACT

It can be argued that conventional data mining tools can be usefully applied to GIS databases to extract patterns, in the same way that conventional statistical methods can be applied to spatial data. Some geoinformational data mining tasks may indeed be usefully performed by conventional data mining software. Table 7.1 outlines the range of tools that most data mining packages offer, and many of these methods could be usefully applied to spatial data. For example, data reduction tools, such as multivariate classification, can be useful as a means of summarising the essential features of large spatial data sets; for instance, to create geodemographic classifications. Similarly, modelling tools such as neural networks and decision trees can be readily applied to some geographic problems. It can be argued that, whilst these methods ignore all of the special features of geographical data (see Table 7.2), they still ‘work’ to some degree. There are, however, many exploratory geographical analysis types of data mining task that seemingly cannot be performed in this way. Moreover, there is a major potential problem: the use of conventional data mining tools implies acceptance of the key assumption that geographical data are the same as any other data, i.e. that there is nothing special about geographical information, or indeed geographical analysis, that will prevent it being performed by conventional methods. These packages can only treat the X, Y coordinates as if they were merely two ordinary variables (such as age or income), and it is very likely that nothing useful will be achieved. They have no mechanism for handling location or spatial aggregation, for coping with spatial concepts, or even for mapping.
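The point about X, Y coordinates being treated as two ordinary variables can be illustrated with a minimal sketch. It assumes a hypothetical conventional workflow that standardises every column independently before computing distances, as one might for age or income; the five records and their coordinates are invented for illustration and do not come from any real data set or package.

```python
import math
from statistics import mean, pstdev

# Hypothetical records: label -> (x_km, y_km). Invented for illustration only.
points = {
    "A": (0.0, 0.0),
    "B": (20.0, 0.0),
    "C": (0.0, 5.0),
    "D": (100.0, 1.0),
    "E": (50.0, 2.0),
}


def nearest(target, coords):
    """Return the label of the record closest to `target` by Euclidean distance."""
    tx, ty = coords[target]
    others = (label for label in coords if label != target)
    return min(
        others,
        key=lambda k: math.hypot(coords[k][0] - tx, coords[k][1] - ty),
    )


def standardise_columns(coords):
    """Z-score X and Y separately, exactly as for any two unrelated variables."""
    xs = [p[0] for p in coords.values()]
    ys = [p[1] for p in coords.values()]
    mx, sx = mean(xs), pstdev(xs)
    my, sy = mean(ys), pstdev(ys)
    return {k: ((x - mx) / sx, (y - my) / sy) for k, (x, y) in coords.items()}


# Nearest neighbour of A on the ground (distances in km).
geographic_nearest = nearest("A", points)

# Nearest neighbour of A once X and Y have been rescaled column by column.
scaled_nearest = nearest("A", standardise_columns(points))

print(geographic_nearest, scaled_nearest)  # prints "C B"
```

On the ground, C is 5 km from A and B is 20 km away. Once each coordinate has been rescaled on its own, B appears closer, because the two columns no longer share a common unit of distance. This is one small way in which the joint meaning of a location is lost when X and Y are handled as separate attributes.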