Chapter 16
Uncertain Data Classification
By Reynold Cheng, Yixiang Fang, Matthias Renz
Pages 27

In emerging applications such as location-based services (LBS), sensor networks, and biological databases, the values stored in the databases are often uncertain [11,18,19,30]. In an LBS, for example, the location of a person or a vehicle sensed by imperfect GPS hardware may not be exact. Measurement error also affects the temperature values obtained by a thermometer, where 24% of measurements are off by more than 0.5°C, or about 36% of the normal temperature range [15]. Sometimes the application itself injects uncertainty into the data in order to better protect privacy. In demographic applications, for instance, partially aggregated data sets are made available rather than personal data [3]. In LBS, since the exact location of a person may be sensitive, researchers have proposed “anonymizing” a location value by representing it as a region [20]. In all of these applications, data uncertainty needs to be handled carefully, or else wrong decisions can be made. This issue has recently attracted considerable attention from the database and data mining communities [3,9–11,18,19,30,31,36].