ABSTRACT

Many data cleaning problems arise from how data in different schemas interact with one another or map to a common view. These problems affect a data repository's retrieval recall and precision, its ability to interact with other data repositories, and its ability to incorporate or modify the model of the data. Whether a cleaning problem arises from one of these issues or from another situation in which data from one set must ultimately be mapped to another, it can be viewed as a data integration problem.