ABSTRACT

In the previous chapter, we studied the data publishing model in which multiple data holders want to collaboratively anonymize their vertically partitioned data. In this chapter, we study the problem of collaborative anonymization for horizontally partitioned data, where multiple data holders own sets of person-specific data records on the same set of attributes. The model assumes that the sets of records are disjoint, meaning that a record owner appears in at most one record set. There is often a strong incentive to integrate scattered data owned by different parties for greater benefits [131]. A good example is the Shared Pathology Informatics Network (SPIN) initiated by the National Cancer Institute.1 Its objective is to create a virtual database that combines data from different healthcare institutions to facilitate research investigations. Although the virtual database is an integration of different databases, in reality the data should remain physically in different locations, under the complete control of the local healthcare institutions.