Anonymization Operations | 11 | Introduction to Privacy-Preserving Dat

ABSTRACT

A raw data table usually does not satisfy a specified privacy requirement, so it must be modified before being published. The modification is done by applying a sequence of anonymization operations to the table. Anonymization operations come in several flavors: generalization, suppression, anatomization, permutation, and perturbation. Generalization and suppression replace specific values, typically those of the quasi-identifier (QID) attributes, with less specific ones. Anatomization and permutation break the association between QID and sensitive attributes by grouping and shuffling sensitive values within a qid group. Perturbation distorts the data by adding noise, aggregating values, swapping values, or generating synthetic data based on some statistical properties of the original data. Below, we discuss these anonymization operations in detail.
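As a rough illustration of the operations named above, the following Python sketch applies each one to a toy table. Everything in it is an assumption made for illustration and is not taken from the chapter: the records, the choice of age and ZIP code as QID attributes, the 10-year age intervals, the 3-digit ZIP prefix, the uniform integer noise, and all function names.

```python
import random
from collections import Counter

# Hypothetical toy records: (age, zip_code, disease).
# Age and ZIP code are treated as the QID; disease is the sensitive attribute.
RECORDS = [
    (23, "47677", "flu"),
    (27, "47602", "hepatitis"),
    (35, "47905", "hiv"),
    (38, "47909", "flu"),
]


def generalize_age(age, width=10):
    """Generalization: replace an exact age with a half-open interval."""
    low = (age // width) * width
    return f"[{low}-{low + width})"


def suppress_zip(zip_code, keep=3):
    """Suppression: mask the trailing characters of a ZIP code with '*'."""
    return zip_code[:keep] + "*" * (len(zip_code) - keep)


def permute_sensitive(group, rng):
    """Permutation: shuffle sensitive values among the records of one group."""
    sensitive = [rec[-1] for rec in group]
    rng.shuffle(sensitive)
    return [rec[:-1] + (s,) for rec, s in zip(group, sensitive)]


def anatomize(groups):
    """Anatomization: release QID values and sensitive values in two
    separate tables that are linked only by a group id."""
    qid_table, sensitive_table = [], []
    for gid, group in enumerate(groups):
        qid_table.extend((age, zip_code, gid) for age, zip_code, _ in group)
        counts = Counter(disease for _, _, disease in group)
        sensitive_table.extend((gid, d, c) for d, c in sorted(counts.items()))
    return qid_table, sensitive_table


def perturb_age(age, rng, scale=2):
    """Perturbation: add bounded uniform integer noise to a numeric value."""
    return age + rng.randint(-scale, scale)


def anonymize(records, seed=0):
    """Generalize and suppress the QID, then permute sensitive values
    within each resulting qid group."""
    rng = random.Random(seed)
    generalized = [(generalize_age(a), suppress_zip(z), d) for a, z, d in records]
    groups = {}
    for rec in generalized:
        groups.setdefault(rec[:2], []).append(rec)
    published = []
    for group in groups.values():
        published.extend(permute_sensitive(group, rng))
    return published
```

Calling `anonymize(RECORDS)` on this toy input yields two qid groups of two records each, with the multiset of diseases in each group unchanged. The sketch does not check any particular privacy requirement; it only shows the mechanics of each operation.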