ABSTRACT

We analyse noisy data in which the noise has been added deliberately to protect confidentiality. The particular confidentialized dataset that we examine describes migration between regions of Iceland, disaggregated by one-year age group, sex, origin, destination, and time.

Statistics Iceland in fact publishes unconfidentialized migration data. We randomly round the values ourselves, which allows us to see what the data look like before and after rounding. Random rounding produces few distortions in the data for large migration flows, but strips almost all the information out of the data for small migration flows.
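
The contrast between large and small flows can be illustrated with a minimal sketch of unbiased random rounding. The abstract does not state the rounding base or the exact scheme used, so base 3 and the function name `random_round` are assumptions made purely for illustration. Each count is moved to a neighbouring multiple of the base, with probabilities chosen so that the expected rounded value equals the original count.

```python
import random

def random_round(count, base=3, rng=random):
    """Randomly round a non-negative integer count to a multiple of `base`.

    Hypothetical helper: base 3 is an assumed value, not taken from the paper.
    The count is rounded up with probability remainder / base and down
    otherwise, so the rounded value is unbiased in expectation.
    """
    remainder = count % base
    if remainder == 0:
        return count  # already a multiple of the base, left unchanged
    lower = count - remainder
    if rng.random() < remainder / base:
        return lower + base
    return lower

rng = random.Random(0)  # fixed seed so the illustration is reproducible

# Illustrative counts only, not real migration data.
small_flows = [1, 2, 1, 0, 2]
large_flows = [1000, 1501, 2002]

rounded_small = [random_round(x, rng=rng) for x in small_flows]
rounded_large = [random_round(x, rng=rng) for x in large_flows]
```

Under these assumptions a rounded value differs from the original by at most 2. That is negligible relative to a flow of 1000, but a flow of 1 or 2 can only become 0 or 3, which is why most of the information in small flows is lost.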

We construct a Bayesian model consisting of a system model and a data model. The system model describes patterns in the true, unconfidentialized migration counts. The data model describes the relationship between the unconfidentialized counts and the confidentialized counts.

We jointly estimate true migration counts and rates. We compare the results from our main analysis with an alternative analysis, in which we treat the confidentialized data as noise-free. We construct forecasts for the true, unconfidentialized migration counts, and for the confidentialized migration counts.