ABSTRACT

Data analytics and its use in big data environments have grown rapidly in the past few years. Several undesirable side effects have appeared in the form of data disclosure and privacy violation risks. This trend calls for privacy methods that can scale up to cope with the growth of big data. Data anonymization is one of the pioneering privacy solutions that can minimize such risks. However, current anonymization solutions suffer from poor performance and high information loss in the big data environment. In this paper, we propose a novel privacy method named multidimensional sensitivity-based anonymization. The method addresses the performance and anonymization loss concerns and provides multilevel access control. Many privacy methods have been proposed to anonymize data before sensitive information is exposed on the cloud, but contemporary anonymization methods do not take big data processes into consideration. We compare our proposed method with a recently proposed method known as multidimensional top-down specialization (TDS). The comparison reveals limitations of the TDS method and contamination of the big data structure when it is applied; in contrast, our proposed method adopts the parallel distributed structure during the anonymization operation. Our method provides a gradual access framework for analysts who wish to participate in data analytics in big data. The framework is integrated with role-based access control that maps the authorization roles between service providers and federation services. Sensitivity-based anonymization differentiates data refinement by providing multiscalable levels of user access.