AI Model Anonymization

The EU General Data Protection Regulation (GDPR) and other similar regulations place many restrictions on the processing of personal data, and comparable laws are being enacted in additional states and countries around the world. Adhering to these regulations can be a complex and costly task.

Many data processing tasks nowadays involve machine learning (ML). In recent years, several attacks have been developed that can infer sensitive information from trained models, including membership inference, model inversion and attribute inference attacks. This has led to the conclusion that machine learning models themselves should, in some cases, be considered personal information, and therefore be subject to GDPR and similar laws.

However, these regulations specifically exempt anonymized data. Recital 26 of GDPR states that the principles of data protection should not apply to personal data rendered anonymous in such a manner that the data subject is no longer identifiable. It is therefore desirable to be able to anonymize the ML models themselves, i.e., ensure that the personal information of a specific individual who participated in the training set cannot be re-identified.

One option is to apply k-anonymity to the training data and then train the model on the anonymized dataset to yield an anonymized model. However, past attempts at training ML models on anonymized data have resulted in very poor accuracy. At IBM we have developed an anonymization method that is guided by the specific ML model that will be trained on the data. We use the knowledge encoded within the model to produce an anonymization that is highly tailored to that model. We call this method model-guided or accuracy-guided anonymization. It is based on knowledge distillation from the target model to an anonymizer model, which determines groups of k or more records that should be generalized in the same manner to achieve k-anonymity. Once those groups are determined, each group is mapped to a representative value: the point closest to the median of the cluster, chosen from among the points with the majority label in that cluster.
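To make the idea more concrete, below is a minimal sketch of this flow for numeric, tabular data. It assumes a scikit-learn-style target model and uses a decision tree with a minimum leaf size of k as the anonymizer; the function name, the choice of anonymizer, and the handling of quasi-identifiers are illustrative assumptions, not the exact algorithm from our paper or the API of our toolkit.

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def accuracy_guided_anonymize(X, quasi_ids, k, target_model):
        """Illustrative sketch: group records into cells of at least k using an
        anonymizer tree distilled from the target model, then replace the
        quasi-identifier values in each cell with a single representative."""
        # Knowledge distillation step: label the training data with the
        # target model's own predictions instead of the ground-truth labels.
        distilled_labels = target_model.predict(X)

        # Anonymizer model: a decision tree whose leaves each contain at least
        # k records; every leaf defines one cell to be generalized together.
        anonymizer = DecisionTreeClassifier(min_samples_leaf=k)
        anonymizer.fit(X[:, quasi_ids], distilled_labels)
        cells = anonymizer.apply(X[:, quasi_ids])  # leaf id per record

        X_anon = X.copy()
        for cell in np.unique(cells):
            idx = np.where(cells == cell)[0]
            cell_labels = distilled_labels[idx]

            # Majority label within the cell, according to the target model.
            values, counts = np.unique(cell_labels, return_counts=True)
            majority = values[np.argmax(counts)]
            majority_idx = idx[cell_labels == majority]

            # Representative: the majority-label record whose quasi-identifiers
            # are closest to the cell median.
            median = np.median(X[np.ix_(idx, quasi_ids)], axis=0)
            dists = np.linalg.norm(X[np.ix_(majority_idx, quasi_ids)] - median, axis=1)
            representative = X[majority_idx[np.argmin(dists)], quasi_ids]

            # All records in the cell receive the same quasi-identifier values,
            # so each one is indistinguishable from at least k - 1 others.
            X_anon[np.ix_(idx, quasi_ids)] = representative
        return X_anon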

This approach outperforms state-of-the-art anonymization techniques in terms of the achieved utility, as measured by the resulting model's accuracy. In our paper we show that it can preserve an acceptable level of accuracy even with fairly high values of k and larger numbers of quasi-identifier attributes, and that it also protects against different types of inference attacks, making anonymous machine learning a feasible option for many enterprises.

Our approach is generic and can be applied to any type of ML model. Since it does not rely on making modifications to the training process, it can be applied in a wide variety of use cases, including integration within existing ML pipelines, or in combination with machine learning as a service (ML-as-a-service, or MLaaS for short). This setting is particularly useful for organizations that do not possess the required computing resources to perform their training tasks locally. It can even be applied to existing models, reusing the same architecture and hyperparameters and requiring only that the model be retrained. Similar to classic k-anonymity methods, our method can only be applied to structured data, including numeric, discrete and categorical features, not image data. It also does not work well for very high-dimensional data, or data with a very high degree of uniqueness.
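As a rough illustration of that retraining workflow, the hypothetical snippet below reuses the accuracy_guided_anonymize sketch from above: the original model is trained as usual, the training data is anonymized under its guidance, and a second model with the same architecture and hyperparameters is retrained on the anonymized data. The synthetic dataset, the choice of k and the quasi-identifier columns are placeholders, not values from our experiments.

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    # Placeholder tabular data standing in for a real structured dataset.
    X, y = make_classification(n_samples=5000, n_features=8, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # 1. Train the original model as usual.
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(X_train, y_train)

    # 2. Anonymize only the training data, guided by the trained model
    #    (quasi_ids and k are illustrative choices).
    X_train_anon = accuracy_guided_anonymize(X_train, quasi_ids=[0, 1, 2, 3],
                                             k=100, target_model=model)

    # 3. Retrain with the same architecture and hyperparameters on the
    #    anonymized data; this is the model that would be released or
    #    sent to an ML-as-a-service provider.
    anon_model = RandomForestClassifier(n_estimators=100, random_state=0)
    anon_model.fit(X_train_anon, y_train)

    # 4. Compare utility on held-out data.
    print("original model  :", accuracy_score(y_test, model.predict(X_test)))
    print("anonymized model:", accuracy_score(y_test, anon_model.predict(X_test)))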

For more details on the method itself, see our paper: Anonymizing Machine Learning Models.

The code is available in our open-source project ai-privacy-toolkit.

Abigail Goldsteen, IBM