Today we're releasing django-gdpr-assist, a Django app we've developed and have been using internally to help us and our clients manage personally identifiable information (PII) held in Django models, and to simplify processing GDPR requests for export or deletion of private data.
This is where django-gdpr-assist aims to help. It works by looking for a PrivacyMeta object defined on your model, which you can use to describe which fields contain PII and to control how anonymisation and export work for those fields.
It has also been designed to work with third-party modules, where it is often impractical to make changes to the model code;
PrivacyMeta definitions can be registered against models from elsewhere in your project, and signals can be used to propagate anonymisation and deletion operations.
Once registered, a model can be anonymised: fields containing personal data are nulled, blanked, or filled with default information, depending on each field's settings.
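To make the nulled/blanked/default rule concrete, here is a minimal, framework-free sketch in plain Python. It is not how django-gdpr-assist is implemented: `FieldSpec`, `anonymise` and `Profile` are hypothetical names, and the order in which the rules are tried (custom default first, then null, then blank) is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


@dataclass
class FieldSpec:
    """Hypothetical description of one PII field (not the library's API)."""
    nullable: bool = False
    blankable: bool = False
    default: Optional[Callable[[Any], Any]] = None  # custom replacement value


def anonymise(obj: Any, specs: Dict[str, FieldSpec]) -> None:
    """Overwrite each listed PII field in place; other fields are untouched."""
    for name, spec in specs.items():
        if spec.default is not None:
            value = spec.default(obj)  # filled with default information
        elif spec.nullable:
            value = None               # nulled
        elif spec.blankable:
            value = ""                 # blanked
        else:
            raise ValueError(f"no anonymisation rule for field {name!r}")
        setattr(obj, name, value)


@dataclass
class Profile:
    display_name: str
    email: Optional[str]
    bio: str
    public_data: str  # not listed as PII below, so it is left alone


profile = Profile("Alice", "alice@example.com", "Hi, I'm Alice", "42 posts")
anonymise(profile, {
    "display_name": FieldSpec(default=lambda obj: "Anonymous user"),
    "email": FieldSpec(nullable=True),
    "bio": FieldSpec(blankable=True),
})
```

Only the fields named in the spec are changed, which mirrors the idea of listing PII fields explicitly and leaving everything else as it was.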
ForeignKey relationships can be defined with on_delete=ANONYMISE to allow automatic anonymisation of the object when the related object is deleted. This makes it straightforward to anonymise all related records when a user's account is deleted, for example:
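    from django.conf import settings
    from django.db import models

    import gdpr_assist


    class MyModel(models.Model):
        # When the user is deleted, anonymise this object, then set user to NULL
        user = models.ForeignKey(
            settings.AUTH_USER_MODEL,
            blank=True,
            null=True,
            on_delete=gdpr_assist.ANONYMISE(models.SET_NULL),
        )
        display_name = models.CharField(max_length=255)
        public_data = models.TextField()

        class PrivacyMeta:
            # Only display_name is treated as PII
            fields = ['display_name']

            def anonymise_display_name(self, instance):
                return 'Anonymous user'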
There is also a tool in the admin site for your staff to manage GDPR export and deletion requests - all registered models can be searched from a single form, and objects can then be exported or anonymised in one go.
One potential GDPR issue is anonymisation of database backups. To assist with this, deletion and anonymisation of objects in registered models are logged to a separate database which holds no PII. This log database is independent of the main database, so if you ever need to restore from a backup, you can use the management command manage.py gdpr_rerun to run through the log database and re-apply any GDPR removal requests which came in since the backup was taken. When combined with a rolling backup period of 28 days or less, you can be confident you won't be left holding old PII which you should have removed.
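The replay idea can be sketched in a few lines of plain Python. This is only an illustration of the approach described above, not the code behind gdpr_rerun: `LogEntry`, `rerun`, the in-memory `records` dict and the explicit `backup_taken` cutoff are all hypothetical, and the sketch assumes the log stores nothing beyond a timestamp, an action, a model name and a primary key.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class LogEntry:
    when: datetime
    action: str  # "anonymise" or "delete"
    model: str
    pk: int      # identifies the row without storing any PII


def rerun(
    log: Iterable[LogEntry],
    records: Dict[Tuple[str, int], dict],
    backup_taken: datetime,
) -> List[LogEntry]:
    """Re-apply logged removals made after the backup; return those applied."""
    applied = []
    for entry in sorted(log, key=lambda e: e.when):
        if entry.when <= backup_taken:
            continue  # already reflected in the restored backup
        key = (entry.model, entry.pk)
        if key not in records:
            continue  # nothing to do, the row is not in the restored data
        if entry.action == "delete":
            del records[key]
        elif entry.action == "anonymise":
            records[key]["display_name"] = "Anonymous user"
        applied.append(entry)
    return applied


# Restored backup: user 3 was anonymised before the backup was taken.
backup_taken = datetime(2018, 5, 1)
records = {
    ("user", 1): {"display_name": "Alice"},
    ("user", 2): {"display_name": "Bob"},
    ("user", 3): {"display_name": "Anonymous user"},
}
log = [
    LogEntry(datetime(2018, 4, 20), "anonymise", "user", 3),
    LogEntry(datetime(2018, 5, 10), "anonymise", "user", 1),
    LogEntry(datetime(2018, 5, 12), "delete", "user", 2),
]
applied = rerun(log, records, backup_taken)
```

Because the log holds only identifiers and actions, it can safely outlive the data it refers to, and replaying an entry twice leaves the records in the same state.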
The GDPR also discourages keeping unnecessary copies of data, which can present an issue during development; we sometimes find it useful to run tests against copies of a production database, either to debug production issues or just to see how changes work with real-world data. In these situations, to avoid holding unnecessary copies of PII, we can use the management command manage.py anonymise_db to scrub all PII.