Take, for instance, the public sector, which is becoming extensively connected. This could be because numerous government bodies share citizen data for citizens’ convenience, analytics and cost control.
To further complicate matters, GDPR compliance is creating huge challenges in these connected environments. For example, if PID (Patient Identifiable Data) is shared across departments or organisations without any governance, there is an obvious risk that nobody has a common view of the controls, which leads to data leakage.
Even if controls over data leaks existed, there could be a risk of the data being inadvertently processed in a way that invalidates the original consent. GDPR changes the consent rules, so that consent must now be explicitly acquired for a specific processing purpose.
I’d like to share some of my thinking here around the specific challenges of the ‘Right to Erasure.’ According to the ICO, many things can trigger a data subject’s request to have their data erased and, conversely, there are many reasons to deny that request. I’ve been thinking about the technical challenges on the assumption that the request is valid, and no valid reason exists not to comply.
Maintaining Referential Integrity
Is there a risk that data which has been shared across two organisations using a linked identifier could become an issue when one of those organisations has to delete their records? What if that organisation was the ‘master’ record for the other organisation to receive updates from?
Pseudonymisation is possibly part of a good technical response to this scenario. Here, the master record stores a unique pseudonymisation key for each data subscriber, and the subscriber holds that key instead of the real identifier. So, when a subscriber has to delete their local record, their pseudonymisation key is deleted too.
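To make the idea concrete, here is a minimal sketch in Python. It assumes a single master record that hands each subscriber its own random key. The `MasterRecord` class, its method names and the subscriber name are all hypothetical, invented purely for illustration, and are not taken from any real system.

```python
import secrets
from typing import Optional


class MasterRecord:
    """Hypothetical master record issuing one pseudonymisation key per subscriber."""

    def __init__(self, subject_id: str, data: dict):
        self.subject_id = subject_id  # real identifier, never shared with subscribers
        self.data = data
        self._keys = {}  # subscriber name -> pseudonymisation key

    def issue_key(self, subscriber: str) -> str:
        """Create a random key that the subscriber stores instead of the real identifier."""
        key = secrets.token_hex(16)
        self._keys[subscriber] = key
        return key

    def resolve(self, subscriber: str, key: str) -> Optional[dict]:
        """Return the data only while the key is still registered for that subscriber."""
        if self._keys.get(subscriber) == key:
            return self.data
        return None

    def revoke_key(self, subscriber: str) -> None:
        """Called when the subscriber erases its local record: the link is deleted too."""
        self._keys.pop(subscriber, None)


master = MasterRecord("subject-001", {"name": "Jane Doe"})
key = master.issue_key("department_b")
before = master.resolve("department_b", key)  # link works while the key exists
master.revoke_key("department_b")
after = master.resolve("department_b", key)   # link is gone once the key is deleted
```

Because each subscriber has its own key, revoking one link does not disturb the master record or any other subscriber. Whether that is enough to satisfy an erasure request is a question for your own compliance assessment.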
My overall point here is that there’s a lot of thought that must be put into tracing where data has come from and where it is going in such an environment. This will help with:
- Planning how to delete data
- Demonstrating that the request has genuinely been complied with
- Avoiding unintentional separation from other organisations and their environments
Logical delete vs. Physical delete
I have implemented solutions in the past for extremely large data stores that interconnected with many other systems and organisational functions. These sidestepped the technical problems of simply removing records by using the logical deletion concept.
In its simplest form, you leave the linked identifier/key in the system, but overwrite the data associated with that key with blank data (or data representing ‘blank’). This approach is fine if you’re dealing with more traditional data stores such as file systems and databases.
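As a rough illustration of logical deletion, the sketch below uses Python’s built-in `sqlite3` module with an in-memory database. The `citizen` table, its columns and the `erased` flag are assumptions made up for this example. Backups, logs and downstream copies are outside its scope.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE citizen ("
    "link_id TEXT PRIMARY KEY, name TEXT, address TEXT, erased INTEGER DEFAULT 0)"
)
conn.execute(
    "INSERT INTO citizen (link_id, name, address) VALUES (?, ?, ?)",
    ("abc-1", "Jane Doe", "1 High Street"),
)


def logical_delete(connection, link_id):
    """Keep the key so links from other systems still resolve, but blank the data."""
    connection.execute(
        "UPDATE citizen SET name = '', address = '', erased = 1 WHERE link_id = ?",
        (link_id,),
    )
    connection.commit()


logical_delete(conn, "abc-1")
row = conn.execute(
    "SELECT link_id, name, address, erased FROM citizen WHERE link_id = ?",
    ("abc-1",),
).fetchone()
```

After the call, the row for `abc-1` still exists, so referential integrity with other systems is preserved, but the personal fields are empty and the record is flagged as erased.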
Blockchain to the Rescue
At Mastek we’ve been studying how blockchain concepts can help with the ‘know your customer’ use cases. And we’re not alone in this thinking. Our study is based on multiple data subscribers to a distributed record. This way, a customer only has to maintain their customer details in one place and all the third parties whom they interact with can just read it from the common record.
However, there’s no easy solution to deleting data from a chain. In fact, resistance to deletion is one of the points of using techniques such as blockchains to store distributed data using consensus principles. Many ideas have been put forward around how deletion can be achieved. However, ultimately all of them reduce the functionality, or defeat the point, of the blockchain.
It’s no good creating a new ‘blank record,’ as the history is still there and the last non-blank record can be retrieved from it. Another idea that comes to mind is to use the blockchain as a history of links to other systems where the data is stored. In such scenarios, you’ve got to ask yourself what the point of the blockchain was in the first place.
You could also argue that storing a history of a citizen’s data in a blockchain is a bad idea, as the intent of the chain is to be immutable and GDPR requires the opposite: you must be able to remove data.
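The ‘blank record’ problem can be shown with a toy hash-linked chain. This is a simplified sketch with no consensus or networking. The `ToyChain` class and its methods are hypothetical and not modelled on any particular blockchain platform.

```python
import hashlib
import json


def block_hash(index, prev_hash, payload):
    """Hash a block's contents together with the previous block's hash."""
    body = json.dumps(
        {"index": index, "prev": prev_hash, "payload": payload}, sort_keys=True
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


class ToyChain:
    """Hypothetical append-only chain: blocks are added, never removed."""

    def __init__(self):
        self.blocks = []

    def append(self, payload):
        prev_hash = self.blocks[-1]["hash"] if self.blocks else "0" * 64
        index = len(self.blocks)
        self.blocks.append({
            "index": index,
            "prev": prev_hash,
            "payload": payload,
            "hash": block_hash(index, prev_hash, payload),
        })

    def last_non_blank(self, subject):
        """Walk the history backwards to the most recent non-empty record."""
        for block in reversed(self.blocks):
            payload = block["payload"]
            if payload["subject"] == subject and payload["data"]:
                return payload["data"]
        return None

    def is_valid(self):
        """Check that every block still matches its stored hash and link."""
        prev_hash = "0" * 64
        for block in self.blocks:
            expected = block_hash(block["index"], block["prev"], block["payload"])
            if block["prev"] != prev_hash or block["hash"] != expected:
                return False
            prev_hash = block["hash"]
        return True


chain = ToyChain()
chain.append({"subject": "citizen-1", "data": {"name": "Jane Doe"}})
chain.append({"subject": "citizen-1", "data": {}})  # the 'blank record'
recovered = chain.last_non_blank("citizen-1")       # the old data is still readable
```

Here the blank record only changes the latest state; the earlier block still holds the personal data. Overwriting that earlier block in place would make `is_valid()` return `False`, which is exactly the tamper-evidence the chain exists to provide.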
Do you have any other suggestions that can help with overcoming issues in extensively connected environments? Let’s connect through the feedback section below or at [email protected]