Preserving Data Privacy (Introduction)
By Sowmya Ganapathi Krishnan and Siddharth Shah
Data privacy. It is a term we keep hearing in everyday media, and new regulations keep coming through to make sure that it is protected. In this blog, let’s explore what exactly data privacy is and how to ensure it’s handled well. Off we go!
You go online to book a flight (hopefully post-COVID!). You fill in your details on the website, starting with your name, address, email, phone number, passport details and so on. You would naturally assume that these details are being kept safe and away from malicious hands. This is the first and foremost aspect of data privacy: personal data privacy.
Similarly, protecting an organisation’s financial data, intellectual property and any other proprietary data is another facet of privacy: organisational data privacy.
Regulations like GDPR, PDPA, HIPAA and several others exist to make sure that data privacy is protected, safeguarding the confidentiality and integrity of the data. At the end of the day, the primary goal of data privacy is to ensure that sensitive data doesn’t get misused and identities remain protected.
Why is it important?
Every year, the number of data breaches is increasing at an alarming rate. Between 2019 and 2020, the world saw a whopping 273% increase in the number of exposed personal data records. An unprecedented breach came to light just a few months back, in April 2021, when the personal data of 533 million Facebook users from 106 countries was posted online for free in a low-level hacking forum. The data includes users’ phone numbers, full names, locations, email addresses and biographical information.
This breach cost Facebook over a billion dollars. That aside, imagine what kind of misuse might be happening with this information at this point!
Even if a dataset is anonymised, combining it with other information can still expose personally identifiable information. A well-known example is Latanya Sweeney’s research at Carnegie Mellon University, which estimated that 87 percent of the U.S. population could be uniquely identified using only their ZIP code, gender and date of birth!
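To make the linking risk concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the records are made up, and the helper names (`quasi_id`, `unique_fraction`, `reidentify`) are hypothetical. It joins an "anonymised" table to a public list on ZIP code, gender and date of birth, and reports whoever matches exactly one row.

```python
from collections import Counter

# Hypothetical, made-up records for illustration only.
# "Anonymised" release: names removed, but quasi-identifiers kept.
anonymised = [
    {"zip": "10001", "gender": "F", "dob": "1980-02-14", "diagnosis": "diabetes"},
    {"zip": "10001", "gender": "M", "dob": "1975-09-30", "diagnosis": "asthma"},
    {"zip": "94105", "gender": "F", "dob": "1990-06-01", "diagnosis": "migraine"},
    {"zip": "94105", "gender": "F", "dob": "1990-06-01", "diagnosis": "fracture"},
]

# A separate public list (say, a membership roll) that does carry names.
public = [
    {"name": "Alice Example", "zip": "10001", "gender": "F", "dob": "1980-02-14"},
    {"name": "Bob Example", "zip": "10001", "gender": "M", "dob": "1975-09-30"},
    {"name": "Carol Example", "zip": "94105", "gender": "F", "dob": "1990-06-01"},
]


def quasi_id(record):
    """The combination of fields that can act as a fingerprint."""
    return (record["zip"], record["gender"], record["dob"])


def unique_fraction(records):
    """Fraction of records whose quasi-identifier appears exactly once."""
    counts = Counter(quasi_id(r) for r in records)
    unique = sum(1 for r in records if counts[quasi_id(r)] == 1)
    return unique / len(records)


def reidentify(anon_records, public_records):
    """Link names to diagnoses wherever the match is unambiguous."""
    by_key = {}
    for r in anon_records:
        by_key.setdefault(quasi_id(r), []).append(r)
    matches = {}
    for person in public_records:
        candidates = by_key.get(quasi_id(person), [])
        if len(candidates) == 1:  # exactly one row fits this person
            matches[person["name"]] = candidates[0]["diagnosis"]
    return matches


print(unique_fraction(anonymised))
print(reidentify(anonymised, public))
```

In this toy data, half of the "anonymised" rows are unique on the three fields, and those are exactly the rows that get a name attached. The third person stays ambiguous only because two rows share her fingerprint.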
Should organisations stop collecting personal data altogether, then?
A lot of valuable transactions happen with personal data: hospital registration, census statistics, personalised recommendations, finding cures for diseases and more. Consequently, it becomes necessary to collect data to serve people better.
So how can we make sure that we’re still able to derive value out of data, without compromising on privacy?
Listed below are data computation techniques that preserve privacy while helping with meaningful processing of data.
Privacy Preserving Data Computation
There are several techniques in this space. In this blog series, we’re looking to cover two of the more prominent ones, namely Homomorphic Encryption and Differential Privacy.
- Homomorphic Encryption (HME): A class of encryption methods that allow computations to be performed directly on encrypted data, without decrypting it first.
- Differential Privacy (DP): The idea behind differential privacy is that if the effect of making an arbitrary single substitution in the database is small enough, the query result cannot be used to infer much about any single individual, and therefore provides privacy.
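As a small preview of both ideas, here is a rough Python sketch under stated assumptions. For the encryption side it uses a toy version of the Paillier scheme, one well-known additively homomorphic scheme that we picked for illustration, with tiny hard-coded primes that are nowhere near secure. For differential privacy it uses the Laplace mechanism on a counting query, which is one common mechanism among several. All function names are hypothetical.

```python
import math
import random


# --- Toy Paillier: adding numbers while they stay encrypted ---

def paillier_keygen(p=1789, q=1861):
    """Build a key pair from two small primes (illustration only, NOT secure)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                  # standard simple choice of generator
    mu = pow(lam, -1, n)       # modular inverse (Python 3.8+)
    return (n, g), (lam, mu)


def encrypt(pub, m):
    n, g = pub
    n2 = n * n
    while True:                # pick random r that is coprime to n
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(pub, priv, c):
    n, _ = pub
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(pub, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    n, _ = pub
    return (c1 * c2) % (n * n)


# --- Laplace mechanism: a noisy count ---

def dp_count(records, predicate, epsilon):
    """Count matching records, then add Laplace noise of scale 1/epsilon.

    Substituting one record changes a count by at most 1, so the
    sensitivity is 1. Smaller epsilon means more noise and more privacy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for r in records if predicate(r))
    scale = 1.0 / epsilon
    # The difference of two exponential samples is Laplace-distributed.
    noise = random.expovariate(1 / scale) - random.expovariate(1 / scale)
    return true_count + noise


pub, priv = paillier_keygen()
total = add_encrypted(pub, encrypt(pub, 20), encrypt(pub, 22))
print(decrypt(pub, priv, total))   # 42, computed without decrypting the inputs

ages = [23, 35, 41, 29, 52, 38, 61, 33]   # made-up data
print(dp_count(ages, lambda a: a > 30, epsilon=0.5))   # a noisy count
```

The party doing the addition never sees 20 or 22, only ciphertexts. The noisy count, in turn, changes from run to run, so no single person’s presence can be read off the answer with confidence.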
Don’t feel intimidated by the sophisticated names! We intend to untangle these interesting techniques for you. Check out HME and DP in the upcoming parts of this blog series.