Data privacy is the beast that everyone is trying to wrestle right now. One of the most talked-about and controversial recent incidents involving the misuse of user data comes from Facebook. The social network reported 2.2 billion monthly active users as of its fourth quarter of 2017, and as many as 87 million of them had their detailed personal information harvested by voter-profiling company Cambridge Analytica. Though the exact number of users whose information was shared with this third party is currently unknown, and will probably remain so, the need for a better method of managing and protecting user data privacy is indisputable.
But when the data in question are in the billions of records, how is it even possible to do so? In Part I of this blog series, we’ll examine this issue and the extent of the implications; we’ll review the solution in Part II.
Data privacy in itself isn’t particularly difficult to achieve; it’s simply a matter of managing users’ personal information and the entities that have access to it. It’s probably safe to bet that the Facebook of 2010 could have done this easily, considering its user base at that time was a fraction of its current size. But when companies like Facebook are managing hundreds of millions or even billions of data records, user information is at risk of being harvested and used improperly. It becomes nearly impossible to keep track of every third party that is accessing the data, let alone apply protection criteria.
The dilemma can be broken down simply if you think of these online applications as villages. Just a couple of decades ago, all of these applications would have been considered small villages. Governance is pretty easy in villages; the villagers know each other, where each other’s houses are, who their family members are, what they do for a living, what they like to do as a hobby, etc. In general, the villagers themselves don’t abuse this information, and they are able to keep track of visitors and who inquired about whom; if someone does abuse or misuse the information, it’s very easy to find out who the culprit is.
For this reason, when Facebook was still a small village (relatively speaking), it was a much easier task to track the various entities and third parties that had access to user data. However, Facebook has rapidly grown into a global metropolis that has incorporated every small village around it. In this global city, we know only a very small percentage of the population. Surrounded by strangers, we don’t know who is accessing and abusing our personal data, and we are left completely vulnerable to a mass of third parties that is simply too large to track until it’s too late.
Not only has the size of the villages increased, but the number and types of environments in which they exist have expanded exponentially. Data that used to reside only on social media networks or a small class of websites is now being captured and shared by virtually every website you visit. On top of that, the advent of the Internet of Things (IoT) has brought ordinary items into the online realm (think: watches and fitness trackers), and these devices also collect and share personal data like your eating habits and GPS locations.
A step above IoT is biometric data. This is any data that is collected during a biometric process, including samples, fingerprints, retina scans, facial recognition, similarity scores, etc. Biometrics are now used for simple tasks like unlocking a smartphone, opening a banking app, using Bluetooth functions in a car, and more.
A newer use of biometric data, and one that I personally experienced recently, is airport security and boarding. I was preparing to board a flight out of Heathrow Airport in London when I was required to provide my biometric data in the form of a facial scan in lieu of my boarding pass. There was no alternative option, and passengers weren’t informed of this until boarding began, leaving little to no opportunity to ask questions about how the data would be used.
After this experience, I figured I couldn’t be the only person concerned about this issue, so I decided to post a question in an online discussion group to survey others’ opinions. To my surprise, only two people out of the 48 in the group responded, and neither of them expressed any notable concern. The lack of response leads me to believe that most of the population doesn’t really care about how their data is being used and is therefore willing to provide their personal information (including their biometric data) to any and all interested parties. Perhaps this group of people subscribes to the common misconception that if you don’t have anything to hide, you don’t need to worry about data privacy. Or perhaps this lack of interest implies that the general public simply isn’t aware of how their personal data is being used, which is even more detrimental than the former. How can we demand privacy if we don’t even understand what is going on?
The need for data protection applies to everyone. Private organizations are as responsible as public institutions for being transparent with users about what they use personal information for and why, with whom it is shared, and how it is used by the parties involved.
As the growth rate of data continues to soar, it’s up to us to find a way to make sure that our data is not used improperly and that our personal information remains protected. Stay tuned for Part II of this series where we will outline a step-by-step process of how data governance can help organizations solve their data privacy, transparency, and protection challenges.
Ram is responsible for fast-tracking Collibra engagements and centralizing data governance thought leadership to influence product features.