Imagine if Facebook, Twitter, and LinkedIn provided full-blown access to their data pipelines. How would you feel if a stranger saw your social media footprint, including your private photos? Maybe you would argue you’re not a social media person… but what if it was your banking transactions, medical history, or criminal record that was shared? Oh, wait! This is getting a bit complicated and sounds like a case of privacy violation, i.e., data usage without proper consent from the data owner. How can we stop this from happening? The only thing I can think of is a formalized process around data sharing. Let’s call it a “data sharing agreement” that helps data consumers and owners manage data responsibly.
Thou shalt not worry, go on and read. Thou shalt find answers to the unanswered.
Many organizations, such as banks, hospitals, and government agencies, store data about their clients in multiple systems and applications. From there, the data lifecycle includes transformation, consolidation, standardization, and reporting, but in most cases no one really knows:
- Where is the data actually stored?
- Which systems and applications are using the data?
- Who has access to what data?
- If I want to make changes to the data, who needs to be involved?
These questions often go unanswered, which hurts analytics, regulatory reporting, and many other important aspects of decision-making within an organization.
Organizations need to manage their data effectively as an enterprise asset and get everybody to reach consensus when dealing with data. It’s critical for an organization to serve its data citizens (business functions, departments, etc.) by providing the right data for the right purposes in a timely manner, and so achieve operational excellence. One way to get there is to formalize the process of data sharing and usage. That means creating a platform where data consumers and data owners can keep track of data sets, data ownership, the policies and standards describing what restrictions apply to the data, and the history of who has access to what data and for what purposes. I’ll throw a time factor into the mix so that when users request data, the access comes with an expiration date. This way I can automatically notify the system administrators to revoke data access when a data sharing agreement expires.
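To make the expiration idea a bit more concrete, here is a minimal Python sketch of what one agreement record and the expiry check might look like. Everything in it is an assumption for illustration: the `DataSharingAgreement` class, its field names, the `revocation_notices` helper, and the sample data sets are all hypothetical, and a real platform would send the notices to administrators rather than print them.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DataSharingAgreement:
    """One consumer's approved access to one data set (hypothetical fields)."""
    dataset: str
    owner: str
    consumer: str
    purpose: str
    expires_on: date  # last day on which access is still valid

    def is_expired(self, today: date) -> bool:
        # Access remains valid through the expiration date itself.
        return today > self.expires_on


def revocation_notices(agreements, today):
    """Build one message per expired agreement for the system administrators."""
    return [
        f"Revoke {a.consumer}'s access to {a.dataset} "
        f"(owner: {a.owner}, expired {a.expires_on.isoformat()})"
        for a in agreements
        if a.is_expired(today)
    ]


# Made-up sample agreements, purely for illustration.
agreements = [
    DataSharingAgreement("customer_transactions", "Retail Banking",
                         "Marketing Analytics", "campaign reporting",
                         date(2024, 3, 31)),
    DataSharingAgreement("patient_visits", "Clinical Records",
                         "Research", "readmission study",
                         date(2024, 12, 31)),
]

for notice in revocation_notices(agreements, today=date(2024, 6, 1)):
    print(notice)
```

Because each record keeps the owner, the consumer, and the purpose next to the expiration date, the same list also answers "who has access to what data, and why" at any point in time.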
In this way, data consumers can use data with the owners’ consent. An added benefit is accountability, which is directly tied to working groups, data management individuals, and cultural change (which, by the way, is one of the bottlenecks to establishing effective data governance programs). In essence, you’re linking the people, process, and technology aspects to build a formalized data usage process, making sure the right people get access to the right data at the right time. Do you agree? Or should I ask: what challenges do you see in making this happen at your organization?