What Data Governance is Not


Before you turn your head and say “boring!”, stop, take a deep breath and read on. It may save your company millions.

Data is a core strategic asset of any organization and should be governed like any other asset. To give you a very book-ish definition, “data governance” is a control that ensures that data entered by an operations team member or by an automated process meets precise standards, such as a business rule, a data definition and the data integrity constraints in the data model.

Wow, that is a lot of words. When I give this definition to friends who ask me about my job, they get confused. So I had to think hard about how I could explain data governance to my six-year-old nephew.

Explaining to a 6-year-old (sort-of)
To put it in simple six-year-old language: data governance is what you do when you need data that is trusted, easily available, usable, integrated and secure.

If you find yourself asking “Where is this data coming from?”, “I wonder if the loss of this data is costing us money?”, “Who in the organization do I go to about this data?” or “Can we fix this data so that it has the correct calculations and definitions?”, then you need data governance.
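For readers who think in code, the book-ish definition above can be sketched as a handful of checks applied to a record at entry time. This is only a minimal illustration under assumptions: the customer record, its field names, the list of country codes and the credit-limit rule are all hypothetical and stand in for whatever data definition, integrity constraint and business rule your organization agrees on.

```python
# Hypothetical data definition: field name -> expected type.
CUSTOMER_DEFINITION = {"customer_id": int, "country": str, "credit_limit": float}

# Hypothetical integrity constraint: country codes known to the data model.
KNOWN_COUNTRIES = {"BE", "US", "IN"}


def validate_customer(record: dict) -> list[str]:
    """Return a list of violations; an empty list means the record passes."""
    violations = []

    # Data definition: every field must be present and have the agreed type.
    for field, expected_type in CUSTOMER_DEFINITION.items():
        if field not in record:
            violations.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            violations.append(f"wrong type for {field}")
    if violations:
        # Later checks assume well-formed fields, so stop here.
        return violations

    # Integrity constraint: the country must exist in the reference set.
    if record["country"] not in KNOWN_COUNTRIES:
        violations.append("unknown country code")

    # Business rule (assumed for illustration): no negative credit limits.
    if record["credit_limit"] < 0:
        violations.append("credit_limit must not be negative")

    return violations
```

The same function can sit behind a manual entry form or an automated feed, which is the point of the definition: the standard is applied regardless of who or what enters the data.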

What Data Governance is Not
The funny thing is, vendors actually began believing their own marketing and now think of their MDM, data quality, data security and data lifecycle management products as data governance tools and solutions. At some point, big data vendors skipped over their data management offerings altogether and simply started calling their catalog, security and lineage capabilities data governance. The catch is that you cannot take the outputs of technologies that are limited, painful to implement and hard for a business user to understand, and shove them under the “governance” carpet!

Data governance is not purely about data privacy; it is about implementing the processes around concepts such as data privacy and security, inter-departmental data exchange agreements and so on.

  1. Data governance is definitely not about storing data in a central repository. It frustrates me when even major corporations try to bring the two under the same definition. Storing data in a central repository or data lake is not governance. However, the controls around accessing and processing the critical data in that repository, the data that makes sense to your organization and brings value, are part of data governance.
  2. Data governance is not a siloed initiative: it is about bringing the IT and business teams into the same room to talk about the initiative. You may face some challenges with IT middle management while implementing a data governance program. However, once they see the value such a program can bring to the organization, and that they can in fact be the heroes of its practical implementation, everyone is a winner!
  3. Data governance is not just about clarifying who owns the data, but rather about ownership that controls data value.
  4. Data governance is not purely metadata management; rather, it uses metadata management to impose management and discipline on the collection and control of data.
  5. Data governance is not a pure IT project (“project” is the wrong word to tie to governance, as a project has a beginning and an end), but rather a business imperative: think of it as a business case which demonstrates how leveraging data enables the organization, its people, processes and data assets, and merges this with the existing technology landscape for greater business value.
  6. Data governance is not a one-time project: it is a continuous, iterative process that organizations with data as their main asset should implement one step at a time and improve continuously based on a feedback loop. In fact, most organizations that successfully introduce data governance implement their programs incrementally.
  7. Data governance is not something that should be sold to management as “people-behavior governance”. Technology may tell people what to do, but the data does what you tell it to. Even though it is process governance in a way, it should be sold as “formalizing processes”, and only then adding people’s workload to it.

Too often analysts spend their time aggregating data from different sources, trying to reconcile duplicated or incomplete data sets, or driving manual processes. Nowadays, data governance tools can help you automate part of your data governance initiative. Now that you know some of what data governance is not, there are many best practices on the web that outline what it actually is and how to implement a program. Do check them out, and if you need to implement data governance in your organization, look at the market and pick best-of-breed.

