What is data governance? I’m sure we can all agree it is at least concerned with finding common definitions and at most with enterprise governance. Yet a quick search reveals that a wide range of definitions for data governance exists. Extremely valuable resources such as the Data Management Body of Knowledge (DMBOK) and various maturity models have empowered data management professionals to shape data governance strategically and operationally in complex organizations. Yet if we want all employees to be aware of and eventually participate in data governance, then we need to translate this expert knowledge into something much simpler. I do not claim that I will provide all the answers in this post, but I will at least attempt to spark a new debate on data democratization.
Five Misconceptions of Data Governance
As a scientist, my first attempt at defining something starts by excluding what it is certainly not or what is misleading about the definition. And what I’ve found is that many people misinterpret data governance. Last week, I attended the Association for Institutional Research (AIR) Forum and observed (again) that data governance is commonly misunderstood. The Institutional Research community is traditionally not exposed to the topic of data governance as frequently as the financial services or healthcare industries. Yet, it is a true ‘business’ community with members of academic affairs and institutions. These are data citizens who are dealing with the same problems we are.
Let’s look at five misconceptions of data governance that I often see, and why they are problematic.
- Data governance is a published repository of common definitions. This is an incomplete definition of data governance. Of course, a common glossary is a foundational component of many data governance initiatives. However, a repository is only trustworthy if a meaningful and transparent process and responsive ownership are in place to maintain it. Trust is an essential value to achieve democratic data governance.
- Data governance is a concern of – and hence managed by – IT. This definition excludes the business side of data governance. Indeed, IT plays a crucial role in identifying the underlying authoritative sources and verifying their lineage. Yet the business, as a consumer, has an indispensable role in certifying the business context of the data assets you manage.
- Data governance is just data quality (DQ) and master data management (MDM). It’s true that data quality and MDM are data management activities that have to be governed. Yet DQ and MDM are about finding a mathematical truth for data in terms of quantifiable dimensions such as accuracy and completeness. Data governance goes beyond DQ and MDM by building trust in data that only human beings can qualify. Again, trust comes into the picture as an essential value in democratic data governance.
- Data governance is siloed by business function. Your organization may be extremely decentralized and geographically distributed. Yet that doesn’t mean you can’t establish a coordinated approach to data governance among autonomous sub-organizations. Many organizations that are decentralized and geographically distributed such as universities and global banks have successfully implemented a shared platform. Moreover, organizations can gain a competitive advantage by having a broader perspective on the business as a result of global data governance.
- Data governance provides no value or participation for the data-consuming community. This definition is clearly wrong. Self-service BI tools empower more and more consumers to also produce data and reports for their own applications. Data governance policies help define how confidential data can be used and how to ensure data security and quality. If trust is an essential value in the holistic governance of data, then it should be grounded in transparency and equal participation for all data citizens, which necessarily includes the consumers of the data. Together, they are your sentinels, who can identify data issues at a level of granularity that traditional monitoring cannot.
Collibra aims to achieve these core values of transparency, participation, and trust by building a non-intrusive user experience in its platform applications, specifically through a data helpdesk. In addition, through Collibra University’s learning and certification program, users gain the confidence to say, ‘We can definitely do this!’
Data Governance is About the People
We’ve looked at what data governance is not, but what, then, is data governance? Data governance is about people. It is about reducing a steep learning curve so you can answer the right data questions, and be appropriately involved in the process of clarifying the answers. Data governance enables all data citizens to have a holistic lens on their ever-expanding data universe.
Again, the scientist in me sees two complementary approaches which I believe can help you draft your data questions: ‘understand and explain’ and ‘monitor and predict’.
The first approach will provide an organizational understanding and set of explanations for your data universe in terms of:
- Scope: Define what data is relevant and what is not. Scoping restricts your approach, so it is important to broaden the scope iteratively as you create buy-in. E.g., critical data elements identification is a common scoping mechanism in financial organizations.
- Commonalities and differences: Identify definitions for your assets such as policies, rules, business terms, reports, data elements, and systems. Initially, people will provide varying definitions for commonly used assets, regardless of the scope. It is important to start with processes to converge and approve definitions for these assets, or to articulate relationships between them if they turn out to be conceptually different: e.g., ‘customer’ could be the name for a business term as well as a database table.
- Business traceability: Deploy processes to establish traceability for every critical asset’s relationship with the business process, overall data quality, and involved critical data elements.
- Data lineage: Identify authoritative data sets, warehouses, and lakes, and articulate the transformations between them.
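To make the data lineage point a little more concrete, here is a minimal sketch of how lineage could be modelled as a simple graph and traced back to its sources. Everything in it is an assumption for illustration: the data set names, the `LINEAGE` mapping, and both helper functions are hypothetical and do not describe how any particular platform stores lineage.

```python
from collections import deque

# Hypothetical lineage: each data set maps to the upstream sets it is derived from.
LINEAGE = {
    "quarterly_report": ["sales_mart"],
    "sales_mart": ["warehouse.orders", "warehouse.customers"],
    "warehouse.orders": ["crm_export"],
    "warehouse.customers": ["crm_export"],
    "crm_export": [],
}

# Hypothetical set of sources the organization has declared authoritative.
AUTHORITATIVE = {"crm_export"}


def upstream_sources(asset, lineage):
    """Return every data set that `asset` depends on, directly or indirectly."""
    seen = set()
    queue = deque(lineage.get(asset, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        queue.extend(lineage.get(current, []))
    return seen


def traces_to_authoritative(asset, lineage, authoritative):
    """True if every root (a set with no known upstream) is authoritative."""
    candidates = upstream_sources(asset, lineage) | {asset}
    roots = {a for a in candidates if not lineage.get(a)}
    return roots <= authoritative
```

A report that also pulls from an undeclared source, say a manually maintained spreadsheet, would fail the `traces_to_authoritative` check. That is exactly the kind of gap a lineage exercise is meant to surface.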
The second approach is to monitor the processes you put in place to find the above explanations, and to predict their future impact, across a wide variety of applications such as regulatory compliance, report certification and watermarking, helpdesk and issue management, data access and usage agreements, and many more.
A Platform for the People
In order to realize these ambitions, the data community needs a collaboration platform made up of three tiers:
- A platform that functions as an operating system. It defines the scope (what), who is involved, and how they are involved in terms of asset types, roles, and workflows. Depending on the applications required, the platform can be regarded as a sophisticated event processor that triggers a radar network of workflows, prompting the right steward actions and system manipulations at the right time.
- Steward applications that sit on top of the platform. These applications have a designated user experience for the most effective monitoring and predicting of the above-mentioned applications.
- Aligned metadata. All these applications should produce aligned metadata that taps – through the platform’s API – into the various data management applications ranging from data quality profiling to dashboarding.
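The "event processor" idea in the first tier can be sketched in a few lines. This is only an illustration under my own assumptions: the `Asset` and `Task` classes, the `WORKFLOWS` table, the role names, and the event names are all hypothetical and are not the API of any real platform.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    asset_type: str  # e.g. "business_term" or "report"


@dataclass
class Task:
    asset: str
    role: str
    action: str


# Hypothetical operating model: for a given asset type and event,
# which role is prompted to do what, and in which order.
WORKFLOWS = {
    ("business_term", "definition_proposed"): [
        ("steward", "review definition"),
        ("owner", "approve definition"),
    ],
    ("report", "issue_reported"): [
        ("steward", "triage issue"),
    ],
}


def handle_event(asset, event, workflows=WORKFLOWS):
    """Turn an event on an asset into the tasks the operating model prescribes."""
    steps = workflows.get((asset.asset_type, event), [])
    return [Task(asset.name, role, action) for role, action in steps]


tasks = handle_event(Asset("customer", "business_term"), "definition_proposed")
for task in tasks:
    print(f"{task.role}: {task.action} for '{task.asset}'")
```

The design choice worth noting is that the scope (asset types), the people (roles), and the process (workflow steps) live in data rather than in code, so an organization could adjust its operating model without touching the event handler.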
Collibra checks all the boxes. Organizations across a variety of industries use the Collibra platform for data governance. With our purpose-built platform, you can activate one or more of our best-practice data governance operating models that come out of the box. Our Center of Excellence continuously publishes new models derived from successful industry cases. Collibra University aims to build capacity through this knowledge, which inspires strategic decisions and develops trustworthy practices in the data profession and citizenship. Take a look, for example, at Collibra University’s course offerings on Report Certification and Watermarking or Data Sharing Agreements.
Groundswelling Data Leadership
Ultimately, the Collibra platform bridges the void between technology and data leadership, joining the power centers within the organization. The Chief Data Officer (CDO) is the new leader in this data democracy and answers the call of many data citizens who have realized the way data is being handled needs to change dramatically. Management schools echo this call by not only justifying this new role, but also investigating how different types of CDOs become invaluable to organizations.
It’s clear we are at the forefront of a data revolution. Data consumers have had enough of the lack of transparency and involvement in the very data (and reports) they use to do their jobs. Now that a new call for leadership is surging, the CDO can feed the momentum by providing a truly collaborative platform and changing the way business is done.
Join our data revolution!
Pieter leads the company’s Research & Education group, including Collibra University, an online learning platform for data governance and data science education.