Data governance and data quality have traditionally been treated as separate disciplines. At a recent data governance conference, however, a keynote speaker made a point that resonated with many attendees: we should talk about data governance as a way to achieve data quality.
An organization might start looking for a data quality tool because its analytics and BI projects are hampered by poor data quality. Or a regulation such as CCAR, Solvency II, or BCBS 239 may be pushing the business to get a grip on the correctness, completeness, and accuracy of its data.
However, buying a data quality tool first is like booking surgery before getting a diagnosis. Organizations have many thousands of distinct data elements. Which ones should they focus on? Which ones can be left out of scope? Which ones have the greatest impact on the business and should be managed first?
Two approaches to answering these questions are:
- Critical data elements: identify what is critical for the business. This could be a regulatory report, a cube, a KPI, or anything else that is key.
- Data value: estimate the cost of poor data quality, that is, the risk associated with bad data, and focus first on the areas with the highest risk.
In both cases, once the areas of focus have been identified and prioritized, data governance creates a collaborative framework for defining and managing the policies, business rules, and assets needed to provide the necessary level of data quality control.
Data owners can define the key systems and processes involved. At the same time, the business can state the standards the data should adhere to as it moves through those systems. This is where policies, requirements, and business rules are created and agreed on.
Once we know how data flows through the organization and what the standards are, it is straightforward to ask the data quality team to translate those standards into data quality profiling rules and run them against the data in those systems.
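To make the translation step concrete, here is a minimal sketch of what turning a business standard into executable profiling rules might look like. The standard ("every customer record must have a non-empty email and a 5-digit postal code"), the field names, and the records are all hypothetical, and a real data quality tool would express this in its own rule language.

```python
import re

def profile(records, rules):
    """Run each profiling rule against every record; return the pass rate per rule."""
    results = {}
    for name, check in rules.items():
        passed = sum(1 for record in records if check(record))
        results[name] = passed / len(records) if records else 0.0
    return results

# Business standards expressed as executable rules (illustrative names).
rules = {
    "email_present": lambda r: bool(r.get("email", "").strip()),
    "postal_code_5_digits": lambda r: bool(re.fullmatch(r"\d{5}", r.get("postal_code", ""))),
}

customers = [
    {"email": "a@example.com", "postal_code": "12345"},
    {"email": "", "postal_code": "1234"},
]

print(profile(customers, rules))  # {'email_present': 0.5, 'postal_code_5_digits': 0.5}
```

The pass rates produced here are the raw metrics that a governance platform could later roll up against the agreed thresholds.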
Now, we have pinpointed exactly where we should focus our data quality efforts (rather than trying to profile everything).
The next step is to link the metrics coming from one or more data quality tools to the critical business assets, so business users can see exactly what the status is and where they are lagging.
The screenshot above shows such a data quality dashboard. The red X's indicate that the thresholds were not met for one or more checks.
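The roll-up behind such a dashboard can be sketched in a few lines: an asset shows green only if every check linked to it meets its threshold, and a single failing check produces the red X. The asset names, metrics, and scores below are illustrative, not taken from the screenshot.

```python
# Each entry links one data quality check to a business asset (hypothetical data).
checks = [
    {"asset": "Regulatory Report", "metric": "completeness", "score": 0.98, "threshold": 0.95},
    {"asset": "Regulatory Report", "metric": "accuracy",     "score": 0.91, "threshold": 0.95},
    {"asset": "Customer KPI",      "metric": "completeness", "score": 0.99, "threshold": 0.95},
]

def asset_status(checks):
    """An asset passes only if all of its linked checks meet their thresholds."""
    status = {}
    for c in checks:
        ok = c["score"] >= c["threshold"]
        status[c["asset"]] = status.get(c["asset"], True) and ok
    return status

print(asset_status(checks))  # {'Regulatory Report': False, 'Customer KPI': True}
```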
For data to be used, it needs to be trusted. Having a simple, direct way to identify data errors and see them resolved quickly is essential to maintaining (or restoring) that confidence.
A data helpdesk is a key component of data service management, a high-level maturity state in which all employees of an organization have access to one central place where all data is documented. It enables organizations to handle all data-related issues in an efficient way, using the data governance organization, roles, and responsibilities.
In the scenario depicted in the screenshot, this means that once a threshold is broken, a data issue is automatically generated and all required stakeholders are involved, so that a proper data issue remediation process can start.
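A minimal sketch of that remediation trigger, assuming a registry that maps each asset to its stakeholders: when a check falls below its threshold, an issue is opened and the registered stakeholders are attached as assignees. The asset name, role names, and issue structure are all illustrative.

```python
# Hypothetical stakeholder registry maintained by the governance organization.
stakeholders = {"Regulatory Report": ["data owner", "data steward", "report consumer"]}

def open_issue_if_breached(asset, metric, score, threshold):
    """Return a new data issue when the score breaches the threshold, else None."""
    if score >= threshold:
        return None
    return {
        "asset": asset,
        "summary": f"{metric} {score:.0%} is below threshold {threshold:.0%}",
        "assignees": stakeholders.get(asset, []),
        "status": "open",
    }

issue = open_issue_if_breached("Regulatory Report", "accuracy", 0.91, 0.95)
print(issue["summary"])  # accuracy 91% is below threshold 95%
```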
What are you doing to create a culture of confidence in your organization?
Laura is a sales engineer who helps organizations instill trust in their data. Prior to Collibra, she worked in presales at Denodo, where she specialized in enterprise data integration.