The Struggle to Attain Reference Data Commonality

When it comes to data governance and data management, commonly used reference data is of key importance. So what exactly is reference data? For the financial services industry, this could mean standard country codes, currency codes, numeric codes, product codes, etc. For the healthcare and pharmaceutical industry, this could mean codes and code sets related to ICD-9, ICD-10, specialty, medication, and more. In general, think of reference data as a set or list of industry-standard code values used across an organization for multiple purposes, mainly reporting and analytics. Reference data is typically managed jointly by the business and IT to ensure that it consistently and accurately represents the core business entities of the organization, including customers, products, citizens, suppliers, sites, hierarchies, charts of accounts, and more. Proper management of reference data has many benefits, some of which I have highlighted on my Twitter feed:

[Image: Reference Data Twitter Post]

Traditionally, reference data is siloed: it is spread across multiple systems and applications, with no single holistic view showing the connections between core business entities and their different reference data representations. This leads to inconsistencies, and users find it difficult to understand which codes fit which purposes and which systems use which codes or code sets. It also requires a lot of manual hard coding, i.e., linking code values to valid business entities. Those hand-built links end up in important reports, which leaves the reports error-prone and can lead to poor decision-making. To resolve this problem, organizations often start out by implementing a master data management (MDM) solution. The idea is to rationalize reference data from multiple systems and create a single source of truth. Though MDM solutions are good at pulling all reference data into one place, they fail to provide the control and collaboration needed to manage changes. Those changes might come from different source systems, applications, or manual intervention.
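To make the inconsistency problem concrete, here is a minimal Python sketch that compares one code set as three systems might hold it. The system names, codes, and labels are invented for illustration and do not come from any real deployment or product; the point is only to show how quickly siloed copies of the "same" country codes can drift apart.

```python
from collections import defaultdict

# Hypothetical country-code sets as three siloed systems might store them.
SYSTEM_CODES = {
    "crm": {"US": "United States", "GB": "United Kingdom", "DE": "Germany"},
    "billing": {"US": "United States", "UK": "United Kingdom", "DE": "Germany"},
    "reporting": {"US": "USA", "GB": "United Kingdom"},
}


def find_inconsistencies(system_codes):
    """Return (missing, conflicting) for a code set held by several systems.

    missing:     code -> sorted list of systems that do not know the code
    conflicting: code -> {system: label} where systems disagree on the label
    """
    # Invert the data: for every code, record the label each system uses.
    labels_by_code = defaultdict(dict)
    for system, codes in system_codes.items():
        for code, label in codes.items():
            labels_by_code[code][system] = label

    all_systems = set(system_codes)
    missing = {}
    conflicting = {}
    for code, labels in labels_by_code.items():
        absent = all_systems - set(labels)
        if absent:
            missing[code] = sorted(absent)
        if len(set(labels.values())) > 1:
            conflicting[code] = dict(labels)
    return missing, conflicting


missing, conflicting = find_inconsistencies(SYSTEM_CODES)
print("Codes missing from some systems:", missing)
print("Codes with conflicting labels:", conflicting)
```

Even in this toy data set, one system uses "UK" where the others use "GB", and one labels "US" differently. A report that joins these systems has to hard code around both issues, which is exactly the manual effort described above.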

Once bitten, twice shy …

So, do we start out with MDM first? NO! Why? Because MDM can be problematic without proper ownership and control. Changes made by users, systems, and applications are hard to track, and it can take weeks of work just to figure out what changes were made to what data in what systems. Honestly, it's a never-ending cycle. Having learned hard lessons, many organizations are now realizing that data governance is the Holy Grail for making MDM a reality. I will go a bit further, and you can quote (or tweet!) me later: "there is no MDM without data governance." Let me explain.

Data governance provides a formalized process to define stewardship and control around the data, which complements an MDM solution extremely well. In this way, organizations can establish a control point around reference data: any change to code values or code sets goes through a structured process in which the right people are involved when needed. This can be done in an application like Collibra, a leading, purpose-built data governance platform. For many organizations, Collibra is an authoritative source for proposing codes and code sets, managing changes through approval workflows, and involving other users to collaborate, vote, or comment when required. This creates trust and, importantly, a sense of ownership, which leads to the adoption of best practice, i.e., using data that is trustworthy and approved by key stakeholders. Hence, an end to the struggle of attaining reference data commonality.
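The control point described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how Collibra or any other product implements its workflows: the `ChangeRequest` and `CodeSetRegistry` classes, the steward names, and the rule that every steward of a code set must approve are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeRequest:
    """A proposed change to one code value (hypothetical structure)."""
    code_set: str
    code: str
    new_label: str
    proposer: str
    approvals: set = field(default_factory=set)


class CodeSetRegistry:
    """Holds approved code sets and only changes them via approved requests."""

    def __init__(self, stewards):
        # stewards: code set name -> set of people who must approve changes
        self.stewards = stewards
        self.code_sets = {name: {} for name in stewards}
        self.audit_log = []

    def propose(self, code_set, code, new_label, proposer):
        if code_set not in self.stewards:
            raise KeyError(f"unknown code set: {code_set}")
        return ChangeRequest(code_set, code, new_label, proposer)

    def approve(self, request, steward):
        # Only a named steward of the code set may approve a change to it.
        if steward not in self.stewards[request.code_set]:
            raise PermissionError(f"{steward} is not a steward of {request.code_set}")
        request.approvals.add(steward)

    def apply(self, request):
        """Apply the change only if every steward approved; return success."""
        required = self.stewards[request.code_set]
        if not required <= request.approvals:
            return False
        self.code_sets[request.code_set][request.code] = request.new_label
        # Record who proposed and who approved, so changes stay traceable.
        self.audit_log.append(
            (request.code_set, request.code, request.new_label,
             request.proposer, sorted(request.approvals))
        )
        return True


registry = CodeSetRegistry({"currency": {"alice", "bob"}})
request = registry.propose("currency", "EUR", "Euro", proposer="carol")
registry.approve(request, "alice")
applied_early = registry.apply(request)   # rejected: bob has not approved yet
registry.approve(request, "bob")
applied_late = registry.apply(request)    # accepted: all stewards approved
```

The design choice worth noting is that the registry never accepts a direct write. Every change carries a proposer and a set of approvers, so the question "who changed this code, and who signed off?" can be answered from the audit log instead of weeks of investigation.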

So, how are you doing this at your organization?

Kash is a Sr. Product Specialist at Collibra. He is involved in the research, development, and delivery of enterprise data governance solutions. He is also an instructor for Collibra University, a worldwide community of 1000+ users. Prior to joining Collibra, he was a researcher at the University of Arkansas for Medical Sciences, a leading cancer research institute. Kash also has a master's degree in Information Quality from the University of Arkansas at Little Rock in collaboration with MIT. He has published his research findings at the ICIQ, SE Regional IDeA, and ITNG conferences.