3 Aspects of a Data Governance Benchmark
Laggard? Leader? Ammo for a business case? A change in approach or strategy? Whatever the reason, people want benchmarks, and we get asked about them all the time. I think it's because a data governance benchmark is an important touchstone on the governance journey, one that gives people perspective and the confidence of knowing 'you're not alone, crazy, or both.'
As the market-leading data governance platform, with more implementations across an array of industries than all competitors combined, Collibra is in a unique position to do something about this. We have a critical mass of data sources we can use to craft meaningful benchmarks that provide guidance for our customers. So it will come as no surprise that we are working on exactly that.
What do you measure?
The word benchmark is a loaded term, and one that is probably a good candidate to be governed itself. What, exactly, should be benchmarked? The one universal assumption is 'companies like mine,' but we all know companies are like snowflakes, with no two being the same. Our perspective is to use three broad categories: Counts, Adoption, and Velocity.
- Counts – These are basic counts of objects in the governed environment such as assets, users, workflows, domains, communities, use cases implemented, governance roles, etc.
- Adoption – These are also counts and measures, but they relate to activity and therefore provide an indication of adoption. Examples are the ratio of read/browse users to licenses purchased, the ratio of editor/create users to licenses purchased, number of workflow tasks, support ticket volume, community question-and-answer participation, beta program participation, users registered for self-paced Collibra University training courses, and users who completed University courses and earned certifications.
- Velocity – This introduces the dimension of time to all the counts and adoption metrics and shows us if adoption is accelerating or decelerating. Some examples include time to first go-live, average time to Collibra course completion, frequency of support tickets, license usage over time, number of users with Collibra certificates over time, etc.
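To make these categories concrete, here is a minimal sketch in Python of how a couple of adoption and velocity figures could be derived from raw count snapshots. The metric names and numbers are invented for illustration; this is not Collibra's actual data model or formulas.

```python
from datetime import date

# Hypothetical monthly snapshots of raw counts for a single account.
snapshots = [
    {"date": date(2021, 1, 1), "active_readers": 120, "licenses": 400,
     "certified_users": 10},
    {"date": date(2021, 4, 1), "active_readers": 210, "licenses": 400,
     "certified_users": 25},
]

def adoption_ratio(snapshot):
    """Adoption: active read/browse users relative to licenses purchased."""
    return snapshot["active_readers"] / snapshot["licenses"]

def velocity(snaps, key):
    """Velocity: change in a count per month, first snapshot to last."""
    first, last = snaps[0], snaps[-1]
    months = ((last["date"].year - first["date"].year) * 12
              + (last["date"].month - first["date"].month))
    return (last[key] - first[key]) / months

print(f"License adoption: {adoption_ratio(snapshots[-1]):.0%}")
print(f"New certifications per month: {velocity(snapshots, 'certified_users'):.1f}")
```

The same pattern extends to any count: a snapshot gives you the count, a ratio of two snapshots' fields gives you adoption, and the change between snapshots over time gives you velocity.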
How do you measure?
With a classification of counts and measures defined, how exactly do we go about measuring them? The first challenge is gathering the data. It would be great if we could simply send out a survey and have everyone respond, but that is not realistic. To do this right, we need details at a recurring frequency that no one will have time to provide. Enter the cloud and an integration effort. With the majority of our customers either already on or moving to a cloud implementation, a lot of usage data can simply be queried from each customer instance and stored anonymously. Unfortunately, that is not where the challenge ends. Adoption data must also be collected from our support system, learning management system, and community system, and then associated with the master account and usage data. Our plan is to design a data governance benchmark data mart and populate it on a nightly basis.
The second challenge is how to aggregate and score the data once it's collected. From my perspective, the most interesting part is deciding how to group the data. We are planning to start with basic categories such as community engagement, learning engagement, breadth of use, license consumption, support engagement, asset volume, role counts, and others. Thankfully, computing the values for each and comparing them against the population of accounts is a basic statistics problem once all the data is in hand.
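As a sketch of that "basic statistics problem," here is how one account could be placed relative to a population using only the Python standard library. The population values are hypothetical, and this is an assumed scoring approach (percentile rank and z-score), not Collibra's actual methodology.

```python
import statistics

# Hypothetical "license consumption" ratios across a population of accounts.
population = [0.12, 0.25, 0.31, 0.40, 0.44, 0.52, 0.58, 0.63, 0.71, 0.85]

def percentile_rank(value, pop):
    """Fraction of the population at or below this account's value."""
    return sum(1 for v in pop if v <= value) / len(pop)

def z_score(value, pop):
    """How many standard deviations the account sits from the population mean."""
    return (value - statistics.mean(pop)) / statistics.stdev(pop)

my_ratio = 0.58
print(f"Percentile: {percentile_rank(my_ratio, population):.0%}")
print(f"z-score: {z_score(my_ratio, population):+.2f}")
```

A percentile rank answers the question customers actually ask ("how do we compare?"), while a z-score is handy for flagging accounts that are unusually far from the pack in either direction.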
Of course, the final piece of the puzzle is to provide access to the data so people can make use of it. We plan on providing two levels of access. The first will be a simple dashboard with the ability to change some simple filters like industry and use case. The second is direct access to the data mart via any BI tool of choice, for those who are more inclined to do their own analysis. It's also possible we will create a survey that compares a user's answers against the benchmark data.
What a data governance benchmark can’t tell you
Obviously, a benchmark is only part of the story. We can't collect subjective aspects of a data governance program, such as how to effect organizational change, how to justify business value, or what future implementation plans look like. We do ask for some of this input via our customer base survey and will try to include it if we get enough high-quality input. So even the best benchmark will not give you the full picture and a complete comparison of your program against others. You will still need to build relationships through the Collibra community and meet with your peers to triangulate and come to your own final assessment.
What can you do now?
If you are eager to get started, then let me offer you a bit of food for thought: don't wait for Collibra. Go ahead and use the concepts above to start designing your own internal benchmark for data governance adoption across departments, data source owners, and more. Then, when your internal stakeholders ask 'how do we compare to others?', you will have an answer that gives them the confidence to move forward.
John has 25+ years of business intelligence and data management experience in a number of diverse leadership roles, including responsibility for consulting, sales, marketing, and product development. During this time he worked with a number of pioneers of data warehousing, has been a frequent public speaker, and developed deep expertise in BI methods, data warehouse design, analytic techniques, master data management, and metadata management. He has been a principal at several Silicon Valley software startups and consulting organizations that were acquired, at separate times, by IBM and HP.