Collibra AI Governance and de-risking unstructured data at Ohalo


It seems like a day doesn’t pass without some AI-powered innovation making headlines, rousing markets and occasionally causing public worry. Make no mistake: Whether hype-driven or otherwise, this is the age of AI, and everyone wants a piece of the action.

But as most businesses recognize, innovation is nothing without the right governance to ensure that risks don’t get out of hand. This is even more critical with AI, where the prospect of autonomous actions could have real, lasting business implications. 

Many of Collibra’s customers already have AI fed with high-quality structured data. What they want is the same thing for unstructured data, in an easy-to-use, well-integrated fashion.

Enter the unstructured data experts from Ohalo, who recently showcased their unstructured data management and classification tool, Data X-Ray, with Collibra at the Data Citizens 2024 conference. Collibra provides full support for Ohalo’s clients, independent of data source types and specific data curation purposes.

AI governance is about increasing business value while retaining control

As AI continues to permeate the fabric of life and work worldwide, so too does AI-based risk increase. It’s clear that AI innovation needs proper governance if it’s going to solve rather than create new business problems.

Half of organizations and over two-thirds of internally developed applications will incorporate widespread or critical AI and machine learning (ML) by 2025. Almost four-fifths of businesses prioritize scaling AI and ML use cases to create value in this timeframe. But managing data remains the top challenge for 72% of businesses that want to scale AI. Businesses know that both AI and data are the forces that will shape their futures. How can they retain control over these forces?

Maintaining a few use cases in Excel and connecting them to data to find out who’s working on them at any one time is easy. But as AI use cases grow in number and complexity, companies stare down mountains of data and models, all geared toward delivering business value at the same time. What happens as a result? AI governance gets harder.

This is where Collibra comes in.

The Collibra AI governance framework

“A lot of people think AI is just the tip of the iceberg,” says Wouter Mertens, Senior Director of Product Management at Collibra. “And with icebergs, there’s a lot more underneath than you expect at first sight.”

That is why Collibra developed an easy-to-follow AI governance framework. Collibra designed it based on conversations with customers, to ensure that it matched real business scenarios rather than hypothetical ones.

The framework consists of four successive stages:

  1. Define the use case
  2. Identify and understand data
  3. Document models and results
  4. Verify and monitor

Collibra believes in collaboration: Here’s why 

Good AI governance requires proper collaboration across the organization, which includes participation and buy-in from:

  • Chief data officers (CDOs)
  • Analytics leads
  • Ethics chiefs
  • Chief information security officers (CISOs)
  • Legal
  • Privacy
  • HR
  • Business stakeholders
  • And others

These roles, departments, and functions are the elements that could make up an organization’s AI governance council (every organization will be slightly different). But working together isn’t just about intra-organizational collaboration. It also extends to partnerships.

How Collibra and Ohalo integrate

“We’ve had a relationship with Collibra for about four years, building unstructured data into the Collibra catalog,” Ohalo CEO and founder Kyle DuPont says. A new reseller agreement provides clients with full support from Collibra, while Ohalo’s Data X-Ray — promising to “create order out of data chaos” — links with Collibra. 

“From an architectural standpoint, it’s not on edge — but you can think of it as edge for unstructured data,” Kyle explains. Ohalo connects to different data source types, including:

  • Google Cloud
  • Microsoft Office 365
  • Amazon S3
  • Azure Blob Storage
  • Salesforce
  • On-premises Windows file servers
  • Network-attached storage (NAS) devices

Ohalo brings together file metadata, ownership details, and content-level analysis to build an index of “what I like to call the alphabet soup of things that companies have to care about: GDPR (General Data Protection Regulation), PCI (Payment Card Industry standards), HIPAA (Health Insurance Portability and Accountability Act). The list literally goes on,” Kyle says. Users can validate Data X-Ray’s results and check data at a rate of up to one file per second on large production servers.
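To make the idea of such an index concrete, here is a minimal Python sketch. It is not Ohalo’s actual schema or API: the `FileRecord` fields, the `build_index` helper, and the sample paths and owners are all hypothetical, chosen only to show how file metadata, ownership, and content-level labels might be combined so that files can be looked up by the regulation they fall under.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class FileRecord:
    """Hypothetical record combining metadata, ownership, and content labels."""
    path: str
    source: str        # where the file lives, e.g. "Amazon S3"
    owner: str         # ownership detail
    size_bytes: int    # file metadata
    labels: set[str] = field(default_factory=set)  # content-level findings


def build_index(records):
    """Map each label (e.g. "GDPR") to the paths of the files that carry it."""
    index = defaultdict(list)
    for rec in records:
        for label in sorted(rec.labels):
            index[label].append(rec.path)
    return dict(index)


# Illustrative sample data, not real scan results.
records = [
    FileRecord("s3://hr-bucket/cv_jane.pdf", "Amazon S3", "hr-team", 48_213, {"GDPR"}),
    FileRecord("//fs01/finance/cards.xlsx", "Windows file server", "finance", 9_870, {"PCI", "GDPR"}),
    FileRecord("//fs01/clinic/notes.docx", "Windows file server", "clinic", 15_002, {"HIPAA"}),
]
index = build_index(records)
```

With an index shaped like this, a question such as “which files are in scope for GDPR?” becomes a single lookup, `index["GDPR"]`, regardless of which source system holds each file.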

Generative AI: Curated data for specific purposes

A CV search tool is just one example of how to export data at scale with generative AI. HR specialists could benefit from a model that sifts 10 million files for 5,000 resumes or CVs. But that’s hard to do with regular expressions (RegEx) and named-entity recognition (NER) approaches, two typical tactics for managing unstructured data. 
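A small Python sketch shows why the regular-expression route struggles. The keyword list, the `looks_like_cv` helper, and the sample texts are assumptions made up for illustration, not part of any Ohalo or Collibra product. The heuristic simply counts how many CV-style keywords appear in a document.

```python
import re

# Hypothetical keyword patterns that often appear in English-language CVs.
CV_HINTS = [
    re.compile(r"\bcurriculum vitae\b", re.IGNORECASE),
    re.compile(r"\br[ée]sum[ée]\b", re.IGNORECASE),
    re.compile(r"\bwork experience\b", re.IGNORECASE),
    re.compile(r"\beducation\b", re.IGNORECASE),
]


def looks_like_cv(text, min_hits=2):
    """Flag a document as a CV when at least `min_hits` patterns match."""
    hits = sum(1 for pattern in CV_HINTS if pattern.search(text))
    return hits >= min_hits


english_cv = "Curriculum Vitae\nWork experience: 5 years at Acme\nEducation: BSc"
german_cv = "Lebenslauf\nBerufserfahrung: 5 Jahre bei Acme\nAusbildung: BSc"
memo = "Please resume the education budget review next week."

results = {
    "english_cv": looks_like_cv(english_cv),  # matched as intended
    "german_cv": looks_like_cv(german_cv),    # missed: keywords are English-only
    "memo": looks_like_cv(memo),              # false positive: "resume" + "education"
}
```

The English CV is caught, but the German CV is missed and an ordinary memo is flagged by mistake. Patching each gap means more patterns and more exceptions, which becomes hard to maintain across millions of files.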

“With [Ohalo’s] generative AI, we can now do stuff like that,” Kyle explains. Ohalo customers can use custom data categories — including invoices, NDAs, passports, and PCI — for the purposes of locating or describing CVs. “You can explain in any language — since generative AI is, by its nature, multilingual — exactly what is in those documents,” Kyle highlights. 

Imagine doing the same with hundreds of millions of files — for any use case.

Govern AI and mitigate risk the Collibra way

Whether or not organizations currently rely on Collibra AI Governance, Ohalo is the perfect tool for managing unstructured data and cataloging files. Ohalo works in tandem with or independently from AI Governance.

In a world of high-risk AI initiatives, there no longer needs to be a trade-off between cost, risk, and value: AI Governance balances out the equation. With Collibra, companies can ensure that they’re making affordable, low-risk, high-value AI investments to run successful and innovative AI projects.

This article is based on Collibra and Ohalo’s discussion at the Data Citizens 2024 conference in Orlando, FL, bringing together the world’s most innovative community of data leaders to experience breakthrough solutions. Collibra puts reliable, high-quality data in the hands of healthcare data citizens.

Interested in learning more from Data Citizens?

Check out our on-demand recording
