ChatGPT. Bard. PaLM. Falcon. Bing. It’s official: we’ve entered another summer of AI, and this time it is a mainstream one. Before November 2022, AI was the domain of experts. Today, kindergarten teachers are prompt engineers.
Recent surveys reveal what you probably sense is happening: we are hurtling toward an expansion of AI applications that will touch nearly every industry and organization around the world.
- More than 50% expect AI use to be widespread or critical in their organization by 2025 (footnote 1)
- 78% said scaling AI and ML use cases to create business value is their top priority over the next three years (footnote 1)
- 70% of new, internally developed applications will incorporate AI- or ML-based models by 2025 (footnote 2)
While it is intriguing to wonder whether robots are responsible when something goes wrong, the reality is that responsibility lies with the manufacturer, the operator, or both. AI presents an immense opportunity for businesses in every industry to increase productivity and efficiency. To capture that opportunity, we’ll need to find ways to control our automations.
There are a host of challenges to scaling AI, and they all start with data. For starters, just imagine the repeated cost of training an LLM on a data set that contains poor quality, inconsistent, inaccurate or incomplete data.
72% of companies say that data is the biggest challenge to achieving AI goals between now and 2025.
Source: Databricks, CIO Vision 2025
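The cost argument above can be made concrete. A minimal sketch of a pre-training quality gate is shown below; the function names (`profile_records`, `quality_gate`) and the 5% issue threshold are illustrative assumptions, not any particular product’s API. The idea is simply to profile a batch of records for incompleteness before any expensive training run starts.

```python
# Hypothetical pre-training data-quality gate (illustrative only).
# It profiles records for missing or empty required fields and rejects
# the batch if the issue rate exceeds a threshold.

def profile_records(records, required_fields):
    """Count missing and empty required fields across a batch of records."""
    issues = {"missing_field": 0, "empty_value": 0}
    for rec in records:
        for field in required_fields:
            if field not in rec:
                issues["missing_field"] += 1
            elif rec[field] in (None, ""):
                issues["empty_value"] += 1
    return issues

def quality_gate(records, required_fields, max_issue_rate=0.05):
    """Return (passed, issue_rate); passed is False if too many issues."""
    issues = profile_records(records, required_fields)
    total_checks = len(records) * len(required_fields)
    issue_rate = sum(issues.values()) / max(total_checks, 1)
    return issue_rate <= max_issue_rate, issue_rate

# Toy batch: one clean record, one empty value, one missing label.
records = [
    {"text": "refund policy", "label": "billing"},
    {"text": "", "label": "support"},
    {"text": "reset password"},
]
ok, rate = quality_gate(records, ["text", "label"])
```

Gating on a simple issue rate is crude, but even this level of automated checking catches the incomplete and inconsistent records that would otherwise be paid for again on every retraining run.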
A short history of AI
AI has been around for decades, but it’s only recently that advances in processing power, data volumes, and the invention of Large Language Models (LLMs) have paved the way for the current wave of generative AI.
In 1950, Alan Turing’s seminal paper ‘Computing Machinery and Intelligence’ posed the question ‘Can machines think?’ A decade earlier, the 1939 classic movie ‘The Wizard of Oz’ presented viewers with the Tin Man, a talking machine whose search for a heart foreshadowed many of the ethical questions around AI challenging companies today.
In our own time, even before the current AI hype, we’ve been using AI to help our writing (autocorrect), shopping (product recommendations), and personal investments (robo-advisors).
In November 2022, OpenAI’s ChatGPT catapulted AI from relative obscurity to a mainstream phenomenon. Today, OpenAI runs one of the most popular websites in the world, with nearly 2B visits to its site in May 2023 (footnote 3).
The success of ChatGPT’s generative AI has driven both excitement and fear. Businesses everywhere are seeking ways to leverage LLMs as fast as governments are talking about ways to regulate them.
It’s still all about data
Even though generative AI is unique in its capacity to compose language that’s indistinguishable from human-produced language, in an infinite range of forms from poetry to code to business strategy, at root generative AI applications are only as good as the data that informs them.
The truth is data is the backbone of AI, and if the data is bad, the AI models trained on it will produce human-sounding language that looks good but is fundamentally flawed. The implications for companies building AI applications are profound and include:
- Biased decision-making: If your data sets are biased, AI will perpetuate and amplify bias, which can lead to biased (and ill-informed) decision-making.
- Inaccurate recommendations: AI models rely on patterns and correlations established by training data. If the data is flawed, inaccurate, or incomplete, then the predictive model is also unreliable.
- Outlier misinterpretation: Outliers and data anomalies can significantly impact AI models. If the AI is not trained to recognize them, then it may draw erroneous, even disastrous, conclusions.
- Security/Privacy risks: Poor data quality can expose sensitive information, which can inadvertently lead to security breaches and the unauthorized use of personal information.
- Legal/Ethical implications: Organizations may face legal consequences by making decisions based on inaccurate or biased AI inputs. Using AI to process personal data without adherence to privacy regulations (like GDPR or CCPA) can result in costly legal and reputational risks.
- Trust issues: We’ve all seen examples where recently released models hallucinated information into existence. Deploying AI systems that produce incorrect or biased results can erode public trust in your organization and damage its reputation.
How do you mitigate these risks? How do organizations ensure the quality, integrity, and ethical handling of the data used to train and operate AI systems?
What you need is a governance model for AI. You need AI governance.
Defining AI governance
At Collibra, our mission is to unite your entire organization with trusted data that’s easy to find, understand, and access so you can do more with your data.
We call ourselves the Data Intelligence company because we believe in the value your data assets can deliver. Realizing that value requires tools, processes, and cultural behaviors that transform data into value and effectively drive decisions that fuel your business. Because generative AI is only as good as the data it uses, its emergence presents the same challenges we’ve helped businesses tackle since before the rise of ChatGPT.
So what is AI governance? And how can you leverage existing solutions and frameworks to ensure the success of your AI initiatives?
Here’s our definition of AI governance:
AI governance is the application of rules, processes and responsibilities to drive maximum value from your automated data products by ensuring applicable, streamlined and ethical AI practices that mitigate risk and protect privacy.
Why you need AI governance
The time is coming when every business that uses AI will need AI governance.
Collibra is thrilled at the opportunity to play an essential role in helping organizations achieve their AI goals. We know from more than 15 years of experience helping organizations manage data more effectively that implementing robust data intelligence practices offers a range of essential benefits, including:
- Quickly find, understand, and trust data: With a well-structured governance framework, data professionals can easily locate the data they need, understand its context, and have confidence in its quality. This efficiency saves valuable time and resources. You can’t start training AI without the data, so a Data Intelligence organization hits the ground running.
- Drive a common language: Data governance facilitates the creation of a common language around data within an organization. This shared understanding speeds up decision-making, promotes collaboration, and fosters a data-driven culture. You can’t tame the model if you don’t know what its features mean.
- Leverage automation: By integrating data governance with automation tools, organizations can keep pace with the rapidly evolving AI landscape. Automated processes ensure consistent adherence to data standards, enable efficient data discovery and enhance data quality.
- Mobilize workforce collaboration: Data governance encourages collaboration among various stakeholders, including data professionals, business users and IT teams. By breaking down silos and promoting cross-functional collaboration, organizations can unlock the full potential of their data assets. Anyone who has ever managed to put AI into production knows it is an intense team sport.
- Ensure compliance and mitigate risks: Data governance helps organizations meet regulatory requirements, such as data privacy regulations (e.g., GDPR) and industry-specific standards. By implementing controls and monitoring mechanisms, organizations can mitigate risks associated with data breaches and non-compliance.
If your organization is leveraging (or planning to use) generative AI technologies, then it’s a good time to start thinking about AI governance.
Recently, Collibra’s Jay Militscher spoke with Forrester analyst Raluca Alexandru about generative AI and data governance on our informative podcast, The Data Download.
2 Gartner, The Future of AI.