AI model governance: How to track, audit, and control AI models in production

Here’s a question most AI teams cannot answer: how many models are currently running in production?

Not in the ML platform. Not in the model registry. In production. Actively influencing decisions, generating outputs or scoring transactions in live systems. For most organizations, the honest answer is “we think we know, but we are not certain.”

That uncertainty is a governance failure. It is also increasingly a regulatory one.

And it does not discriminate between sophisticated AI programs and scrappy early-stage deployments. If models are live and untracked, the risk is real regardless of how advanced the underlying technology is.

The solution is not to slow down AI development. It is to build governance infrastructure that keeps pace with it: infrastructure that gives teams visibility, accountability and control without becoming a bureaucratic obstacle to the work that matters.

The production blind spot

The pattern is familiar. A team builds a model, validates it internally, deploys it to production and moves on to the next project. Documentation lives in a notebook. Ownership is implicit: the person who built it, until they leave. The model runs quietly for months or years, consuming data that has since changed, producing outputs that nobody is actively monitoring.

This is the production blind spot: the gap between what AI teams believe is running and what is actually running, who owns it and what risk it carries. In some organizations the gap is small. In others, it includes models that were deprecated in name but never actually switched off, models trained on data that has since been reclassified as sensitive, and models whose performance has degraded well below acceptable thresholds without triggering any alert.
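
One way to make this gap measurable is to reconcile what is actually serving traffic against what the inventory says should be. The sketch below is illustrative only: the function name `find_blind_spots`, the model names and the assumption that a set of live model identifiers can be exported from the serving layer are all hypothetical, not features of any particular platform.

```python
from typing import Dict, Optional, Set


def find_blind_spots(
    serving: Set[str],
    registered: Set[str],
    owners: Dict[str, Optional[str]],
) -> Dict[str, Set[str]]:
    """Compare models observed in production with the governance inventory."""
    return {
        # Live in production but absent from the inventory
        "untracked": serving - registered,
        # Recorded in the inventory but not observed serving traffic
        "not_serving": registered - serving,
        # Live and recorded, but with no named owner
        "unowned": {m for m in serving & registered if not owners.get(m)},
    }


# Hypothetical example data
serving = {"fraud-score-v3", "churn-v1", "legacy-pricing"}
registered = {"fraud-score-v3", "churn-v1", "credit-limit-v2"}
owners = {"fraud-score-v3": "risk-ml", "churn-v1": None}

report = find_blind_spots(serving, registered, owners)
for category, models in report.items():
    print(category, sorted(models))
```

Each non-empty category is a question for a person to answer, not a verdict: a model in `not_serving` may be legitimately retired or simply missing from the serving export.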

AI model governance is the set of policies, processes and controls for documenting, monitoring, auditing and controlling AI models throughout their production lifecycle, from initial intake through deployment, ongoing monitoring and eventual retirement.

Why model governance is distinct from AI governance

AI governance operates at the program level. It covers an organization’s overall approach to responsible AI: ethics principles, risk classification frameworks, oversight structures and cross-functional accountability. It is strategic.

Model governance is operational. It operates at the level of individual models: specific artifacts running on specific infrastructure, consuming specific data, owned by specific teams and subject to specific performance and compliance requirements.

The distinction matters because organizations often have AI governance policies without model governance practices. The principles exist; the controls do not. A responsible AI policy that cannot be traced to an actual model running in production is a document, not a control.

Effective model governance is what closes that gap. It makes the policy operational at the level where AI actually affects decisions and outcomes.

The model lifecycle: a governance frame

Governing a model means governing it across its full lifecycle, not just at the point of deployment. The lifecycle includes:

Intake
Documentation
Approval
Deployment
Monitoring
Retirement

Intake is where governance begins. Before a model reaches production, it should have a documented purpose, a defined risk tier and an identified owner. Organizations that skip this step are making a bet that the model will never matter enough to scrutinize. That bet rarely holds.

Documentation covers what the model does, what data it uses, how it was trained, what its known limitations are and what its acceptable performance thresholds are. Without documentation, oversight is impossible and incident response is guesswork.

Approval establishes that the model has been reviewed by the appropriate parties — data science leadership, risk, legal — before it reaches customers or critical business processes. The rigor of approval should be proportional to the risk tier.

Deployment should be tied to governance. The model registry should communicate with the governance system so that when a model goes live, the documentation, ownership and monitoring configuration go live with it.

Monitoring is where most organizations have the largest gap. A model approved and deployed is not a model governed. Performance must be tracked continuously, against defined thresholds, with alerts that reach human owners when something is wrong.

Retirement is as important as deployment. Models that are no longer needed must be formally decommissioned. Silent model retirement — where the model is simply no longer called by any system — is not governance. It is a documentation gap waiting to become a compliance finding.
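
The six stages above can be sketched as a small state machine that refuses moves that skip a stage. This is a minimal illustration, not a prescribed workflow: the `Stage` enum, the `ALLOWED` transition table and the `advance` function are hypothetical names, and the loops back from approval to documentation and from monitoring to approval are assumptions about how rework and retraining might be handled.

```python
from enum import Enum


class Stage(Enum):
    INTAKE = "intake"
    DOCUMENTATION = "documentation"
    APPROVAL = "approval"
    DEPLOYMENT = "deployment"
    MONITORING = "monitoring"
    RETIREMENT = "retirement"


# Permitted transitions. The two backward moves are assumptions:
# approval can send a model back for more documentation, and a
# monitored model that is retrained goes back through approval.
ALLOWED = {
    Stage.INTAKE: {Stage.DOCUMENTATION},
    Stage.DOCUMENTATION: {Stage.APPROVAL},
    Stage.APPROVAL: {Stage.DEPLOYMENT, Stage.DOCUMENTATION},
    Stage.DEPLOYMENT: {Stage.MONITORING},
    Stage.MONITORING: {Stage.APPROVAL, Stage.RETIREMENT},
    Stage.RETIREMENT: set(),
}


def advance(current: Stage, target: Stage) -> Stage:
    """Return the new stage, or raise if the move skips a required step."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target


stage = advance(Stage.INTAKE, Stage.DOCUMENTATION)
print(stage.value)
```

Because retirement is only reachable through an explicit transition, a model that simply stops being called remains in the monitoring stage and stays visible as unfinished business.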

Model cards: what good documentation looks like

A model card is the governing document for an individual AI model. At minimum, it should capture:

Purpose: what problem the model solves and in what context it is used
Data sources: what data was used for training, fine-tuning and validation, including version, vintage and sensitivity classification
Performance metrics: accuracy, precision, recall, fairness metrics, and the thresholds below which the model requires review
Known limitations: what the model does not do well, what populations or conditions it may underperform on
Risk tier: low, medium or high risk, based on the organization’s risk classification framework
Ownership: the team or individual accountable for ongoing performance and compliance
Approval history: who approved the model, when and under what conditions
Change log: any modifications to the model, its inputs or its deployment context

Model cards are not optional, particularly for high-risk models. They are the foundational artifact of model governance. Without them, there is no audit trail.
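
As a rough sketch, the minimum fields listed above could be held in a simple record with a completeness check. The class name `ModelCard`, the field types and the choice to let an empty change log pass for a newly deployed model are all assumptions made for illustration; they do not describe any specific product's schema.

```python
from dataclasses import dataclass, field, fields
from typing import Dict, List


@dataclass
class ModelCard:
    purpose: str = ""
    data_sources: List[str] = field(default_factory=list)
    performance_metrics: Dict[str, float] = field(default_factory=dict)
    known_limitations: str = ""
    risk_tier: str = ""  # "low", "medium" or "high"
    owner: str = ""
    approval_history: List[str] = field(default_factory=list)
    change_log: List[str] = field(default_factory=list)

    def missing_fields(self) -> List[str]:
        """Names of required fields that are still empty."""
        # Assumption: a new model may legitimately have an empty change log.
        optional = {"change_log"}
        return [
            f.name
            for f in fields(self)
            if f.name not in optional and not getattr(self, f.name)
        ]


# Hypothetical, partially completed card
card = ModelCard(
    purpose="Score card transactions for fraud",
    owner="risk-ml-team",
    risk_tier="high",
)
print(card.missing_fields())
```

A check like this could run before approval, so that an incomplete card blocks promotion rather than being discovered during an audit.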

Drift detection: a governance issue, not just an engineering one

Model drift — the degradation of model performance over time as the real-world distribution of inputs diverges from the training distribution — is typically framed as an engineering problem. It is also a governance problem.

When a model’s performance degrades below its approved thresholds, the governance implications are immediate. A model that was approved based on 94% accuracy and is now operating at 81% is no longer the model that was reviewed and approved. Its outputs cannot be assumed to meet the standards under which it was authorized. In regulated industries, that is not a technical issue. It is a material change to a controlled system.

Drift detection must therefore be part of the governance framework rather than isolated in the ML engineering workflow. When drift is detected, it must trigger a governance response: escalation to the model owner, review against approved thresholds and a documented decision about whether to retrain, restrict or retire the model.
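
The governance response described above might be sketched as follows. The names `DriftFinding`, `evaluate_metric` and `record_decision` are hypothetical, and the 0.90 review floor is an assumed threshold chosen for illustration; the article only states the approved accuracy (94%) and the observed accuracy (81%).

```python
from dataclasses import dataclass
from typing import Optional

# The three documented outcomes named in the text
ALLOWED_DECISIONS = {"retrain", "restrict", "retire"}


@dataclass
class DriftFinding:
    model: str
    metric: str
    approved_floor: float
    observed: float
    owner: str  # who the finding is escalated to
    decision: Optional[str] = None


def evaluate_metric(
    model: str, metric: str, approved_floor: float, observed: float, owner: str
) -> Optional[DriftFinding]:
    """Return a finding if the metric is below its approved floor, else None."""
    if observed >= approved_floor:
        return None
    return DriftFinding(model, metric, approved_floor, observed, owner)


def record_decision(finding: DriftFinding, decision: str) -> DriftFinding:
    """Attach a documented decision; anything outside the allowed set is rejected."""
    if decision not in ALLOWED_DECISIONS:
        raise ValueError(f"decision must be one of {sorted(ALLOWED_DECISIONS)}")
    finding.decision = decision
    return finding


# Assumed floor of 0.90; observed accuracy of 0.81 from the example above
finding = evaluate_metric("credit-risk-v2", "accuracy", 0.90, 0.81, "risk-ml-team")
if finding is not None:
    record_decision(finding, "retrain")
    print(finding)
```

The point of the `decision` field is that a finding without one is visibly open, which makes an ignored alert distinguishable from a reviewed one.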

Lineage: connecting models to data and outcomes

A model does not exist in isolation. It was trained on data. That data came from somewhere. Its outputs flow to downstream systems and decisions. The full chain — from training data to model to outcome — is the lineage of an AI system, and it is essential to governance.

Collibra Data Lineage extends into the AI layer, connecting model artifacts to the data they consumed, the pipelines that processed it and the downstream applications that act on model outputs. When a training dataset is found to contain biased or mislabeled records, lineage makes it immediately visible which models were affected — and which decisions downstream may need to be reviewed.

Without lineage, model governance is incomplete. You can document what a model was supposed to use. You cannot verify what it actually used.
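
The impact question lineage answers, "which models and decisions sit downstream of this dataset?", is a graph traversal. The sketch below is a generic illustration using a plain dictionary as the lineage graph; it does not represent the Collibra Data Lineage API, and the node names, the `type:name` naming convention and the `downstream` function are hypothetical.

```python
from collections import deque
from typing import Dict, List, Set


def downstream(graph: Dict[str, List[str]], start: str) -> Set[str]:
    """Breadth-first walk returning every node reachable from start."""
    seen: Set[str] = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Hypothetical lineage: edges point from a source to what consumes it
lineage = {
    "dataset:loans_2021": ["pipeline:feature_build"],
    "pipeline:feature_build": ["model:credit-risk-v2", "model:collections-v1"],
    "model:credit-risk-v2": ["app:loan-decisioning"],
    "model:collections-v1": [],
}

affected = downstream(lineage, "dataset:loans_2021")
affected_models = sorted(n for n in affected if n.startswith("model:"))
print(affected_models)
```

The traversal is only as trustworthy as the edges it walks, which is why lineage captured automatically from pipelines is more useful than lineage written down from memory.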

Governing without slowing teams down

The most common objection to model governance is that it will slow down AI development. That objection is legitimate when governance is implemented as a manual, sequential gate — a compliance checkpoint that sits between the team and production and demands weeks of paperwork.

It is not legitimate when governance is automated and embedded in the workflow. Model cards pre-populated from the model registry. Risk tiers assigned automatically based on model type and data classification. Drift alerts routed to the right owner without requiring manual monitoring. Lineage captured by the platform rather than documented by the engineer.

The goal is not to add friction. It is to capture governance evidence as a byproduct of normal development activity, so that the audit trail exists without requiring anyone to build it manually.
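
To illustrate what automatic risk-tier assignment from model type and data classification could look like, here is a deliberately simple rule-based sketch. The function `assign_risk_tier`, the category values and the rules themselves are assumptions for illustration; a real classification framework would be defined by the organization's risk and legal teams.

```python
# Hypothetical categories. The high-risk examples echo domains named in
# this article (credit scoring, hiring); the rest are invented.
HIGH_RISK_MODEL_TYPES = {"credit_scoring", "hiring"}
SENSITIVE_DATA_CLASSES = {"personal", "restricted"}


def assign_risk_tier(model_type: str, data_classification: str) -> str:
    """Return 'high', 'medium' or 'low' from two metadata fields."""
    if model_type in HIGH_RISK_MODEL_TYPES:
        return "high"
    if data_classification in SENSITIVE_DATA_CLASSES:
        return "medium"
    return "low"


for model_type, data_class in [
    ("credit_scoring", "public"),
    ("internal_analytics", "personal"),
    ("internal_analytics", "public"),
]:
    print(model_type, data_class, assign_risk_tier(model_type, data_class))
```

An automatically assigned tier is best treated as a default that a reviewer can raise, so the automation removes paperwork without removing judgment.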

Collibra AI Model Governance is designed around this principle. It integrates with existing ML platforms to pull model metadata automatically, surfaces governance gaps as actionable items rather than compliance checklists and gives AI risk leads a live view of the model estate without requiring engineers to change how they work.

Regulatory drivers: EU AI Act and NIST AI RMF

The EU AI Act establishes risk-tiered requirements for AI systems placed on the EU market. High-risk systems, including those used in credit scoring, hiring, law enforcement and education, must be documented, monitored and subject to human oversight throughout their operational lifecycle. Article 9 requires a risk management system maintained as a continuous process across that lifecycle. Article 12 requires automatic logging of system events, in part to support post-market monitoring.

These requirements describe model governance. Organizations that have not built a model governance capability before the Act’s provisions take full effect will face a significant gap, not just in documentation but in the underlying systems and processes needed to produce compliant evidence.

The NIST AI Risk Management Framework, while voluntary in the United States, is increasingly referenced by regulators and in procurement requirements. Its GOVERN, MAP, MEASURE and MANAGE functions map directly to the model lifecycle described above. Organizations aligned to NIST AI RMF are better positioned to respond to regulatory requirements as they evolve.

Frequently asked questions about AI model governance

What is the difference between a model registry and a model governance system?

A model registry tracks model artifacts — versions, parameters, training runs. A model governance system tracks accountability, risk, compliance and control status. They should be connected: the registry is a source of truth that feeds the governance layer. But the registry alone cannot tell you whether a model is compliant, who is accountable for it or whether its performance has drifted below acceptable thresholds.

How many models should be subject to formal governance?

All production models should be subject to at least basic governance — documentation, ownership and monitoring. The depth of governance should be proportional to risk tier. A low-risk internal analytics model requires less oversight than a high-risk model making credit or hiring decisions. The mistake is applying no governance to the low-risk models on the assumption that they will never matter — they often do.

What triggers a model review?

At minimum: performance drift below defined thresholds, changes to the training or inference data, changes to the deployment context, regulatory changes affecting the model’s domain, and the passage of a defined review interval (typically 6-12 months for high-risk models). Reviews should also be triggered by incidents — unexpected outputs, user complaints or external findings.
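
The triggers listed above can be expressed as a single check that returns the reasons a review is due. This is a sketch under assumptions: the `ModelStatus` fields, the `review_reasons` function and the 180-day default interval (the lower end of the 6 to 12 month range mentioned above) are hypothetical choices, not a standard.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class ModelStatus:
    last_review: date
    metric_below_threshold: bool = False
    data_changed: bool = False
    deployment_context_changed: bool = False
    regulation_changed: bool = False
    open_incidents: int = 0


def review_reasons(
    status: ModelStatus, today: date, review_interval_days: int = 180
) -> List[str]:
    """Return every reason a review is due; an empty list means none is due."""
    reasons = []
    if status.metric_below_threshold:
        reasons.append("performance drift")
    if status.data_changed:
        reasons.append("data change")
    if status.deployment_context_changed:
        reasons.append("deployment context change")
    if status.regulation_changed:
        reasons.append("regulatory change")
    if status.open_incidents > 0:
        reasons.append("incident")
    if today - status.last_review >= timedelta(days=review_interval_days):
        reasons.append("review interval elapsed")
    return reasons


# Hypothetical model last reviewed on 1 January 2024
status = ModelStatus(last_review=date(2024, 1, 1), data_changed=True)
print(review_reasons(status, today=date(2024, 8, 1)))
```

Returning all applicable reasons, rather than stopping at the first, gives the reviewer the full picture and leaves a more useful audit record.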

Can existing ML platform tools handle model governance?

Partially. MLflow, SageMaker and similar platforms handle the technical lifecycle well. They do not handle policy mapping, regulatory alignment, cross-functional accountability or audit evidence generation. Governance requires a layer above the ML platform — one that connects technical model metadata to business context, risk classification and compliance requirements.

How does model governance support AI Act compliance?

The EU AI Act requires high-risk AI systems to have documented risk management systems, technical documentation, logging capabilities and human oversight mechanisms. Model governance — specifically model cards, drift monitoring, lineage tracking and audit trails — provides the evidence base for demonstrating compliance with these requirements.

Collibra helps organizations turn AI ambition into AI value by ensuring that the models driving that value are governed, traceable and controlled. Discover Collibra AI Governance and learn how Collibra enables end-to-end model governance — from intake through retirement — without adding friction to the teams building your AI future.

Read more about Collibra AI Model Governance and how a single platform can govern all your models in one place.

Collibra
