AWS and Collibra deepen partnership to bring business context and semantics to Amazon SageMaker Catalog
The typical data estate is sprawling and heterogeneous, spread across platforms, clouds and on-premises systems, each with its own tools, owners and ways of describing things. The faster that estate grows, the wider the gap between the people who produce data and the people who need to find, understand and trust it. What's needed is a governance layer that reaches across the whole estate and a way to feed each platform's context into it automatically.
Today, we are thrilled to announce the expansion of the strategic partnership between AWS and Collibra. This collaboration centers on a single mission: integrating robust business context and data governance directly into Amazon SageMaker Catalog, the governance layer inside Amazon SageMaker Unified Studio. By combining Collibra's governance capabilities with the power of Amazon SageMaker Catalog, we are providing joint customers with a unified, trusted data experience across their entire AWS ecosystem.
Why context builds trust
Raw technical metadata alone – tables, rows and columns – is no longer enough. To make reliable decisions, teams need to know who owns a data asset, what a field actually means and whether it can be trusted. When those questions go unanswered, everyone from data scientists to business analysts loses time hunting for clarity, and trust in the data erodes.
Collibra provides the system of record that answers them: the framework where organizations define and manage their most critical data policies, business glossaries and governance standards across the entire estate.
Two strong catalogs, one governed view
AWS gives data and AI teams a powerful place to work. Amazon SageMaker Unified Studio brings the AWS analytics and AI/ML stack (Glue, Athena, Redshift, EMR, Bedrock and SageMaker AI) into one development environment, with governance built in through Amazon SageMaker Catalog. The assets governed there are valuable well beyond AWS. Collibra is the enterprise-wide governance fabric: business glossaries, policies and stewardship workflows that span the entire estate. Bring the two together and the rich business and governance context created inside Collibra becomes part of the SageMaker Catalog that AWS users rely on.
This integration is the latest milestone in an expanding partnership between AWS and Collibra – one focused on giving joint customers open, interoperable governance across the AWS ecosystem.
Metadata, governance and access: How the integration works
The integration treats SageMaker Catalog as a governed source of truth for AWS and Collibra as the enterprise system of record, then keeps them in sync automatically. It brings two things across the bridge:
Technical and business metadata. Schemas, columns and data types arrive in Collibra alongside the business context AWS teams have already curated, such as descriptions. The asset reads the same whether it's opened by a data scientist in Studio or an analyst who has never touched the AWS console.
Governance context. Metadata in Collibra, such as descriptions, ownership or classifications, travels with each asset, mapping Collibra's business context into SageMaker Catalog so AWS users can benefit from the stewardship activities in Collibra.
It all runs through Collibra Edge, which processes metadata close to the source and publishes it into Collibra Data Catalog. Two points matter for any security reviewer: 1) only metadata moves, so your actual data never leaves AWS, and 2) the sync is automated, so the catalog stays current as assets are published, changed or retired. Edge also manages the outbound integration from Collibra to AWS, so the business context that data stewards add in Collibra becomes available to AWS users where they work.
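To make the two directions concrete, here is a minimal Python sketch of the idea rather than the integration itself. Every name in it (`CatalogAsset`, `inbound_payload`, `apply_outbound_context`) is hypothetical and does not correspond to a Collibra or AWS API; it only illustrates that the inbound record carries schema and descriptions with no row data, and that steward-curated context can be layered back onto the source-side asset.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Column:
    name: str
    data_type: str


@dataclass
class CatalogAsset:
    """Hypothetical stand-in for an asset published in a catalog."""
    asset_id: str
    columns: list[Column]
    description: str = ""
    owner: str = ""
    classifications: list[str] = field(default_factory=list)


def inbound_payload(asset: CatalogAsset) -> dict:
    """Build the metadata-only record sent toward the enterprise catalog.

    Only schema and descriptive fields are included; no row data is read.
    """
    return {
        "asset_id": asset.asset_id,
        "columns": [
            {"name": c.name, "data_type": c.data_type} for c in asset.columns
        ],
        "description": asset.description,
    }


def apply_outbound_context(asset: CatalogAsset, steward_context: dict) -> CatalogAsset:
    """Return a copy of the asset with steward-curated context applied.

    Fields missing from steward_context keep their existing values.
    """
    return CatalogAsset(
        asset_id=asset.asset_id,
        columns=list(asset.columns),
        description=steward_context.get("description", asset.description),
        owner=steward_context.get("owner", asset.owner),
        classifications=list(
            steward_context.get("classifications", asset.classifications)
        ),
    )


if __name__ == "__main__":
    orders = CatalogAsset(
        "orders", [Column("order_id", "bigint")], description="Raw orders"
    )
    print(inbound_payload(orders))
    print(apply_outbound_context(orders, {"owner": "Sales Ops"}))
```

In this sketch the outbound step returns a new object instead of mutating the original, which keeps the before and after states easy to compare in an audit.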
Where the value adds up
Connecting AWS's place to build and run data and AI with Collibra's enterprise governance fabric turns two strong but separate capabilities into one continuous governance story:
- Context that stays in sync. Context AWS teams already create flows into Collibra automatically, and outbound synchronization carries the business context created within Collibra back to AWS, so the two catalogs don't drift apart and the enterprise view keeps itself current.
- A single, authoritative catalog. Every relevant SageMaker Unified Studio asset appears in Collibra alongside the rest of the estate, ending the silos, blind spots and duplicated maintenance.
- Discoverability for everyone. AWS-native datasets and models (through our AI Governance integration) become searchable through the Collibra experience people already use, with no AWS access required to find an asset and understand it.
- Trust you can act on. Accurate descriptions, clear ownership and verified classifications give consumers a reason to rely on AWS data, and make it a sound foundation for the AI workloads SageMaker is built to run.
- Compliance you can prove. A complete, auditable view across AWS and the wider enterprise makes it far easier to demonstrate compliance, on demand.
- Scale that keeps pace. As AWS deepens SageMaker Catalog's reach, Edge-based incremental sync absorbs the growth — more captured in AWS means more value in your catalog, not more work.
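The announcement does not describe how Edge decides what to send on each run. As an illustration only, one common way to implement incremental sync is to fingerprint each asset's metadata and compare it with the fingerprint stored at the last sync. The function names below are hypothetical and are not part of any Collibra or AWS API.

```python
from __future__ import annotations

import hashlib
import json


def fingerprint(metadata: dict) -> str:
    """Stable SHA-256 hash of a metadata record (key order does not matter)."""
    canonical = json.dumps(metadata, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def plan_incremental_sync(
    last_synced: dict[str, str], current: dict[str, dict]
) -> dict[str, list[str]]:
    """Classify assets as newly published, changed, or retired.

    last_synced maps asset id -> fingerprint recorded at the previous sync.
    current maps asset id -> metadata as it exists now.
    """
    plan: dict[str, list[str]] = {"publish": [], "update": [], "retire": []}
    for asset_id, metadata in current.items():
        if asset_id not in last_synced:
            plan["publish"].append(asset_id)
        elif last_synced[asset_id] != fingerprint(metadata):
            plan["update"].append(asset_id)
    plan["retire"] = [a for a in last_synced if a not in current]
    for ids in plan.values():
        ids.sort()
    return plan


if __name__ == "__main__":
    previous = {
        "orders": fingerprint({"description": "Orders"}),
        "legacy": fingerprint({"description": "Old feed"}),
    }
    now = {
        "orders": {"description": "Orders, deduplicated"},
        "returns": {"description": "Returns"},
    }
    print(plan_incremental_sync(previous, now))
```

Under this approach, unchanged assets cost one hash comparison and nothing is re-sent, which is why growth in the source catalog does not translate into proportional sync work.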
One catalog to trust them all
Amazon SageMaker Unified Studio is where a growing share of data and AI work gets done, and SageMaker Catalog governs that work well within AWS. Collibra is where the enterprise sees all of its data as one connected, governed whole. Bringing them together means your AWS-native assets are no longer a governed island: they're part of the mainland.
The combined result is a catalog that finally tells the whole story: AWS-native assets and the rest of your estate, described consistently, owned clearly, classified reliably, and traceable from source to consumption. That's the foundation for discovery your teams will trust, governance you can demonstrate and AI you can build with confidence — delivered by two partners whose platforms were designed to make exactly that possible.
Ready to bring your AWS data assets into your enterprise governance fabric? The Collibra inbound integration with Amazon SageMaker Catalog is generally available, while the outbound integration is in preview.
For more details, please refer to Collibra Product Documentation.