Accelerating data delivery using OSI with Snowflake and Collibra
Introduction: Breaking down the wall
In the legacy data world, a thick wall separated the "back of house" (IT and Data Engineering) from the "front of house" (business users). Business teams placed an order (a report, a dashboard or a new AI model) and waited. When the "meal" finally arrived weeks later, they had no idea how it was prepared, where the ingredients came from or whether it was truly safe to consume.
This "black box" approach kills operational velocity. It breeds mistrust. And in the age of AI, it creates unacceptable risk.
With Collibra joining Snowflake’s OSI (Open Semantic Interchange) Consortium, we are tearing down that wall. We are moving the enterprise to an "Open Kitchen" model. By combining Snowflake’s performance and open architecture with Collibra’s active governance, we create an environment where transparency isn't just a "nice to have"; it’s the recipe for speed.
Here is what the "open kitchen" looks like for your key stakeholders.
For the diner (the business user): Trusting the meal
In a restaurant, an open kitchen changes the dining experience. You see the fresh ingredients being prepped. Because you can see the process, you trust the meal before you take the first bite. In the Data Cloud, Collibra acts as that window into the Snowflake kitchen, providing trust across every type of data.
- Ingredient transparency (Structured and Unstructured Data Governance): The modern menu isn't just rows and columns; it’s documents, PDFs and images. As you load unstructured data into Snowflake Stages to fuel Generative AI, Collibra provides the "nutrition label." We scan and classify these files at scale, identifying sensitive PII or IP buried in a PDF contract, ensuring business users know exactly what is feeding their AI models.
- Hygiene standards (Compliance and Access Governance): Just as a kitchen must adhere to health codes, your data must adhere to regulations like GDPR and CCPA. Collibra acts as the automated health inspector. Through Access Governance, we translate plain-language policies (e.g., "Mask PII for non-HR users") into native Snowflake Row Access Policies and Dynamic Data Masking. The business user gets their data instantly, but the "dangerous ingredients" are automatically removed based on their role.
- Recipe transparency (The Semantic Layer via OSI): Just as a diner might want to know exactly how their dish was prepared, a business user consuming a KPI on a dashboard (the "meal") shouldn't have to blindly trust the numbers. Through Collibra’s integration with Snowflake's Semantic Views and the Open Semantic Interchange (OSI) standard, business users can look directly into the kitchen to see the exact "recipe" behind their metrics. Instead of guessing the ingredients of a metric like "Net Revenue", they can clearly see the actual math formulas or metrics (such as aggregating gross revenue minus discounts), the underlying facts (the raw, row-level numerical ingredients), and the dimensions (the contextual categories like region or date) used to calculate it. This ensures a single authoritative definition across the enterprise, allowing users to consume their data with complete confidence.
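To make the "recipe" idea concrete, here is a minimal Python sketch of a metric definition that names its facts and dimensions and is computed in exactly one place. The `Metric` class, its field names and the sample rows are hypothetical illustrations; they do not follow the OSI or Snowflake Semantic View schema.

```python
from collections import defaultdict
from dataclasses import dataclass


# Hypothetical in-memory model of a semantic definition. The field names are
# illustrative only and are NOT the OSI or Semantic View schema.
@dataclass(frozen=True)
class Metric:
    name: str
    formula: str                  # human-readable recipe shown to business users
    facts: tuple[str, ...]        # row-level numeric columns the metric uses
    dimensions: tuple[str, ...]   # categories the metric may be grouped by


NET_REVENUE = Metric(
    name="net_revenue",
    formula="SUM(gross_revenue) - SUM(discount)",
    facts=("gross_revenue", "discount"),
    dimensions=("region", "order_date"),
)


def net_revenue_by(rows, dimension):
    """Aggregate net revenue per value of one declared dimension."""
    if dimension not in NET_REVENUE.dimensions:
        raise ValueError(f"{dimension!r} is not a declared dimension")
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row["gross_revenue"] - row["discount"]
    return dict(totals)


# Made-up sample rows, for illustration only.
rows = [
    {"region": "EMEA", "order_date": "2024-01-05", "gross_revenue": 100.0, "discount": 10.0},
    {"region": "EMEA", "order_date": "2024-01-06", "gross_revenue": 50.0, "discount": 0.0},
    {"region": "APAC", "order_date": "2024-01-05", "gross_revenue": 80.0, "discount": 5.0},
]

print(NET_REVENUE.formula)
print(net_revenue_by(rows, "region"))
```

Because every consumer calls the same definition, a dashboard and an ad hoc query cannot drift apart on what "Net Revenue" means, and grouping by an undeclared dimension fails loudly instead of silently producing a different number.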
The result: When the business trusts the quality of the meal, they become frequent patrons. They abandon their siloed "shadow spreadsheets" and confidently return to the open diner of Snowflake as their single, trusted source of truth.
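For readers who want to see what the "hygiene standards" translation ends in, the sketch below renders a Snowflake Dynamic Data Masking policy for a rule like "Mask PII for non-HR users". It is a hand-written illustration, not Collibra's actual generator: the helper functions, the policy name `pii_mask_non_hr`, the role `HR_ANALYST` and the table `hr.employees` are all assumptions. Only the `CREATE MASKING POLICY` and `ALTER TABLE ... SET MASKING POLICY` statements are standard Snowflake SQL.

```python
def masking_policy_sql(policy_name, allowed_roles, mask="***MASKED***"):
    """Render a masking policy that reveals a string column only to allowed roles.

    Identifiers are interpolated as-is, so this sketch assumes trusted input.
    """
    # Escape single quotes so role names are safe inside SQL string literals.
    roles = ", ".join("'" + role.replace("'", "''") + "'" for role in allowed_roles)
    return (
        f"CREATE OR REPLACE MASKING POLICY {policy_name} "
        f"AS (val STRING) RETURNS STRING ->\n"
        f"  CASE WHEN CURRENT_ROLE() IN ({roles}) THEN val ELSE '{mask}' END;"
    )


def apply_policy_sql(table, column, policy_name):
    """Render the statement that attaches the policy to one column."""
    return f"ALTER TABLE {table} MODIFY COLUMN {column} SET MASKING POLICY {policy_name};"


# Hypothetical names, for illustration only.
policy_sql = masking_policy_sql("pii_mask_non_hr", ["HR_ANALYST"])
attach_sql = apply_policy_sql("hr.employees", "email", "pii_mask_non_hr")

print(policy_sql)
print(attach_sql)
```

The point of the design is that the rule is stated once, at the policy level, and enforced by the platform at query time; users outside the allowed roles see the masked value without any change to their queries.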
For the chef (the data engineer): Context is key
The open kitchen isn't just for the diners; it empowers the chefs (Data / AI Engineers). In a closed kitchen, chefs work in a vacuum. If the menu changes, they are the last to know. Providing the necessary context takes more than a definition or two; the diagram below is representative of the level of semantic detail that needs to be governed and shared to enable the data / AI engineer.
With Collibra providing the semantic context on top of Snowflake, the chefs operate with precision:
- The Chef’s headset (The MCP Server): Imagine a sous-chef (your AI Agent) who needs to know the exact recipe for a "Revenue" metric. Instead of guessing, data engineers and AI agents can now rely on a Governed Semantic Layer. Because Collibra tracks this semantic layer, engineers know exactly which verified facts, metrics and dimensions to feed their Snowflake Cortex AI models. By leveraging Open Semantic Interchange (OSI) or Snowflake Semantic View formats, Collibra provides the exact semantic context needed. Whether a user is querying a Cortex AI agent or viewing a company BI dashboard, they get the same answer every time, which reduces the risk of AI hallucinations and ensures your models only "cook" with verified business facts.
- Recipe management (AI Governance): As your team builds models in Snowpark, Collibra’s AI Governance acts as the recipe book. We register every model, tracking its lineage back to the specific Snowflake training tables. If a dataset is flagged for quality issues, Collibra alerts the engineer immediately, preventing them from baking a model with spoiled ingredients.
- The Chef’s real-time prep-flow map (Agility): Businesses change fast. Because Collibra provides a live map of data dependencies, when the business pivots, engineers can instantly see which Snowflake tables, models and pipelines need to be updated. No more guessing.
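The lineage and "prep-flow map" ideas above boil down to a graph question: given one asset, what sits downstream of it? The following Python sketch answers that with a breadth-first traversal. The asset names and the `LINEAGE` dictionary are invented for illustration; in practice the edges would come from whatever lineage metadata your catalog exposes.

```python
from collections import deque

# Hypothetical lineage edges: upstream asset -> assets that read from it.
LINEAGE = {
    "RAW.ORDERS": ["ANALYTICS.ORDERS_CLEAN"],
    "ANALYTICS.ORDERS_CLEAN": ["ANALYTICS.REVENUE_DAILY", "ML.CHURN_TRAINING"],
    "ML.CHURN_TRAINING": ["model:churn_v2"],
    "ANALYTICS.REVENUE_DAILY": ["dashboard:net_revenue"],
}


def downstream(asset, lineage):
    """Return every asset reachable from `asset`, in breadth-first order."""
    seen = {asset}
    order = []
    queue = deque([asset])
    while queue:
        current = queue.popleft()
        for child in lineage.get(current, []):
            if child not in seen:  # guard against cycles and shared children
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


# If ORDERS_CLEAN is flagged for a quality issue, these assets are affected.
impacted = downstream("ANALYTICS.ORDERS_CLEAN", LINEAGE)
print(impacted)
```

The same traversal serves both bullets: run it from a flagged dataset to find the models trained on it, or from a table that is about to change to list the pipelines and dashboards that need review.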
The special sauce: The OSI advantage
Snowflake’s commitment to open standards, specifically Apache Iceberg and Polaris Catalog, means your data isn't locked away in a freezer you can't open. By leading the OSI (Open Semantic Interchange) effort, Snowflake ensures the focus remains on what matters most: the semantic context, not the format of a specific tool.
While Snowflake provides a powerful AI Data Cloud and interoperable architecture for high-performance data storage, analysis and AI development, Collibra complements this by serving as the active governance and semantic operating model. As engineers build AI models in Snowpark or deploy AI agents, Collibra supplies the critical semantic context—through the Model Context Protocol (MCP) or Open Semantic Interchange (OSI) formats—ensuring that generative AI applications rely on verified business facts rather than hallucinating. Ultimately, this combination allows organizations to fully leverage Snowflake's advanced AI and data capabilities because Collibra's consistent governance layer provides the necessary trust, transparency and context across all underlying data formats and infrastructure.
Whether your data resides in a native Snowflake table or an external Iceberg table managed by Polaris, Collibra’s governance layer remains consistent. We ensure that no matter how your infrastructure evolves, your visibility remains unbroken. This interoperability allows you to govern data where it lives, without expensive copying or complex translation layers.
Conclusion: Serving innovation faster
The "black box" era of data is over. In the age of Agentic AI, where models require impeccable data quality and context to function, you cannot afford to hide the kitchen.
By deploying Collibra alongside Snowflake, you are adopting a culture of transparency. You are enabling the business to trust the data and empowering engineers to understand the context. That is how you turn a data strategy into a 5-star experience.
Ready to get started? Discover how you can take your context in Collibra and enrich Snowflake Cortex with additional semantics and metadata. Collibra has published accelerators that can generate an OSI document or Snowflake Semantic View document using the context already in your Collibra instance.