2022 is almost over, but what a year it has been for the partnership between Collibra and Snowflake! We’ve witnessed the release and development of numerous offerings and integrations designed to deliver a better experience and greater value for our customers. The integrations between Snowflake and Collibra were born of an understanding that shared customers benefit from a more tightly integrated, end-to-end data governance experience that spans discovering, augmenting, protecting, and monitoring data. This year in particular, both companies focused on governing data policies and tagging, and on better lineage harvesting.
Snowflake has continued to innovate in support of better data governance in the Data Cloud, and Collibra is helping customers benefit from this new functionality, making it easier than ever to build out, govern, and discover all of the data needed by their various users and workloads.
Read on for a lightning round of information about each area of integration between Snowflake and Collibra. It is intended to whet your appetite and spark ideas for the new year!
Traditionally, Collibra supported lineage from Snowflake by parsing SQL scripts through a DB connection. This fits some use cases; however, some customers want lineage based on what transformations actually happened in the database through DML.
Snowflake solved this by creating its own system view to store and express platform lineage, which now supports lineage at the column level. Collibra will support this method of lineage harvesting in Q1 of 2023. Who better than Snowflake to know and track the data movements on its own data platform? And who better than Collibra to express those lineage entries (both business and technical) stitched across systems?
Empowering Collibra with Snowflake native lineage:
- Gives complete visibility of Snowflake data assets from source to destination
- Maintains consistent visibility and surfaces the right data context across sources
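As a rough illustration of what column-level lineage harvested from the platform can look like, the sketch below flattens one record into source-to-target column edges. It assumes the system view in question is Snowflake's `ACCESS_HISTORY` account usage view and that the record follows the shape of its `objects_modified` column. The sample record, the table and column names, and the `lineage_edges` helper are hypothetical, and this is not a description of how Collibra's harvester works.

```python
import json

# Hypothetical sample shaped like one OBJECTS_MODIFIED value from ACCESS_HISTORY.
# Table and column names are made up for illustration.
SAMPLE_OBJECTS_MODIFIED = json.loads("""
[
  {
    "objectDomain": "Table",
    "objectName": "ANALYTICS.PUBLIC.CUSTOMER_CLEAN",
    "columns": [
      {
        "columnName": "EMAIL",
        "directSources": [
          {"objectDomain": "Table",
           "objectName": "RAW.PUBLIC.CUSTOMER",
           "columnName": "EMAIL_ADDRESS"}
        ]
      },
      {
        "columnName": "FULL_NAME",
        "directSources": [
          {"objectDomain": "Table",
           "objectName": "RAW.PUBLIC.CUSTOMER",
           "columnName": "FIRST_NAME"},
          {"objectDomain": "Table",
           "objectName": "RAW.PUBLIC.CUSTOMER",
           "columnName": "LAST_NAME"}
        ]
      }
    ]
  }
]
""")


def lineage_edges(objects_modified, source_key="directSources"):
    """Return (source_column, target_column) pairs for each written column."""
    edges = []
    for obj in objects_modified:
        target_table = obj["objectName"]
        for col in obj.get("columns", []):
            target = f"{target_table}.{col['columnName']}"
            # Each written column lists the columns it was derived from.
            for src in col.get(source_key, []):
                source = f"{src['objectName']}.{src['columnName']}"
                edges.append((source, target))
    return edges


if __name__ == "__main__":
    for source, target in lineage_edges(SAMPLE_OBJECTS_MODIFIED):
        print(f"{source} -> {target}")
```

Edges in this form record what a DML statement actually wrote, which is the difference from lineage inferred by parsing SQL scripts.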
Snowflake includes an extensive set of native governance capabilities, expressed through SQL, for tag and data policy management. This includes setting dynamic policies at the column or row level, and the ability to use object tags to automatically apply masking policies.
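To make the tag-to-policy mechanism concrete, here is a minimal sketch that assembles the kind of Snowflake SQL involved in tag-based masking: create a tag, create a masking policy, attach the policy to the tag, then tag a column. The helper `tag_based_masking_sql`, the object names in the usage example, and the role list are all hypothetical. The sketch only builds strings, does not validate or quote identifiers, and is not meant to represent what any Collibra product generates.

```python
def quote_literal(value: str) -> str:
    """Wrap a value as a SQL string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def tag_based_masking_sql(tag, policy, allowed_roles, table, column, tag_value):
    """Build the statements for a tag-based masking setup (illustrative only).

    Identifiers are inserted as given, so callers must pass trusted names.
    """
    roles = ", ".join(quote_literal(role) for role in allowed_roles)
    return [
        f"CREATE TAG IF NOT EXISTS {tag};",
        # The policy reveals the value to allowed roles and masks it otherwise.
        f"CREATE MASKING POLICY IF NOT EXISTS {policy} "
        f"AS (val STRING) RETURNS STRING -> "
        f"CASE WHEN CURRENT_ROLE() IN ({roles}) THEN val "
        f"ELSE '***MASKED***' END;",
        # Attaching the policy to the tag protects any STRING column
        # that carries the tag.
        f"ALTER TAG {tag} SET MASKING POLICY {policy};",
        f"ALTER TABLE {table} MODIFY COLUMN {column} "
        f"SET TAG {tag} = {quote_literal(tag_value)};",
    ]


if __name__ == "__main__":
    statements = tag_based_masking_sql(
        tag="governance.tags.pii",
        policy="governance.policies.mask_pii",
        allowed_roles=["DATA_STEWARD"],
        table="sales.public.customers",
        column="email",
        tag_value="email",
    )
    print("\n".join(statements))
```

Once the policy is attached to the tag, newly tagged columns of the matching data type pick up the masking policy without a separate per-column policy assignment.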
A new aspect of Collibra’s core data governance solution is the ability to define, agree on, and publish enterprise data policies. This new offering was released in beta at the Snowflake Summit this past summer and became generally available at Collibra’s Data Citizens ’22 event in San Diego in November.
Collibra Protect has been well received for its ease of use and its unified platform approach, which leverages data classifications to build policies for protecting sensitive data. The policies are centrally created and managed within Collibra Protect and pushed down to Snowflake for on-platform enforcement. As new sensitive data is discovered and tagged, policies are automatically applied, ensuring security and compliance for the data.
Data Quality and Observability
Collibra Data Quality and Observability is a machine-learning-driven offering that learns about your data and finds anomalies without the lengthy rules-creation process that typically stalls data quality projects. Collibra Data Quality and Observability has aided Snowflake adoption by letting shared customers cleanse and govern data before their migration project and then transition seamlessly into post-migration with quality data. It has also empowered shared customers to translate data governance standards into data quality rules, ensuring the business gets complete, consistent, and accurate information.
In practice, Collibra Data Quality and Observability has supported Snowflake since 2019 using an external processing environment. This has meant moving data off-platform and managing a Spark execution environment separately.
This fall Collibra invited a select number of customers to beta-test a new method for pushing DQ processing into Snowflake, taking advantage of Snowflake’s inherent performance and scalability. This new pattern drives faster time to value and a simpler technical footprint for customers.
Collibra Data Quality Pushdown for Snowflake allows customers to:
- Reduce privacy risk and overall costs by eliminating the risk and egress charges of moving all your data out of the Snowflake environment for DQ processing
- Get faster time to value by eliminating dependencies on Spark compute and increasing processing speed
- Improve agility with the ability to achieve unlimited scale for computational needs based on demand by harnessing the power of Snowflake
Collibra Everywhere browser extension
With the Collibra Everywhere browser extension, data consumers receive the information they need to trust and understand data without the effort of navigating to the Collibra system. Business context about data conveniently surfaces within Snowflake Data Cloud, as well as in other popular SaaS applications like Tableau, putting relevant information such as key definitions, report attributes, and related tables and data products at the fingertips of data consumers.
A common value driver for Collibra’s catalog customers is the enhanced ability to adopt new technologies more quickly. Powering the ecosystem surrounding Snowflake will be a common theme for 2023 with content and shared use-cases being showcased as much as possible. Look for content from Matillion, Tableau, Protegrity, and others with specific Collibra + Snowflake ecosystem branding.
Collibra internal Data Marketplace
Finding data can be challenging for data consumers, especially when there are potentially thousands of tables & data products to sift through. Similarly, even publishing data products can be a daunting task as companies adopt a data mesh operating model.
Data Marketplace enables data owners & stewards to publish data products, reports, tables, and datasets into a structured shopping experience with configurable gates for Data Quality, enhanced metadata around stewardship and governance approval flows, and much more.
Collibra’s marketplace also enables consumers to easily discover data products through a curated self-service portal. By leveraging a simple search interface that is powered by the metadata and metadata relations, casual data users can browse, preview, and request access to data within Snowflake Data Cloud. This will make the data already cataloged in Collibra much more valuable as users are able to easily find what they need.
Governance & technical workflows
The last mile of a data-shopping experience still holds a lot of untapped value. Finding, selecting, and approving data products is hard enough, but what follows is often a multi-faceted, gated process flow with several possible outcomes.
Collibra’s new beta workflow function breathes life into the last mile and lets you bring the right functionality at the right time for the right purpose. This will make it much easier and faster for customers to build out workflows for granting access to data within Snowflake, which will let governance and shopping experiences scale.
Here’s to a terrific 2023!
We hope you’ve enjoyed this quick jaunt through Collibra and Snowflake’s shared landscape and that we’ve sparked an idea or two for the coming year.
If you’d like to learn more about any functional area above, please reach out to your existing account team, or click here.