Collibra and Databricks for open, governed data sharing

In today’s digital economy, data sharing is more important than ever. With data at the center of business decisions, organizations must be able to collaborate around their data through data sharing agreements. Data sharing allows companies to easily and securely exchange data with customers, partners and suppliers, enabling business users to unlock the value of their data. It expands an organization’s access to data, so business users can draw on a wider breadth of external sources to make more informed business decisions. Access to more data also helps organizations stay in step with their competitors by adjusting customer experiences based on more business data and a deeper understanding of their customers. Data sharing reduces costs as well: it provides direct access to data outside an individual enterprise’s data network and eliminates the operational costs of setting up new data streams and ingestion channels. Ultimately, data sharing ensures that customers, partners, suppliers and business users can collaborate around trustworthy data, and it reduces the time to insight on important decisions.

But data sharing poses its own challenges. Due to the lack of a standards-based data sharing protocol, data sharing solutions have historically been tied to a single vendor or commercial product. This introduces vendor lock-in risks and prevents seamless data sharing across an organization’s network. Privacy requirements and global regulations can also restrict data sharing. Organizations must implement a governance and privacy solution to monitor and approve data sharing requests so that they remain compliant and avoid fines.

Collibra provides the governance solution for data sharing, while Databricks enables the sharing itself through Delta Sharing, a new open source project for secure data sharing. Delta Sharing, Databricks’ open data sharing solution, is part of the Delta Lake project, meaning the open standard will be usable by any vendor. It offers a secure way for data providers and data consumers to exchange and analyze shared data sets with the tools and systems they prefer. Because the Delta Sharing protocol is based on proven, open, scalable and cost-effective standards such as REST and Parquet, it avoids the cumbersome, expensive and proprietary techniques that the industry has relied on for data sharing so far. Delta Sharing helps the enterprise collaborate and share data in a controlled environment. It reduces the complexity of data sharing workflows and helps ensure trust in the data.
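To make the consumer side of this concrete, here is a minimal sketch in Python, assuming the open source delta-sharing connector is installed and the data provider has supplied a profile file. The SharedTable class, the helper names, and the share, schema and table names in the usage note are hypothetical and chosen only for illustration; the sketch is not a description of how Collibra or Databricks configure sharing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SharedTable:
    """Coordinates of one table inside a share (hypothetical helper type)."""
    share: str
    schema: str
    table: str


def table_url(profile_path: str, t: SharedTable) -> str:
    # The connector addresses a shared table as
    # "<profile-file>#<share>.<schema>.<table>".
    return f"{profile_path}#{t.share}.{t.schema}.{t.table}"


def load_shared_table(profile_path: str, t: SharedTable):
    # Imported lazily so table_url() works without the connector installed.
    # Assumes: pip install delta-sharing
    import delta_sharing

    # Fetches the shared table over the REST protocol and returns
    # it as a pandas DataFrame.
    return delta_sharing.load_as_pandas(table_url(profile_path, t))
```

With a profile file named config.share, calling load_shared_table("config.share", SharedTable("sales_share", "retail", "orders")) would request the table addressed as config.share#sales_share.retail.orders. Whether that request succeeds depends on the access the provider has granted.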

How Collibra supports Delta Sharing

Like Databricks, Collibra recognizes the importance of data sharing. This is why we are glad to partner with Databricks to support Delta Sharing for our joint customers. The joint capabilities will enable governed data sharing: customers will be able to understand how data is being shared, who has access to what data, and for what purpose. This will strengthen their ability to govern data shares and access across platforms, systems, enterprises and users.

In addition to the newly developed capabilities, our current data governance, data catalog and data privacy capabilities help organizations discover, sample and govern shared data sets. More specifically, our data catalog provides visibility into how data is being shared across complex data silos. Through data lineage, users can see how data transforms and flows from source to destination across its entire lifecycle. This visibility further enables data consumption for cross-platform, cross-business-unit and, ultimately, cross-enterprise analytics.

Our Collibra Data Shopping capability provides collaborative and efficient workflows that help share data across the organization. Data consumers can easily shop for and request data, while data owners and designated privacy managers can review and approve those requests. Furthermore, with privacy by design, Collibra enables compliant data sharing. Data consumers requesting data must sign off on a checklist of conditions for its compliant use, such as restrictions on sharing, proliferating, hoarding and unauthorized use of the requested data. Together, Collibra Data Lineage, Collibra Data Shopping and Collibra Data Privacy ensure the compliant sharing of data inside and outside an organization. We are very excited about this new Databricks capability and look forward to our continued collaboration and partnership to unlock the value of data.
