Many enterprises are struggling to keep up with a growing data landscape that keeps adding complexity and makes data intelligence much harder to achieve. We’re excited to announce several new integrations and capabilities to help organizations accelerate their efforts to get the visibility and insight they need across their data.
As digital transformation remains a top initiative, the ongoing adoption of cloud platforms adds yet another dimension of complexity. According to IDC, 63% of organizations are migrating data to the cloud, and 95% are integrating their data across a hybrid cloud. As this move to the cloud continues, the volume of data and the number of data sources keep growing as well.
Alongside this hybrid cloud approach, IDC also reports that 65% of organizations are using at least 10 different data engineering tools to manage and control their data. As if getting a grip on hundreds of data sources weren’t hard enough, relying on scores of tools and software only creates more information silos.
Introducing new integrations
This is where a single data intelligence platform can help all teams easily discover and access trusted data. The first step in wrangling today’s data landscape is to connect to the variety of data sources across the enterprise. The depth and breadth of integrations is a core component of the Collibra Data Intelligence Cloud platform. The newly introduced beta integrations with Snowflake, Azure Data Factory, and Google Cloud Storage continue to expand the foundation on which organizations can connect to and get visibility into their data. In addition, the latest Sample Visualization capabilities let users view sample data on Edge data sources so they can better understand their data and determine whether it is right for their needs.
The first of these new integrations is with Snowflake. The previous integration supported Snowflake lineage by connecting to the SQL scripts within the Snowflake ecosystem and automatically parsing the lineage. With this release, we’re enhancing that functionality by tapping into Snowflake’s new Access History feature. This allows lineage to be extracted automatically from all the transformations that run on the Snowflake platform, regardless of what is orchestrating them: SQL scripts, stored procedures, ETL tools, or Python scripts. With this enhanced integration, organizations get much more comprehensive end-to-end lineage and visibility into all activities that occur within the Snowflake ecosystem.
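To make the idea concrete, here is a minimal sketch of how table-level lineage can be derived from access-history records. It is not Collibra’s implementation. The rows are hypothetical samples shaped like Snowflake’s ACCESS_HISTORY view, and the column and JSON field names used (BASE_OBJECTS_ACCESSED, OBJECTS_MODIFIED, objectName) are assumptions to check against the Snowflake documentation. No Snowflake connection is made.

```python
import json

# Hypothetical rows, shaped like records from Snowflake's ACCESS_HISTORY view.
# The object lists are stored as JSON, so they are encoded as JSON strings here.
SAMPLE_ROWS = [
    {
        "QUERY_ID": "q-001",
        "BASE_OBJECTS_ACCESSED": json.dumps([
            {"objectName": "RAW.SALES.ORDERS", "objectDomain": "Table"},
            {"objectName": "RAW.SALES.CUSTOMERS", "objectDomain": "Table"},
        ]),
        "OBJECTS_MODIFIED": json.dumps([
            {"objectName": "ANALYTICS.MART.ORDER_SUMMARY", "objectDomain": "Table"},
        ]),
    },
    {
        # A read-only query: it touches data but writes nothing, so no lineage.
        "QUERY_ID": "q-002",
        "BASE_OBJECTS_ACCESSED": json.dumps([
            {"objectName": "ANALYTICS.MART.ORDER_SUMMARY", "objectDomain": "Table"},
        ]),
        "OBJECTS_MODIFIED": json.dumps([]),
    },
]


def _as_list(value):
    """Accept a JSON string, an already-decoded list, or None."""
    if value is None:
        return []
    if isinstance(value, str):
        return json.loads(value)
    return list(value)


def _names(objects):
    """Keep only entries that carry an object name."""
    return [o["objectName"] for o in objects if "objectName" in o]


def lineage_edges(rows):
    """Return sorted (source, target) pairs: what each write was read from."""
    edges = set()
    for row in rows:
        sources = _names(_as_list(row.get("BASE_OBJECTS_ACCESSED")))
        targets = _names(_as_list(row.get("OBJECTS_MODIFIED")))
        for target in targets:
            for source in sources:
                if source != target:
                    edges.add((source, target))
    return sorted(edges)


for source, target in lineage_edges(SAMPLE_ROWS):
    print(f"{source} -> {target}")
```

Because the records describe what each query read and wrote, the same logic applies whether the query came from a SQL script, a stored procedure, an ETL tool, or a Python script.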
Today we have comprehensive coverage for Microsoft on-premises tools as well as Azure ecosystem tools. We’ve extended this coverage with the new Azure Data Factory integration. This integration with Azure’s cloud ETL service automatically harvests and stitches lineage from Azure Data Factory. It extracts the lineage produced by the workflows and pipelines that Azure Data Factory orchestrates, so that you can provide a complete picture of data flow from source to destination.
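Stitching can be pictured as joining individual pipeline steps wherever one step’s output is another step’s input. The sketch below illustrates that idea only. The activity records and dataset names are hypothetical and deliberately simplified; they do not follow Azure Data Factory’s pipeline JSON schema, and this is not how the Collibra integration is implemented.

```python
from collections import defaultdict

# Hypothetical, simplified steps: each one copies data from a source to a sink.
ACTIVITIES = [
    {"pipeline": "ingest", "source": "sqlserver.crm.accounts", "sink": "adls.raw.accounts"},
    {"pipeline": "transform", "source": "adls.raw.accounts", "sink": "adls.curated.accounts"},
    {"pipeline": "publish", "source": "adls.curated.accounts", "sink": "synapse.dw.dim_account"},
]


def stitch(activities):
    """Build a graph in which each dataset points at the datasets fed from it."""
    graph = defaultdict(set)
    for activity in activities:
        graph[activity["source"]].add(activity["sink"])
    return graph


def end_to_end_paths(activities, start):
    """List every path from `start` to a dataset with no further consumers."""
    graph = stitch(activities)
    paths = []

    def walk(node, path):
        # Skip datasets already on the path so a cycle cannot loop forever.
        next_nodes = sorted(n for n in graph.get(node, ()) if n not in path)
        if not next_nodes:
            paths.append(path)
            return
        for nxt in next_nodes:
            walk(nxt, path + [nxt])

    walk(start, [start])
    return paths


for path in end_to_end_paths(ACTIVITIES, "sqlserver.crm.accounts"):
    print(" -> ".join(path))
```

Three separate pipelines become one source-to-destination path once their shared datasets are matched up.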
The Google Cloud Platform has become an integral part of the data ecosystem of many organizations. We’ve introduced the Google Cloud Storage (GCS) integration as we continue to expand our support of this important cloud platform. This latest integration allows metadata to be retrieved, mapped, and ingested into Collibra from GCS buckets, directories, and files. This metadata integration enables the discovery and governance of GCS within Collibra. Organizations with GCS assets can better scale access, accelerate analytics, and more confidently comply with policies.
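As a rough illustration of the mapping step, the sketch below turns a flat list of object names into bucket, directory, and file entries. The bucket name, object names, and the `map_assets` helper are all hypothetical, and the output shape is an assumption for illustration rather than Collibra’s asset model. In practice the names would come from a listing call against the bucket; here they are hard-coded so the example runs without credentials.

```python
def map_assets(bucket, object_names):
    """Derive (path, kind) entries for a bucket, its directories, and its files.

    GCS object names are flat strings; directories are implied by the
    slash-separated prefixes, so they are reconstructed from the names.
    """
    assets = {(bucket, "bucket")}
    for name in object_names:
        parts = [p for p in name.split("/") if p]
        # A name ending in "/" is treated as a folder placeholder, not a file.
        is_folder_marker = name.endswith("/")
        dir_parts = parts if is_folder_marker else parts[:-1]
        for depth in range(1, len(dir_parts) + 1):
            assets.add((bucket + "/" + "/".join(dir_parts[:depth]), "directory"))
        if parts and not is_folder_marker:
            assets.add((bucket + "/" + "/".join(parts), "file"))
    return sorted(assets)


# Hypothetical object names for a hypothetical bucket.
names = ["2023/orders/jan.csv", "2023/orders/feb.csv", "readme.txt"]
for path, kind in map_assets("sales-data", names):
    print(f"{kind:9} {path}")
```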
Introducing Sample Visualization
Last but not least, we’ve extended our sample visualization capabilities to data sources that are connected via our Edge infrastructure. Where sample visualization was previously only available on JobServer, it is now enabled for Edge data sources. The Edge service gathers the samples and securely stores them in the local Edge cache, which avoids the security concerns that come with storing samples in the cloud for a prolonged period. The samples are stored only for a short period, after which they are cleared. For additional security, any sensitive data that should not be viewed by the user is also masked. Catalog users can now see data samples from their Edge data sources, so they can get a better understanding of the data and determine whether it is the right data for their needs. Sample visualization on Edge is available as part of the Collibra Data Intelligence Cloud in the recent November release.
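The two safeguards described above, short-lived storage and masking, can be sketched in a few lines. This is an illustration of the pattern only, not the Edge implementation: the `SampleCache` class, the 60-second lifetime, the mask string, and the choice to mask values at the moment they are cached are all assumptions.

```python
import time

MASK = "*****"


class SampleCache:
    """Hold masked sample rows for a limited time, then drop them."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._entries = {}   # table name -> (expiry time, masked rows)

    def put(self, table, rows, sensitive_columns=()):
        """Mask sensitive columns, then store the rows with an expiry time."""
        hidden = set(sensitive_columns)
        masked = [
            {col: (MASK if col in hidden else value) for col, value in row.items()}
            for row in rows
        ]
        self._entries[table] = (self._clock() + self._ttl, masked)

    def get(self, table):
        """Return the cached rows, or None if missing or expired."""
        entry = self._entries.get(table)
        if entry is None:
            return None
        expires_at, rows = entry
        if self._clock() >= expires_at:
            del self._entries[table]  # clear the expired sample
            return None
        return rows


cache = SampleCache(ttl_seconds=60)
cache.put(
    "customers",
    [{"id": 1, "email": "a@example.com"}],
    sensitive_columns=["email"],
)
print(cache.get("customers"))
```

Masking before the rows are stored means the unmasked values never sit in the cache at all, which is one reasonable way to combine the two safeguards.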
Getting visibility and insight into today’s web of data must start with connecting to the data. With the right integrations that deliver a clear understanding of how data transforms, flows, and is used from system to system, organizations can begin to deliver data intelligence and extract the value locked within their data.