We live in an age where data touches every aspect of our world. Whether it’s suggesting a product while you shop online or calculating insurance rates based on your driving behavior, data is the common denominator. Now more than ever, getting complete visibility into data and extracting insight from it is critical to making the right business decisions.
However, most companies lack high-quality data that they can understand and trust to drive the business forward. In addition, many companies do not publish data lineage information that would help people understand how data flows throughout the organization. This is where the integration between Collibra and Matillion, a leading enterprise ETL solution, can help enterprises get a complete view of end-to-end lineage so they can confidently trust and leverage their data assets.
Data lineage helps users understand how data flows. This matters most when data moves through multiple systems within an organization and can grow complex as it passes through operational applications. The data is then extracted and loaded into a data warehouse or data lake, transformed further into an analytical layer, and eventually consumed in the form of reports and dashboards. Data warehouse logs can describe the transformations that occur inside the warehouse itself, but that alone is not enough to deliver end-to-end lineage or its benefits.
A recent IDC survey asked, “What do you expect to know about data when making data-driven decisions?” 78% of respondents answered that data lineage was needed to drive their decisions. But with the complexity of today’s data environments, organizations have been slow to address this need. The traditional approach required developers to document lineage mappings manually. It does not scale, because these mappings tend to become stale very quickly.
The right approach starts with building a strong foundation of data governance, with data lineage as a core component, in order to provide a complete view of and control over data. Collibra provides a data governance foundation in several ways. First, Collibra helps govern data by providing visibility into existing data assets, so the right data sources can be chosen for a data integration project and redundant integrations can be avoided. Next, newly created assets are cataloged, classified and associated with the proper business context. Finally, by enabling automatic lineage extraction, Collibra gives users complete visibility and transparency into where their data originated.
This is where the Collibra integration with Matillion proves its value. Collibra ingests Matillion ETL transformation jobs to create a unified data lineage diagram that shows how data flows into and out of Matillion, and it stitches data objects in Matillion to existing assets in the Collibra data catalog. Users can drill down further to understand the transformations applied along the way, and conduct comprehensive impact analysis of how any database change affects downstream systems. This complete lineage also allows organizations to produce reports to meet compliance requirements.
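To make the idea of impact analysis concrete, here is a minimal sketch in Python. It does not use the Collibra or Matillion APIs; it simply assumes lineage has already been extracted as a list of upstream-to-downstream pairs. The asset names and the functions `build_graph` and `downstream_impact` are hypothetical and exist only for illustration.

```python
from collections import defaultdict, deque

# Hypothetical lineage edges: (upstream asset, downstream asset).
# These names are invented for illustration, not taken from a real catalog.
LINEAGE_EDGES = [
    ("crm.customers", "staging.customers"),
    ("erp.orders", "staging.orders"),
    ("staging.customers", "warehouse.dim_customer"),
    ("staging.orders", "warehouse.fact_orders"),
    ("warehouse.dim_customer", "report.customer_churn"),
    ("warehouse.dim_customer", "report.revenue_dashboard"),
    ("warehouse.fact_orders", "report.revenue_dashboard"),
]


def build_graph(edges):
    """Map each asset to the list of assets that read directly from it."""
    graph = defaultdict(list)
    for upstream, downstream in edges:
        graph[upstream].append(downstream)
    return graph


def downstream_impact(graph, changed_asset):
    """Return every asset reachable downstream of changed_asset, sorted by name."""
    seen = set()
    queue = deque([changed_asset])
    while queue:
        current = queue.popleft()
        # .get avoids adding empty entries to the defaultdict for leaf assets
        for child in graph.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(seen)


if __name__ == "__main__":
    graph = build_graph(LINEAGE_EDGES)
    for asset in downstream_impact(graph, "staging.customers"):
        print(asset)
```

In this sketch, a change to `staging.customers` reaches the customer dimension and both reports that depend on it, while the orders tables are untouched. A breadth-first traversal is used so that each asset is visited once even when several paths lead to it.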
The confidence and trust that you have in your data can only go as far as your view and understanding of that data. A manual approach simply can’t scale with the volume of data and the growing ecosystem of data sources that organizations must connect to today. It’s important to automate the extraction of data lineage from data integration jobs. This is why the close partnership between Collibra and Matillion to automatically extract lineage is crucial: combined with other automated lineage integrations, it gives organizations a complete view of their data and the confidence to trust it.