ATB Financial provides a diversified set of financial services to more than 770,000 residents of Alberta, Canada. Being a regionally focused institution, the group is dedicated to knowing its customers intimately, understanding their needs and providing products and services that help them achieve their goals.
Delivering on that mission requires a wide variety of data-driven decisions. From the smallest tactical actions taken in response to real-time triggers through to strategic decisions based on long-term market trends, data holds the key to better decision making.
Documenting tribal knowledge
Like most financial organizations, ATB’s technology architecture included a wide variety of disparate systems that were difficult to navigate. Detailed knowledge of the way data flowed through those systems was tied up with individuals. The organization sought a way to replace that tribal knowledge with a system that could drive Data Intelligence in a more automated fashion. “We purchased Collibra to solve that problem. It was a fantastic solution for us in enabling us to track source, lineage and classification of data in a central location as opposed to disparate repositories,” says Chris Fergusson, data management analyst at ATB Financial.
With the initial Collibra implementation having proved a success, ATB made the decision in 2019 to modernize its data infrastructure. A cloud-based solution was deemed the only logical path forward. After evaluating leading providers, ATB selected Google’s BigQuery as its platform of choice.
“The number one selling feature for us was power. Just as an example, we had one financial reporting query that took about eight hours to run on our legacy on-prem system. It ran in 23 minutes the first time we tried it on BigQuery,” says Fergusson. In addition to power, several other features attracted ATB to the cloud. Total cost of ownership, stability and availability were all contributing factors, as was the variety of tools available to support data preparation and analysis. Information security was also seen as a positive. “Cloud has great auditing. Everything is encrypted. And it is a very secure platform. Especially if you consider the uniformity of the security,” says Fergusson.
Driving agile data operations
Another key advantage of cloud infrastructure is its potential to support agile development and operations practices, commonly known as DevOps. To unlock that potential, ATB set about leveraging its Collibra platform to drive automation of its data pipelines.
Data pipelines are typically split into two broad categories. The first consists of high-volume, scheduled deployments, where dozens or potentially hundreds of tables are ingested in a repeatable fashion (typically biweekly), which makes the workflows more straightforward to manage. The second category supports custom development, where developers lodge ad hoc requests to ingest new tables (or updated table versions).
In both cases, the goal is to automate the processes by which data is classified and dynamically masked. The automations cover three steps. First, table-related metadata is automatically ingested from both Google BigQuery and Google Data Catalog into Collibra. Second, Collibra uses that metadata to automatically classify data at the columnar level. Third, those classifications are pushed back out to the dynamic data masking solution, which ensures data consumers can only view data under permitted circumstances. “As a failsafe default, our dynamic masking solution masks the data in the table until Collibra’s classifications un-mask them,” says ATB Financial data governance manager Warren Whale.
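The three steps above can be illustrated with a minimal Python sketch. It is not ATB's or Collibra's implementation: the token-based rules in `CLASSIFICATION_RULES`, the classification labels and the function names are all hypothetical, and a real deployment would read metadata from BigQuery and Data Catalog rather than from an in-memory dictionary. The point it shows is the failsafe default, where a column stays masked until a classification explicitly un-masks it.

```python
from dataclasses import dataclass

# Hypothetical rules: a token found in a column name maps to a classification label.
CLASSIFICATION_RULES = {
    "sin": "restricted",
    "name": "confidential",
    "email": "confidential",
    "branch": "public",
    "region": "public",
}


@dataclass(frozen=True)
class Column:
    table: str
    name: str


def ingest_metadata(schema):
    """Step 1: flatten {table: [column names]} metadata into column records."""
    return [Column(table, col) for table, cols in schema.items() for col in cols]


def classify(column):
    """Step 2: return a classification label, or None if no rule matches."""
    for token in column.name.lower().split("_"):
        if token in CLASSIFICATION_RULES:
            return CLASSIFICATION_RULES[token]
    return None


def build_masking_policy(columns):
    """Step 3: produce the policy handed to the masking layer.

    Failsafe default: a column is masked unless it is classified "public".
    Unclassified columns (label None) therefore stay masked.
    """
    policy = {}
    for column in columns:
        label = classify(column)
        policy[(column.table, column.name)] = {
            "classification": label,
            "masked": label != "public",
        }
    return policy


# Illustrative schema; the table and column names are made up.
schema = {"customers": ["customer_name", "sin", "branch", "loyalty_score"]}
policy = build_masking_policy(ingest_metadata(schema))
```

In this sketch `loyalty_score` matches no rule, so it remains masked even though nothing marks it as sensitive, which mirrors the behaviour Whale describes.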
Demonstrating clear benefits
The automations enabled by Collibra have already succeeded in reducing processing time from “days to minutes,” notes Whale. As a result, Collibra’s role within the organization has “dramatically expanded” from being predominantly focused on documenting data assets to “now being a critical step in the DevOps process,” he adds.
Automating data pipelines has also provided a number of ancillary benefits to the organization:
Improved development frequency: Biweekly pipeline releases are automated, with hundreds of tables ingested and classified at a time, leading to increased productivity for data engineers.
Faster time to delivery: New tables, views and tests are turned around much faster, so data engineers can continue innovating at pace.
Lower failure rate of new releases: By adopting standardized classifications, ATB Financial has been able to reduce issues relating to data inconsistencies.
Shorter lead times between fixes: Rapid classification of data has also allowed data engineers to incorporate user feedback quickly.
Improved mean time to recovery: Finally, Collibra-generated scripts have been incorporated into disaster recovery plans, thereby helping to lower operational risk.
Self-service access to trusted and governed data
It is important to note that the ability to automate data pipelines and support innovation through more agile operating models has only been possible because of the way ATB has implemented controls over data access, even when the data request is through a BI tool like Tableau. “Instead of having users get access to systems based on their request, we wanted a data lake where all employees had access, but their access was controlled and limited by their role,” says Fergusson. “[With Collibra and BigQuery], data is categorized at the columnar level so we can trust that people will not have access to anything that is either sensitive (based on their role) or that hasn’t been categorized yet. It is fail safe. If our data governance team has not categorized the data then it does not get exposed, making it a very safe environment for sensitive data,” he adds.
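The role-based, column-level access Fergusson describes can be sketched in a few lines of Python. This is an assumption-laden illustration rather than the BigQuery or Collibra mechanism: the role names, the `ROLE_CLEARANCE` mapping and the `visible_row` helper are hypothetical. It shows the two conditions under which a value is hidden: the role is not cleared for the column's classification, or the column has not been categorized yet.

```python
# Hypothetical mapping from role to the classifications that role may see.
ROLE_CLEARANCE = {
    "analyst": {"public"},
    "fraud_investigator": {"public", "confidential", "restricted"},
}

MASK = "****"


def visible_row(row, classifications, role):
    """Return a copy of `row` with every non-permitted value masked.

    A column with no entry in `classifications` has label None, which is
    never in a clearance set, so uncategorized data is always masked.
    An unknown role gets an empty clearance set and sees nothing.
    """
    allowed = ROLE_CLEARANCE.get(role, set())
    return {
        col: value if classifications.get(col) in allowed else MASK
        for col, value in row.items()
    }


# Illustrative data; "new_col" stands in for a column not yet categorized.
row = {"branch": "Calgary", "sin": "123", "new_col": "x"}
classifications = {"branch": "public", "sin": "restricted"}

analyst_view = visible_row(row, classifications, "analyst")
investigator_view = visible_row(row, classifications, "fraud_investigator")
```

Every employee can query the same table, but what comes back depends on the role, and `new_col` stays hidden from both roles until the governance team categorizes it.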
Encouraging adoption and further innovation
Rolling out the new cloud analytics platform has gone smoothly and taken half the time of a comparable on-premises implementation. Focus has now turned to migrating users over to the new platform. A key part of that migration is ensuring that enough education, training and support are available to users.
The core data governance team has a range of responsibilities, including administering the platform, investigating instabilities, expanding capabilities and helping with system integrations. However, one of its most important roles is providing support to the user community. “When things go wrong users need somewhere to go to. Having a central body of expertise helps with that,” says Fergusson.
In addition to driving adoption, the team is continuing to investigate new ways to make its processes even more streamlined. “We are continuing to deploy additional automations as the DevOps process further evolves and look forward to using new capabilities from Collibra,” says Whale.