Data Lakes in the Cloud?

As a metaphor, it seems to defy the laws of physics. But today, more and more organizations are looking at how moving their data lake to the cloud can help them discover new efficiencies, lower costs, and drive more value from their data assets. Hadoop, the platform that arguably made the data lake possible, has given organizations much-needed scalability and processing power as their data sources continue to multiply. But while Hadoop-based data lakes are great at allowing organizations to scale up, they haven’t been nearly as good at allowing organizations to scale down as data needs fluctuate. And as data becomes increasingly cloud-based, so do the services that organizations call on to parse and analyze that data and turn it into real business intelligence.

While these limitations aren’t necessarily deal-breakers for many organizations, others are looking for ways to apply the benefits of the cloud to their data lakes. Some are moving their data lake entirely to the cloud; others are looking at ways to manage data across a hybrid environment where their data lakes are both cloud-based and on-prem.

The goal of most organizations, of course, is to make good data more accessible, from any device and any location, to the people who need it to do their jobs. The best of today’s cloud platforms rely on object storage, which provides economies of scale, high availability across distributed networks and regions, and, perhaps most importantly, the metadata and unique identifiers your data users need to discover and make sense of all that data.

And that’s a great thing, because it opens the door to a better way to govern your data, which is the first step in driving real value from your data lake. With a way to apply governance and implement a governed data catalog across your data lake ecosystem, your data users are empowered to find the data they need from any system (remote desktop, mobile phone, or IoT device), understand the data they find, and trust that they have the best data for business-critical projects. They will also be better able to collaborate with each other to improve existing data assets or create new assets that will drive real business intelligence.

Collibra

The Data Intelligence company

We accelerate business outcomes by delivering accurate, trusted data for every use, for every user and across every source.

Nov 30, 2018 · 2 mins read
