How to create a data product: the 8-step process

One of the goals of our Data Office is to create data products that support our business. These can be simple dashboards, AI-driven models, or recommendation engines. If you are new to data products and wondering how to best start, which people you need to involve, and what steps to take, you’ve come to the right place. In this blog we’ll walk you through our creation process in the context of a data product we recently deployed for our colleagues.

Our 8-step process

These eight steps are crucial to making successful and valuable data products.

1. Identify the ask from the business

The data product process should always start with identifying a need from the business. Unless you identify the business need, you risk wasting resources on something that never gets used and thus adds no value. Storage and compute may be cheaply available, but the time and attention of a good data team will always be a scarce resource.

2. Define the data product owner 

Data product owners are responsible for identifying target users, deciding when and how those users are trained in using the data product, and determining how it will fit into the business process. Since a data product can potentially expose confidential data, owners must also decide who gets access to it. They are also responsible for the internal data sharing agreement and the data quality.

3. Prioritize 

For prioritization, it is key to interact with the different stakeholders, even if only to set the right level of expectations. There is no silver bullet to determine what the most important data product is, but data product owners must consider the value, cost, time, and purpose of the data product.
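
As an illustration only, the value, cost, and time considerations can be turned into a rough score that gives the stakeholder discussion a starting point. The sketch below is not our actual method: the `Candidate` class, the 1 to 5 scales, the scoring formula, and the backlog entries are all hypothetical, and purpose is deliberately left out because it is a qualitative judgment.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A hypothetical data product candidate, scored on assumed 1-5 scales."""
    name: str
    value: int  # expected business value, 1 (low) to 5 (high)
    cost: int   # expected build and run cost, 1 (low) to 5 (high)
    time: int   # expected time to deliver, 1 (short) to 5 (long)


def priority_score(c: Candidate) -> float:
    """Higher is better: value divided by combined cost and time (assumed formula)."""
    return c.value / (c.cost + c.time)


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Return candidates ordered from highest to lowest score."""
    return sorted(candidates, key=priority_score, reverse=True)


# Hypothetical backlog, for illustration only.
backlog = [
    Candidate("churn dashboard", value=4, cost=2, time=2),
    Candidate("recommendation engine", value=5, cost=5, time=5),
    Candidate("contract acceptance report", value=3, cost=1, time=1),
]

for c in rank(backlog):
    print(f"{c.name}: {priority_score(c):.2f}")
```

A score like this does not replace the conversation with stakeholders; it only makes the trade-offs explicit so they can be challenged.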

4. Iterate the prototype, evolve the requirements

We start this step by bringing the data product builder(s) together with the owner to get details on all requirements: what insights are needed, what data should or could be used, who the target audience is, how often the data product will be used, when it is due, and how it should reach the audience.

With these requirements clearly defined, the builder can start sketching the data model and identifying what data may need to be moved into the data platform.

Then, the different iterations of the data product prototype are shared and discussed with the owner, typically leading to adjusted requirements. This agile working method makes it possible to identify previously unknown requirements. 

5. Create the data product 

First, we must choose a data architecture that can reliably serve up the data product. You’ll need to include colleagues from legal and information security for this because you need to consider data privacy and data protection. 

In our data product example, we didn’t have the required data in the data lake yet, so we wrote the pipeline that ingests the contract acceptance logs from the source system into our data lake. (See architecture)

Architecture
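
To make the ingestion step more concrete, here is a minimal sketch of such a pipeline. It is not our production code. It assumes the logs arrive as CSV with hypothetical columns `contract_id`, `user_id`, and `accepted_at` (an ISO 8601 timestamp), and it treats a local directory as the data lake, writing one file per acceptance date. The function name `ingest_logs` and the folder layout are also assumptions.

```python
import csv
import io
import tempfile
from collections import defaultdict
from collections.abc import Iterable
from pathlib import Path


def ingest_logs(source: Iterable[str], lake_root: Path) -> dict[str, int]:
    """Copy log rows into one CSV file per acceptance date under lake_root.

    Returns the number of rows written per date.
    """
    rows_by_date = defaultdict(list)
    reader = csv.DictReader(source)
    for row in reader:
        # accepted_at is assumed to be an ISO 8601 timestamp; keep the date part.
        rows_by_date[row["accepted_at"][:10]].append(row)

    counts = {}
    for date, rows in rows_by_date.items():
        # Hypothetical layout: <lake_root>/contract_acceptance/date=YYYY-MM-DD/
        partition = lake_root / "contract_acceptance" / f"date={date}"
        partition.mkdir(parents=True, exist_ok=True)
        with open(partition / "part-0000.csv", "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=reader.fieldnames)
            writer.writeheader()
            writer.writerows(rows)
        counts[date] = len(rows)
    return counts


# Made-up sample rows standing in for the source system export.
sample = io.StringIO(
    "contract_id,user_id,accepted_at\n"
    "c-1,u-1,2021-11-01T09:00:00\n"
    "c-2,u-2,2021-11-01T10:30:00\n"
    "c-3,u-1,2021-11-02T08:15:00\n"
)

with tempfile.TemporaryDirectory() as tmp:
    print(ingest_logs(sample, Path(tmp)))
```

Partitioning by date is one common choice because it keeps reruns for a single day cheap. A real pipeline would also need the access controls agreed with legal and information security.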

6. Final documentation

Once the data product is running in production, we have to make sure that it is well documented in our Collibra Data Intelligence Platform. That means that any new data set is registered in the Catalog, including its business context and relevant metadata such as column names, descriptions, and a relation between data product and data set. The data product itself is also “shoppable”, making it easy for someone else to request access to the report.
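
The kind of metadata listed above can be pictured as a simple record. The sketch below is purely illustrative and does not use or represent the Collibra API: the `DataSetEntry` and `Column` classes, their field names, and the example values are all hypothetical. It also shows a small completeness check that flags columns still missing a description before the entry is registered.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Column:
    name: str
    description: str


@dataclass
class DataSetEntry:
    """Hypothetical shape of the metadata captured for a new data set."""
    name: str
    business_context: str
    columns: list[Column] = field(default_factory=list)
    # Names of the data products that rely on this data set.
    used_by_data_products: list[str] = field(default_factory=list)


def missing_descriptions(entry: DataSetEntry) -> list[str]:
    """Return the names of columns that still lack a description."""
    return [c.name for c in entry.columns if not c.description.strip()]


# Example values are made up for illustration.
entry = DataSetEntry(
    name="contract_acceptance_logs",
    business_context="Records of contract acceptance events",
    columns=[
        Column("contract_id", "Identifier of the accepted contract"),
        Column("accepted_at", ""),
    ],
    used_by_data_products=["contract acceptance report"],
)

print(json.dumps(asdict(entry), indent=2))
print("Columns without description:", missing_descriptions(entry))
```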

7. Finalization 

The data and data product are now well documented, which means that the owner and builder can sit together and finalize the project. In some cases, the department has to be made aware of the data product and trained in using it.

8. Monitoring 

Our system implementation is versatile, which means that changes can easily be made. The data product now running in production needs to be monitored for hidden bugs and outages. A follow-up can be planned with the owner to discuss any potential problems and new requirements, after which the loop can start again.
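
As one possible shape for such monitoring, the sketch below checks two simple signals: how long ago the data was last loaded and how many rows the latest load contained. It is an assumption-laden example, not our actual setup. The function name `check_health`, the 24-hour freshness threshold, and the minimum row count are hypothetical and would need to be agreed with the owner.

```python
from datetime import datetime, timedelta, timezone


def check_health(
    last_loaded_at: datetime,
    row_count: int,
    now: datetime,
    max_age: timedelta = timedelta(hours=24),  # assumed freshness threshold
    min_rows: int = 1,                         # assumed minimum rows per load
) -> list[str]:
    """Return a list of detected problems; an empty list means healthy."""
    problems = []
    if now - last_loaded_at > max_age:
        problems.append("data is stale")
    if row_count < min_rows:
        problems.append("too few rows in latest load")
    return problems


# Fixed timestamps so the example is reproducible.
now = datetime(2021, 12, 2, 12, 0, tzinfo=timezone.utc)
last_load = datetime(2021, 11, 30, 6, 0, tzinfo=timezone.utc)

for problem in check_health(last_load, row_count=0, now=now):
    print("ALERT:", problem)
```

Checks like these catch outages quickly; hidden bugs in the logic of the data product usually still surface through the follow-up with the owner.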
