One of the goals of our Data Office is to create data products that support our business. These can be simple dashboards, AI-driven models, or recommendation engines. If you are new to data products and wondering how best to start, which people you need to involve, and what steps to take, you’ve come to the right place. In this blog post we’ll walk you through our creation process in the context of a data product we recently deployed for our colleagues.
Our 8-step process
These eight steps are crucial to making successful and valuable data products.
1. Identify the ask from the business
The data product process should always start with identifying a need from the business. Unless you identify the business need, you risk wasting resources on something that never gets used and thus adds no value. Storage and compute may be cheaply available, but the time and attention of a good data team will always be a scarce resource.
2. Define the data product owner
Data product owners are responsible for identifying the target users, deciding when and how those users are trained in the use of the data product, and determining how it will fit into the business process. Since a data product can potentially expose confidential data, data owners must decide who gets access to the data product. They are also responsible for the internal data sharing agreement and the data quality.
3. Prioritize the data product
For prioritization, it is key to interact with the different stakeholders, even if only to set the right expectations. There is no silver bullet for determining which data product is the most important, but data product owners must weigh the value, cost, time, and purpose of each data product.
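As a conversation starter rather than a formula, the trade-off between value, cost, and time could be sketched as a simple weighted score. Everything in this Python sketch is an assumption made for illustration: the 1-5 scale, the weights, the `Candidate` class, and the example backlog are all hypothetical. Purpose is left out because it is hard to put a number on and is better discussed with the stakeholders directly.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A proposed data product, rated by its owner on a hypothetical 1-5 scale."""
    name: str
    value: int  # expected business value, higher is better
    cost: int   # cost to build and run, higher is worse
    time: int   # time to deliver, higher is worse


def priority_score(c: Candidate, w_value: float = 2.0,
                   w_cost: float = 1.0, w_time: float = 1.0) -> float:
    """Weighted score: value counts in favor, cost and time count against."""
    return w_value * c.value - w_cost * c.cost - w_time * c.time


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Return the candidates ordered from highest to lowest score."""
    return sorted(candidates, key=priority_score, reverse=True)


# Illustrative backlog; the names and ratings are made up.
backlog = [
    Candidate("Sales dashboard", value=3, cost=2, time=1),
    Candidate("Churn model", value=5, cost=4, time=4),
    Candidate("Contract acceptance report", value=4, cost=2, time=2),
]

for candidate in rank(backlog):
    print(f"{candidate.name}: {priority_score(candidate):.1f}")
```

A ranking like this is only as good as the ratings behind it, so we would treat it as input for the stakeholder discussion, not as a replacement for it.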
4. Iterate the prototype, evolve the requirements
We start this step by bringing the data product builder(s) together with the owner to get details on all requirements: which insights are needed, what data should or could be used, who the target audience is, how often the data product will be used, when it is due, and how it should reach the audience.
With these requirements clearly defined, the builder can start sketching the data model and identify what data may need to be moved into the data platform.
Then, the different iterations of the data product prototype are shared and discussed with the owner, typically leading to adjusted requirements. This agile working method makes it possible to identify previously unknown requirements.
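One way to keep track of requirements as they evolve is to record each agreed version as an immutable object and create a new version after every prototype review. The Python sketch below is a minimal illustration under our own assumptions: the `Requirements` class, its field names, the `revise` helper, and all example values are hypothetical and only mirror the questions listed above.

```python
from dataclasses import dataclass, replace
from datetime import date


@dataclass(frozen=True)
class Requirements:
    """One version of the requirements agreed between owner and builder."""
    insights: tuple[str, ...]
    data_sources: tuple[str, ...]
    target_audience: str
    usage_frequency: str
    due_date: date
    delivery_channel: str
    version: int = 1


def revise(current: Requirements, **changes) -> Requirements:
    """Return a new version with the changes applied; the old version stays intact."""
    return replace(current, version=current.version + 1, **changes)


# Placeholder values for illustration only.
v1 = Requirements(
    insights=("contracts accepted per week",),
    data_sources=("contract acceptance logs",),
    target_audience="legal team",
    usage_frequency="weekly",
    due_date=date(2030, 1, 31),
    delivery_channel="dashboard",
)

# After a prototype review, the owner asks for an additional insight.
v2 = revise(v1, insights=v1.insights + ("acceptance rate by region",))
print(v2.version, v2.insights)
```

Because every version is kept, owner and builder can always see what changed between two iterations of the prototype.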
5. Create the data product
First, we must choose a data architecture that can reliably serve up the data product. You’ll need to include colleagues from legal and information security for this because you need to consider data privacy and data protection.
In our data product example, we didn’t have the required data in the data lake yet, so we wrote a pipeline that ingests the contract acceptance logs from the source system into our data lake. (See architecture)
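To give an idea of what such an ingestion step can look like, here is a minimal Python sketch. It is not our production pipeline: the JSON-lines input, the field names in `REQUIRED_FIELDS`, the date-partitioned folder layout, and the `ingest` function itself are assumptions chosen to keep the example self-contained.

```python
import json
from pathlib import Path

# Hypothetical schema; a real source system will have its own field names.
REQUIRED_FIELDS = ("contract_id", "accepted_by", "accepted_at")


def ingest(raw_lines, lake_root: Path) -> int:
    """Write valid log records into date-partitioned JSON-lines files.

    Returns the number of records written. Malformed records are skipped.
    """
    written = 0
    for line in raw_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(record, dict):
            continue  # skip JSON values that are not objects
        if any(name not in record for name in REQUIRED_FIELDS):
            continue  # skip records that miss a required field
        day = str(record["accepted_at"])[:10]  # assumes ISO 8601 timestamps
        partition = lake_root / f"accepted_date={day}"
        partition.mkdir(parents=True, exist_ok=True)
        with open(partition / "part-0000.jsonl", "a", encoding="utf-8") as out:
            out.write(json.dumps(record) + "\n")
        written += 1
    return written
```

Calling `ingest(open("acceptance.log"), Path("/data/lake/contracts"))` would append every valid record to a file under a folder named after its acceptance date; both paths are placeholders. A real pipeline would also need to handle duplicates, schema changes, and access control, which is where the colleagues from legal and information security come in.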
6. Final documentation
Once the data product is running in production, we have to make sure that it is well documented in our Collibra Data Intelligence Platform. That means that any new data set is registered in the Catalog, including its business context and relevant metadata such as column names, descriptions, and the relation between the data product and the data set. The data product itself is also “shoppable”, which makes it easy for anyone else to request access to the report.
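A quick way to think about "well documented" is as a checklist that can be verified before the entry is published. The Python sketch below models such a checklist in plain data classes. It does not use or represent the Collibra API; the `DataSetEntry` and `Column` classes, the `missing_documentation` helper, and the example entry are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    name: str
    description: str


@dataclass
class DataSetEntry:
    """A simplified stand-in for a catalog entry describing one data set."""
    name: str
    business_context: str
    columns: list[Column] = field(default_factory=list)
    used_by: list[str] = field(default_factory=list)  # names of data products


def missing_documentation(entry: DataSetEntry) -> list[str]:
    """List the gaps to close before the entry counts as documented."""
    gaps = []
    if not entry.business_context.strip():
        gaps.append("business context")
    if not entry.columns:
        gaps.append("columns")
    for column in entry.columns:
        if not column.description.strip():
            gaps.append(f"description for column '{column.name}'")
    if not entry.used_by:
        gaps.append("relation to a data product")
    return gaps


# Illustrative entry; names and descriptions are made up.
entry = DataSetEntry(
    name="contract_acceptance_logs",
    business_context="Records of which contracts were accepted, and when.",
    columns=[
        Column("contract_id", "Identifier of the accepted contract."),
        Column("accepted_at", ""),  # description still missing
    ],
    used_by=["Contract acceptance report"],
)

print(missing_documentation(entry))
```

An empty list means the entry covers the business context, the column descriptions, and the link to the data product.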
7. Finalize the project
The data and data product are now well documented, which means that the owner and builder can sit together and finalize the project. In some cases, the department has to be made aware of the data product and trained in using it.
8. Monitor and follow up
Our system implementation is very versatile, which means that changes can be made easily. The data product that is now running in production needs to be monitored for hidden bugs and outages. A follow-up can be planned with the owner to discuss any problems and new requirements, after which the loop can start again.
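What monitoring looks like depends on the data product, but two checks that apply to many of them are freshness (did the data refresh recently?) and volume (did any rows arrive?). The Python sketch below shows both. The `check_health` function, the 24-hour limit, and the minimum row count are assumptions for illustration and not a description of our actual monitoring setup.

```python
from datetime import datetime, timedelta, timezone


def check_health(last_refresh: datetime, row_count: int, now: datetime,
                 max_age: timedelta = timedelta(hours=24),
                 min_rows: int = 1) -> list[str]:
    """Return a list of detected problems; an empty list means healthy."""
    problems = []
    age = now - last_refresh
    if age > max_age:
        problems.append(f"data is stale: last refresh was {age} ago")
    if row_count < min_rows:
        problems.append(f"too few rows: {row_count} (expected at least {min_rows})")
    return problems


# Illustrative values: a refresh 30 hours ago that produced no rows.
now = datetime(2030, 1, 2, 12, 0, tzinfo=timezone.utc)
problems = check_health(
    last_refresh=datetime(2030, 1, 1, 6, 0, tzinfo=timezone.utc),
    row_count=0,
    now=now,
)
for problem in problems:
    print(problem)
```

The `now` argument is passed in explicitly so the check is easy to test. In a scheduled job it would typically be `datetime.now(timezone.utc)`, and any reported problem would feed into the follow-up with the owner.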