How to create a data product: the 8 step process

By Alexandre Tkint

The post How to create a data product: the 8 step process appeared first on Collibra.

One of the goals of our Data Office is to create data products that support our business. These can be simple dashboards, AI-driven models, or recommendation engines. If you are new to data products and wondering how best to start, which people you need to involve, and what steps to take, you’ve come to the right place. In this blog post, we’ll walk you through our creation process in the context of a data product we recently deployed for our colleagues.

Our 8 step process

We’ve found that this process helps to get the different stakeholders involved in the right way and at the right time. These eight steps are crucial to making successful and valuable data products.

1. Identify the ask from the business

The data product process should always start with identifying a need from the business. Unless you identify the business need, you risk wasting resources on something that never gets used and thus adds no value. Storage and compute may be cheaply available, but the time and attention of a good data team will always be a scarce resource.

2. Define the data product owner 

Data product owners are responsible for identifying the target users, deciding when and how those users will be trained to use the data product, and determining how it will fit into the business process. Since a data product can potentially expose confidential data, data product owners must decide who gets access to it. They are also responsible for the internal data sharing agreement and for data quality.

3. Prioritize 

For prioritization, it is key to interact with the different stakeholders, even if only to set the right level of expectations. There is no silver bullet to determine what the most important data product is, but data product owners must consider the value, cost, time, and purpose of the data product.

4. Iterate the prototype, evolve the requirements

We start this step by bringing the data product builder(s) together with the owner to get details on all requirements: what insights are needed, what data should or could be used, who the target audience is, how often the data product will be used, when it is due, and how it should reach the audience.

With these requirements clearly defined, the builder can start sketching the data model and identify what data may need to be moved into the data platform.

Then, the different iterations of the data product prototype are shared and discussed with the owner, typically leading to adjusted requirements. This agile working method makes it possible to identify previously unknown requirements. Experience has shown us that requirements will never be fully “baked” at the start, so this back-and-forth communication between the data product owner and builder(s) is essential to making great data products.
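To make the requirements from this step easier to track across iterations, they could be captured in a small structured record. The sketch below is only an illustration: the class name, field names, and all example values are assumptions and do not come from our actual tooling.

```python
from __future__ import annotations

from dataclasses import dataclass, replace
from datetime import date


@dataclass(frozen=True)
class DataProductRequirements:
    """Hypothetical record of the questions asked in step 4."""

    insights_needed: tuple[str, ...]
    candidate_data: tuple[str, ...]
    target_audience: str
    usage_frequency: str
    due_date: date
    delivery_channel: str
    version: int = 1

    def revise(self, **changes) -> DataProductRequirements:
        """Return an adjusted copy after a prototype review, bumping the version."""
        return replace(self, version=self.version + 1, **changes)


# Placeholder values, for illustration only.
initial = DataProductRequirements(
    insights_needed=("contract acceptance rate per month",),
    candidate_data=("contract acceptance logs",),
    target_audience="colleagues",
    usage_frequency="weekly",
    due_date=date(2030, 1, 1),
    delivery_channel="dashboard",
)

# A prototype review surfaces a requirement nobody knew about at the start.
revised = initial.revise(
    insights_needed=initial.insights_needed + ("acceptance rate per region",)
)
```

Keeping each version immutable means the owner and builder can always compare what was agreed before and after a prototype review.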

5. Create the data product 

First, we choose a data architecture that can reliably serve up the data product. You’ll need to include colleagues from legal and information security for this because you need to consider data privacy and data protection aspects. 

In our data product example, we didn’t have the required data in the data lake yet, so we wrote a pipeline that ingests the contract acceptance logs from the source system into our data lake. (See architecture.)

[Figure: Architecture]
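As a rough idea of what such an ingestion pipeline might look like, here is a minimal sketch using only the Python standard library. It assumes the source system exports the logs as JSON lines and that each record carries the fields `contract_id`, `user_id`, and `accepted_at` (an ISO 8601 timestamp). The schema, the function names, and the date-partitioned folder layout are all assumptions made for illustration, not a description of our production pipeline.

```python
import json
from pathlib import Path
from typing import Iterable, Iterator

# Assumed schema of a contract acceptance log record (hypothetical).
REQUIRED_FIELDS = ("contract_id", "user_id", "accepted_at")


def valid_records(lines: Iterable[str]) -> Iterator[dict]:
    """Parse JSON lines and keep only records with every required field."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skipped here; a real pipeline would likely quarantine these
        if isinstance(record, dict) and all(record.get(f) for f in REQUIRED_FIELDS):
            yield record


def ingest(lines: Iterable[str], lake_root: Path) -> int:
    """Append valid records to one file per acceptance date; return the count written."""
    written = 0
    for record in valid_records(lines):
        day = str(record["accepted_at"])[:10]  # assumes ISO 8601 timestamps
        partition = lake_root / f"accepted_date={day}"
        partition.mkdir(parents=True, exist_ok=True)
        with open(partition / "part.jsonl", "a", encoding="utf-8") as out:
            out.write(json.dumps(record) + "\n")
        written += 1
    return written
```

Calling `ingest(open("acceptance_logs.jsonl"), Path("/data/lake/contracts"))` would write one folder per acceptance date; both paths are placeholders.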

6. Final documentation

Once the data product is running in production, we have to make sure that it is well documented in our Collibra Data Intelligence Platform. That means that any new data set is registered in the Catalog, including its business context and relevant metadata such as column names, descriptions, and the relation between data product and data set. The data product itself is also shoppable, which makes it easy for someone else to request access to the report. Access is granted through a workflow that starts by asking why access should be granted, by when, and until when. These requests are sent to the data product owner and can only be approved by them.
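The rule behind the access workflow (a request states a reason and a time window, and only the data product owner can approve it) can be sketched in a few lines of plain Python. This is not the Collibra workflow API; the class, the function, and the field names are hypothetical and only illustrate the logic.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class AccessRequest:
    """Hypothetical access request: who asks, why, and for which period."""

    requester: str
    reason: str
    start: date
    end: date
    status: str = "pending"


def approve(request: AccessRequest, approver: str, product_owner: str) -> AccessRequest:
    """Approve a request, but only when the approver is the data product owner."""
    if approver != product_owner:
        raise PermissionError("only the data product owner can approve access")
    if not request.reason.strip():
        raise ValueError("a reason for access is required")
    if request.end < request.start:
        raise ValueError("access end date precedes start date")
    request.status = "approved"
    return request
```

Checking the approver first means that a request from anyone other than the owner is rejected regardless of how well it is motivated.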

7. Finalization 

The data and data product are now well documented, which means that the owner and builder can sit together and finalize the project. In some cases, the department has to be made aware of the data product and trained in using it. Make sure that there is a clear handover and stop on the project so that the data office can pick up the next project.

8. Monitoring 

Our system implementation is very versatile, which means that changes can easily be made. The data product that is now running in production needs monitoring to keep an eye on possible hidden bugs and outages. A follow-up can be planned with the owner so that potential problems and new requirements (if any) can be discussed and the loop can start again.
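One simple form of monitoring is a periodic health check on the data behind the product. The sketch below assumes two signals, the time of the last successful load and the number of rows loaded. The function name and both thresholds are illustrative assumptions, not values we use in production.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Optional


def check_health(
    last_loaded_at: datetime,
    row_count: int,
    max_age: timedelta = timedelta(hours=24),  # assumed freshness threshold
    min_rows: int = 1,  # assumed volume threshold
    now: Optional[datetime] = None,
) -> List[str]:
    """Return a list of human-readable issues; an empty list means healthy.

    Both datetimes are expected to be timezone-aware.
    """
    now = now or datetime.now(timezone.utc)
    issues = []
    if now - last_loaded_at > max_age:
        issues.append("stale data: last load is older than the allowed age")
    if row_count < min_rows:
        issues.append("low volume: fewer rows than expected")
    return issues
```

A scheduler could run this check after every load and notify the builder whenever the returned list is not empty.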
