Machine Learning News Hubb
How to Treat Data as a Product. Maximize the leverage you get from data… | by Lak Lakshmanan | Feb, 2023

February 9, 2023


Maximize the leverage you get from data by applying product management principles

Many organizations aspire for their technology to go from being a cost to being a differentiator — this is true for data technologies as well. The way you’ll often hear this aspiration expressed is “we want to treat data as a product”.

Different definitions

A few years ago, what many executives meant by “treating data as a product” was that they wanted to monetize their data directly, such as by selling it on a data marketplace. Today, however, such marketplaces mostly contain data created by companies that specialize in aggregating data from many sources (e.g., retail footfall, credit card receipts, product reviews). Few companies have found success monetizing their first-party data.

So, what does it mean today when a typical business aspires to treat data as a product? There are several competing, but complementary, definitions. The Tableau definition — any application or tool that uses data to help businesses improve their decisions and processes is a data product — emphasizes the usefulness of data. The McKinsey definition — a high-quality, ready-to-use set of data that people across an organization can easily access and apply to different business challenges — emphasizes standardization. The Monte Carlo definition — that data is available within the company in a form that’s usable (even if the final mile involves self-service transformations) — emphasizes data governance.

Applying Product Management Principles to Data

My preferred way to think about this is to combine the desired outcome and the process to get there.

The desired outcome is that the organization will maximize the leverage it gets from its data by treating it as a product, and here the characteristics highlighted by the definitions above (usefulness, standardization, governance) are important. Like Tableau, I take an expansive view of what a data product is — datasets qualify, but so do data pipelines, dashboards, data-reliant applications, and ML models.

Desired outcomes are valuable only when accompanied by a path to get there. To treat data as a product, apply product management principles when conceiving and building data products. What product management principles? (1) Have a product strategy, (2) be customer-centric, (3) do lightweight product discovery, and (4) focus on finding market fit. I recommend adopting 10 data practices aligned to these principles:

1. Understand and maintain a map of data flows in the enterprise

One key job of a product manager is simplification. Too often, when someone asks “what data do you have?”, the answer is a spreadsheet of hundreds of datasets collected by surveying the various business units across the company. This is not very useful.

Treating data as a product means that you (the data product team) maintain a high-level model of data flows in the business that can be easily communicated for discoverability. Maintain this map at multiple levels of granularity. At the highest level, for an e-commerce site, it might be:

  • Web Traffic
  • Product Catalog
  • Web Content
  • Orders
  • Inventory
  • Customer Survey

At the next level of granularity, web traffic might be broken down into session data, page data, etc. For each dataset, capture how it is collected, how it is processed, which roles can access it and how, whether PII or other sensitive attributes are present, what quality assurances are made, etc. Also capture the production use cases for each dataset.

As you can see, as you go from higher levels of granularity to lower ones, the mapping starts to include details of your data platform implementation. It starts to become a data catalog.
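To make this concrete, here is a minimal sketch of what one level of such a data-flow map might record. The dataset names, attributes, and roles are invented for illustration, not a prescribed schema:

```python
# A sketch of a top-level data-flow map for the hypothetical e-commerce
# site. Every field here (collection method, PII flag, roles, uses) is
# illustrative; a real map would follow your own catalog conventions.
DATA_MAP = {
    "web_traffic": {
        "children": ["session_data", "page_data"],
        "collected_via": "client-side tagging",
        "contains_pii": True,
        "access_roles": ["analyst", "data-engineer"],
        "production_uses": ["attribution dashboard"],
    },
    "inventory": {
        "children": [],
        "collected_via": "warehouse management system export",
        "contains_pii": False,
        "access_roles": ["analyst", "purchaser"],
        "production_uses": ["restocking predictions"],
    },
}

def datasets_with_pii(data_map):
    """Return the top-level datasets that contain PII, for governance review."""
    return [name for name, meta in data_map.items() if meta["contains_pii"]]

print(datasets_with_pii(DATA_MAP))  # → ['web_traffic']
```

Even a structure this simple supports the governance and discoverability questions in the text (who can access what, where PII lives, what each dataset feeds).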

2. Identify key metrics

A data catalog is simply a record of what currently exists. It does not capture why the data is important or whether the data is fit for purpose. It doesn’t tell you what needs to be improved.

An important part of your data product strategy is to get alignment across the enterprise on your key metrics — what you will measure, how you will measure it, and what the target number for the metric is (goals will change over time). The universe of metrics that you track should include:

  1. Business KPIs: what business outcomes need to be enabled by data?
  2. SLA: What is the data availability? data quality? refresh rate?
  3. Engagement: How widely and how often is the data used across the company?
  4. Satisfaction: How satisfied are customers (who may be internal) with what data is available and how easy it is to use?

For our hypothetical e-commerce site, the business outcomes might involve increasing customer lifetime value, increasing free-tier conversions, etc. The SLA for the inventory displayed to internal purchasers (for restocking) might be that it’s available 99.99% of the time, at an hourly refresh, and is maintained to be above the next week’s predicted sales. We might want the inventory predictions to be used not only by internal purchasers but also by logistics teams, and to be incorporated into dashboards. And we might have a measure of how often the predicted inventory amounts are overridden.
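The SLA portion of these metrics can be checked programmatically. A minimal sketch, where the targets come from the hypothetical SLA above and the function and field names are assumptions:

```python
from datetime import datetime, timedelta

# Hypothetical SLA targets for the inventory data product.
AVAILABILITY_TARGET = 0.9999        # "available 99.99% of the time"
MAX_STALENESS = timedelta(hours=1)  # "at an hourly refresh"

def sla_report(uptime_seconds, total_seconds, last_refresh, now):
    """Compare measured availability and freshness against the SLA targets."""
    availability = uptime_seconds / total_seconds
    staleness = now - last_refresh
    return {
        "availability_ok": availability >= AVAILABILITY_TARGET,
        "freshness_ok": staleness <= MAX_STALENESS,
    }

now = datetime(2023, 2, 9, 12, 0)
report = sla_report(
    uptime_seconds=86_395, total_seconds=86_400,  # ~99.994% over one day
    last_refresh=now - timedelta(minutes=40), now=now,
)
print(report)  # → {'availability_ok': True, 'freshness_ok': True}
```

Engagement and satisfaction metrics would come from usage logs and surveys rather than a check like this, but the point is the same: each metric needs an agreed measurement, not just a name.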

3. Agreed criteria, committed roadmap, and visionary backlog

The data catalog is a record of what currently exists. The metrics capture what your goals are. Neither of these explains where you are going next.

It is important to adapt the product vision over time based on customer feedback, stakeholder input, and market conditions. During all this, your stakeholders will ask you for features and timelines and expect you to keep your commitments. To handle change and user feedback, you need three things:

  1. Prioritization criteria are what stakeholders agree on beforehand — this enables transparency and buy-in across the org on the product roadmap.
  2. The product roadmap itself is informed by a process of product discovery so that the team can avoid agreeing to timelines in the absence of information and prototyping. Product discovery is important and I’ll delve into this in more detail.
  3. Things that we think are important, but are yet to be roadmapped will be captured in a product backlog. Typically, the product backlog consists of customer problems that need to be solved (not features that have to be built). In many ways, the backlog (not the roadmap) forms your longer-term product vision. Organize the backlog to tell a clear story.

The roadmap needs to be high commitment — you should be able to commit to the timelines and features on the roadmap. A great way to do this is to get agreement on prioritization criteria, do product discovery, and maintain a product backlog.

For our hypothetical data product of inventory predictions a week ahead, we need to agree on how we measure how good the predictions are. Is it that we rarely run out? That we minimize the costs of procuring and storing the items? Is running out measured at the warehouse level, or at the company level? These form the prioritization criteria. If someone asks you to customize the inventory model for perishable goods, is it worth doing? You will initially add it to a product backlog. Then, you’ll do product discovery to determine the ROI of such a project; this will include, for example, the cost of increasing or decreasing refrigeration at the warehouses. Only when you know the value will you add this to your product roadmap.
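One way to make pre-agreed prioritization criteria operational is to score backlog items against weighted criteria. The criteria names, weights, and the 1-5 scoring scale below are assumptions for illustration only:

```python
from dataclasses import dataclass

# Hypothetical pre-agreed criteria: stakeholders weight what matters
# (effort counts against an item). These numbers are invented.
WEIGHTS = {"stockout_risk": 0.5, "carrying_cost": 0.3, "effort": -0.2}

@dataclass
class BacklogItem:
    name: str
    scores: dict  # criterion -> 1..5 rating, estimated during product discovery

def priority(item):
    """Weighted score; higher means the item should be roadmapped sooner."""
    return sum(WEIGHTS[c] * s for c, s in item.scores.items())

backlog = [
    BacklogItem("perishable-goods model",
                {"stockout_risk": 4, "carrying_cost": 5, "effort": 4}),
    BacklogItem("warehouse-level forecasts",
                {"stockout_risk": 5, "carrying_cost": 2, "effort": 1}),
]
for item in sorted(backlog, key=priority, reverse=True):
    print(item.name, round(priority(item), 2))
```

The value is less in the arithmetic than in the agreement: because the weights were settled beforehand, the resulting ranking is transparent to every stakeholder.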

4. Build for the customers you have

Too often, data teams get caught up in technology slogans: they only provide APIs, or insist that everyone publishes data into their enterprise data warehouse, or expect conformance to a single dictionary.

Take a leaf out of product management, and develop a deep knowledge of who your customers are. What are they building? A mobile app or a monthly report? What do they know? SQL or Java? What tools do they use? Dashboards or TensorFlow? Do they need alerts whenever the data changes? Do they need moving averages of the data in real time? Do they care about test coverage?

Then, serve data in ways that your target customers can use it. For example, you might serve the data in a data warehouse (to data analysts), make it accessible via APIs (to developers), publish it in feature stores (to data scientists), or provide a semantic layer usable in dashboards (to business users).

If our hypothetical inventory prediction data product will be consumed by internal purchasers (who are business users), the predictions will have to be served in the application that is used for ordering replenishments. So the predictions will likely have to be accessible via an API that the developers of that application can use.
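As a sketch, the handler behind such a hypothetical endpoint might look like the following. The response fields, the lookup table, and the route are all invented for illustration; a real service would read from the prediction store:

```python
import json

# Invented predictions keyed by SKU, standing in for a model-serving backend.
PREDICTIONS = {"sku-123": {"predicted_weekly_demand": 840, "current_stock": 615}}

def get_inventory_prediction(sku):
    """Handler body for a hypothetical GET /inventory/predictions/<sku> endpoint.

    Returns (json_body, http_status). The 'reorder' flag encodes the SLA rule
    from the text: stock should stay above next week's predicted sales.
    """
    if sku not in PREDICTIONS:
        return json.dumps({"error": "unknown sku"}), 404
    pred = PREDICTIONS[sku]
    body = {
        "sku": sku,
        **pred,
        "reorder": pred["current_stock"] < pred["predicted_weekly_demand"],
    }
    return json.dumps(body), 200
```

Exposing a derived, decision-ready field like `reorder` (rather than raw model output) is one way to serve business users through their developers.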

5. Don’t shift the burden of change management

Change and conflict are inevitable. The suppliers of data will change formats; the consumers of data will have new needs; the data velocity will change; the same data might be provided in multiple channels; your customers will move to an alternate supplier due to cost. These are not solely the problem of the team that makes the changes or the team that uses the data.

A big part of treating data as a product is to ensure that users of data are not stuck with change management responsibilities. As much as possible, make sure to evolve schema and services so that changes are transparent to downstream users.

When backwards-incompatible change inevitably happens, version the changes and work with stakeholders to move them from older versions of the data to newer versions. This might involve creating a migration team whose job is to move the enterprise from one version to the next.
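A minimal sketch of this versioning pattern, assuming a hypothetical v1-to-v2 schema change in which a single `location` field was split into warehouse and region. Consumers ask for the version they understand, and the data team owns the upgrade path:

```python
def upgrade_v1_to_v2(record):
    """Hypothetical migration: v2 split 'location' into warehouse and region."""
    warehouse, _, region = record["location"].partition("/")
    out = {k: v for k, v in record.items() if k != "location"}
    out.update({"schema_version": 2, "warehouse": warehouse, "region": region})
    return out

def read_inventory(record, requested_version):
    """Serve the schema version the consumer requested; upgrade transparently."""
    if record.get("schema_version", 1) == 1 and requested_version == 2:
        return upgrade_v1_to_v2(record)
    return record

v1_record = {"sku": "sku-123", "location": "WH-7/us-west", "qty": 615}
print(read_inventory(v1_record, requested_version=2))
```

The key design choice is that the upgrade function lives with the data product, not with each consumer, so downstream users never write their own migration code.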

What’s true of change management is also true of security. Make sure to build safeguards for PII and compliance instead of shifting the burden to users of your data products.

Suppose our hypothetical inventory prediction data product is customized to include predictions of perishable goods. If this involves requesting additional information on the items being sold, you will have to take on the responsibility of ensuring that your item catalog is enhanced for all existing items. This data engineering work is part of the scoping of the project, and feeds into the ROI of whether that work is worth doing.

6. Interview customers to discover their data needs

How do you evolve the product backlog, prioritize needs, and add to the roadmap? An important discipline is to ensure that you are constantly talking to customers and discovering what data they need to solve the problems that they are encountering. What shortcomings of the current data products are they having to work around? These problems feed into your product backlog, for you to prioritize and solve.

It is important that, before any new data product idea enters the product roadmap, the need for the product has been validated by potential (internal or external) customers. Building on spec (“build it and they will come”) is extremely risky. It is much safer to build implementations of ideas that have already been validated with customers.

How do you do that?

7. Whiteboard and prototype extensively

Whiteboard the design of the data product with customers who want it. This ensures that what you land in the data platform will meet their needs in terms of quality, completeness, latency, etc. Walk through potential uses of data with them before you build any data pipelines or transformations.

One of the best tools here is a prototype. Many use cases of data can be validated by building a minimum viable prototype. Not product. Prototype.

What do I mean? If the sales team believes that building a customer data platform will help them cross-sell products, validate this by picking up a set of records from the individual products’ sales pipelines, doing the match manually, and trying to cross-sell the resulting customers.
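The manual match in such a prototype can be as simple as a few lines over exported records. The sample records below are invented; the point is that validating the cross-sell idea needs no pipeline at all:

```python
# Records pulled by hand from two products' sales pipelines (invented data).
product_a_sales = [
    {"email": "pat@example.com", "bought": "printer"},
    {"email": "sam@example.com", "bought": "laptop"},
]
product_b_sales = [
    {"email": "pat@example.com", "bought": "ink"},
]

# Match on a shared key: customers of product A who have never bought
# product B are the cross-sell candidates to try contacting.
b_emails = {r["email"] for r in product_b_sales}
cross_sell_candidates = [r for r in product_a_sales if r["email"] not in b_emails]
print(cross_sell_candidates)
```

If contacting these candidates actually produces sales, you have validated the customer data platform idea; if not, you have saved the cost of building it.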

Use such a prototype and interviews with potential users of the final product to scope the problem in terms of:

  1. what needs to be built: identify everything, from data pipelines to user interfaces that are needed for the project to succeed
  2. the ROI that you can expect in terms of business KPIs

Do this before you write any code. Only when you have a clear idea of what needs to be built and of the expected ROI should you add the project to your roadmap. Until then, keep the problem in your backlog.

In the case of our hypothetical inventory predictions data product, you would validate the input schema and the use of the predictions with the key users of the product, determine how much additional refrigeration the warehouses can accommodate, etc. You’ll do this before you write any code, perhaps by doing the predictions in a spreadsheet and game-playing the whole set of scenarios for a wide variety of products.

8. Build only what will be used immediately

Prioritize going to production quickly over having all the necessary features built. This means that you should be using agile, iterative processes to build only the datasets, data pipelines, analytics, etc. that are immediately required.

Use the product backlog to capture future needs. Build those capabilities only after you have identified customers who will use those features and can give you feedback in whiteboarding/prototyping sessions.

9. Standardize common entities and KPIs

Provide canonical, enriched datasets for common entities and KPIs that will be standard across the business. Usually, these enriched entities power a large number of high-ROI use cases (e.g. customer data platform, content management platform) or are required for regulatory/compliance purposes (e.g. the way to calculate taxes).

Typically, you’ll have only a handful of these standardized datasets and metrics, because such enrichment requires significant collaboration across business units and reduces their release velocity.

10. Provide self-service capabilities in your data platform

You have to balance flexibility and standardization in a way that fits your organization. Do not go overboard with #9. Do not build centralized datasets that have everything anyone could ever want. Instead, enable teams to be self-sufficient. This is the microservices principle as applied to data.

One way to achieve this balance is to provide small, self-contained datasets that customers can customize by joining with other datasets in domain-specific ways. Often, this is implemented as a data mesh, with each business unit responsible for the quality of the datasets that it publishes into a shared analytics hub.
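A sketch of what such a domain-specific join might look like. The two small datasets and their schemas are illustrative; in practice each would be published by its owning business unit into the shared hub:

```python
# Two small, self-contained published datasets (invented schemas).
inventory = [{"sku": "sku-123", "stock": 615}]
catalog = [{"sku": "sku-123", "perishable": True}]

# A domain team enriches inventory with catalog attributes for its own
# use case (e.g., the perishable-goods analysis), without the central
# team having to anticipate that need.
catalog_by_sku = {r["sku"]: r for r in catalog}
enriched = [{**r, **catalog_by_sku.get(r["sku"], {})} for r in inventory]
print(enriched)
```

The central team guarantees the quality of each published dataset; the consuming team owns the join logic, which keeps the platform flexible without sacrificing standardization.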

Summary

To treat data as a product, use product management principles to formulate your data product strategy, be customer-centric, discover products through whiteboarding and prototyping, and find the right balance between standardization and flexibility.


