5 minutes for 5 hours’ worth of reading
I am a big proponent of data being a reflection of business reality. I call this concept a business data twin, but there are many variations on the same topic.
We have entity relationship diagrams, logical data models, conceptual data models, business data models, and many more.
All aim to create a more or less accurate simplification of reality.
I can’t stress enough how much easier things get if you design and create the data reflection of your business well.
- The many layers of data lineage: I very much like using cartography and maps as inspiration for viewing a data landscape. Borja’s article focuses on data lineage and its many layers: column lineage, table lineage, the modelling layer, and the business layer. All describe the same ‘area’, each focusing on different attributes and elements. The Google Maps approach, where you can easily turn additional layers on and off and the granularity of each layer changes with the size of the area you zoom into, sounds intriguing. One thing I’d add is that mapping the data shouldn’t be a purely descriptive exercise. It should be a creative one too, ensuring the data is representative of the business reality and fixing it where it is not. (Borja Vazquez @ Data at Monzo)
- Data Product Canvas — A practical framework for building high-performance data products: I’m a big fan of Bill Schmarzo’s data product development canvas, so I was curious about this alternative. Like Bill’s version, this canvas (correctly, in my opinion) focuses on the (business) problem. It covers the product’s vision, strategy, and business, starting from the problem, moving to the actions impacted by the product, and ending with measures of success. In my experience, using a data product development canvas is very helpful. This one, as well as Bill’s, can provide a great template or inspiration. And don’t forget that many data projects are in fact data products. (Leandro Carvalho @ Medium)
- Decentralized Data Engineering: Centralised data engineering is a common starting point for many organisations. This article describes the common path data engineering takes from there: starting with shadow data infrastructure created by decentralised data analysts and scientists, progressing to multi-tech, siloed data engineering, and ending with decentralised data engineering. I’ve seen the first two steps many times. The fourth step (the author’s proposed solution to the problem) is built around data practice decentralisation, automation, and interoperability. (P Platter @ Medium)
I spent the second half of the week in Tallinn. It was great to see our partners and clients, and to enjoy the local fresh air and lively bars. Now a quick second-leg flight from Munich back home to the family!