Becoming Data Driven, From First Principles - Commoncog

Note: this is Part 12 in a series of blog posts about becoming data driven in business. This piece is the culmination of 1.5 years of theory and practice, and sets up for a two-parter explaining the Amazon-style Weekly Business Review.


This is a companion discussion topic for the original entry at https://commoncog.com/becoming-data-driven-first-principles

I'm really curious about why SPC is such niche knowledge, limited primarily to the manufacturing sector. In the 1990s, GE popularized Six Sigma techniques, and there was a lot of interest in expanding those methods to other domains. In my own survey of the literature, I found a number of thinkers who explicitly transferred these techniques to the domains of healthcare, service sector work, and education. And then... it all sort of disappeared? I'm not involved enough in healthcare to know if SPC is still practiced there, but I've seen enough to be confident that it's all but forgotten in the service sector and education.

I feel like I must be missing something. I'm fully on the WBR train. So if it didn't stick in these sectors, either there is some complication that makes it harder, or less powerful, or... what? Business folks just decided they didn't like using tools that would make them better at their jobs? That sounds pretty stupid. So what's a better explanation of what happened? Why aren't XmR charts as common in business as Pivot Tables?

2 Likes

The very last chapter in Mark Graban's Measures of Success contains some details about why it's so hard to teach. (Graban has been teaching this for years, mostly in hospital settings.) The gist of it is that most people don't see variation as a problem.

If they don't internalise this, they're not going to use XmR charts. Which makes it hard to do everything else: it's hard to teach them the WBR, it's hard to teach them the process control worldview, and it's unlikely that they will take the entire approach seriously.

The structure of this essay is actually set up to deal with this assumption. Nearly the entire first half is written to illustrate the problem with variation, and to illustrate what amazing things you can do once you have a way of dealing with variation.

But I still expect it to fall flat for some people. I've tried my best, but: https://x.com/ejames_c/status/1751817369968890081?s=20

3 Likes

This post is really really good! The way you describe how many companies fail to be data-driven resonates a lot with what I've seen. And also that thing when you look at a chart and the number goes up and you don't know what it means. Totally resonating!

Being data-driven, in this sense of pursuing knowledge, is one of the things I'm really interested in (since I wanna start my own company and I'd like to reduce my reliance on luck and faith and deities a bit). So far my success with building causal models has (probably) been because I focus on conducting JTBD interviews and customer discovery very frequently and intentionally. When doing that, it very much feels like I'm slowly building up my intuition about what kind of levers we can pull to get users to engage in certain desired behaviors.

I suspect that once I acquire the skillsets that allow me to use data to build up my causal model, I'd come to feel that intuition faster and more clearly (and more methodically, I guess). Very very excited about this series and the upcoming software that would help people become more data-driven.

3 Likes

Can you say more about this? If anything, I would have expected the opposite reaction: that people are so focused on making charts that go up and to the right that they overreact when there's any wiggle, regardless of whether it's signal or noise.

2 Likes

That's exactly right! With the optimisation worldview you might think "ahh, metrics are just for improving some conversion rate." With the process control worldview, metrics become an opportunity for accelerating product/business intuition.

Btw, do you still think the North Star Metric Framework is compatible with this approach to metrics? :wink:


@colin.hahn So... this is my interpretation of Graban's book, but I'll just post a couple of screenshots here and we can discuss interpretations. I'm... not entirely sure if this is the right reading of the following sections:




Edited to add: I have noticed the "should I really put in the work to adopt this new method of thinking?" resistance, since XmR charts and dealing with variation really do force you to think differently about reality... though it's usually traded off with a "but you no longer have to treat your business like a black box!" pull factor.

1 Like

How often do incentives play into data "outcomes"?

1 Like

Haha, in my mind somehow I've always thought of the North Star Metric Framework as less of a data-driven tool and more like a way to encapsulate your causal model so that you can communicate that and align with your team. I used NSMF in the past after I'd developed a causal model for my product, and I've found that to be a really great strategy deployment tool. But I wouldn't use it right from the start though.

Right now my only tool is to just "get out of the building" and talk to a bunch of customers, in the process trying to build up intuitions about what makes them tick. That's been the only strategy that I've employed (with success) so far. But in certain environments you don't get to talk to your customers that much, so I'm not sure if I'm able to thrive in such an environment (though I've never been in such an environment).

Would being data-driven in such orgs make sense? Or, to put it another way: do you think that being data-driven, in the sense that you mentioned, could completely replace customer discovery and actually talking to people?

3 Likes

I think one wonderful thing about the idea of 'knowledge' is that you can just ask yourself "what can give me more marginal knowledge about the customer?" And sometimes (often!) this is 'talking to the customer', and sometimes this is 'instrumenting the product so we can see what the customer actually finds valuable.'

So I don't see any reason you should give one up when you can also do the other.

This is one point I wanted to make but I guess I would save it for a later essay. During our podcast together, I asked Colin if the methods behind the WBR would work for a pre-product-market-fit product. And he said something to the effect of "Of course! Often you need to get a whole bunch of things right to get a successful product. (Cedric's note: like if you're launching a streaming service you need to ensure the video selection is large enough and the streaming is fast enough and latency of the services behind it is low etc etc). If you don't instrument those things, then how do you know if your product failed because the idea was bad or if your execution was bad?"

And I think that's a damn good point. It's worth recalling that while Amazon Prime itself was an intuitive bet, they instrumented the hell out of it just to make sure they had actual knowledge of consumer behaviour + program behaviour + financial performance over the life of the entire bet, up to and beyond the point it was proven two years later.

And all throughout that process they were subjecting program costs to process control: constantly iterating to see if they could bring overall costs of Prime down to a controllable, predictable level, before it bankrupted the company.


A huge deal. I think one of the more depressing things we learnt investigating this body of work is that if you don't have the power to structure incentives, you aren't really going to be able to execute the full dream scenario described in the 'What If It Doesn't Have To Be This Way?' section. Every example I gave at the end of the essay (Amazon, Koch, Ford, etc) was of a CEO-led initiative, which enabled (forced?) the various departments to work together.

3 Likes

Interesting. I feel like Graban himself doesn't have a clear answer. He's recognized that there is a skill around understanding the behavior of the system as a system. There's a chicken-and-egg problem where PBCs make it easier to see the system perspective, but you need to be looking from a system perspective to appreciate what PBCs are giving you.

I'll keep thinking about this.

2 Likes

I work in software engineering, and while I don't want to say "it would never work", it definitely feels harder. All the things that are easily measured in software (lines of code, stories completed, PRs requested) are things that I am at best neutral towards. The things I do want more of are famously difficult to measure (impact and outcome from Measuring developer productivity? A response to McKinsey).

On the flip side, I just made an XmR chart for my weight loss efforts so far this year and it seems perfect for that!

2 Likes

@erikwiffin: In your role, what degree of ownership do you have over impact/outcome? I ask because, depending on what your work entails, what you measure to show those will change.

If your role involves vetting stories for relevance, then your metrics might be around customer satisfaction with stories delivered and cycle time for stories. If your work is less customer-facing and more about delivering code that does what the product manager specifies, then your metrics might be around cycle time, rework rates (how many times did this have to be adjusted because we had ambiguous specs), and the like.

3 Likes

Going back to the question around this comment: "The gist of it is that most people don't see variation as a problem."

Is the challenge that they don't see variation as a problem, or more that people don't have an intuition / understanding that there should be a normal amount of variation? Because without that understanding it's even harder to explain the concept of exceptional / beyond normal variation (which is what it's really all about).

To me this is the biggest challenge I see with data analysis. Even very smart, experienced people, people who have worked with all sorts of charts, reporting, analysis, etc, will still often react to the smallest change in trends as if it's a meaningful signal to explore as opposed to just normal variation.

4 Likes

I think this is precisely it! Damn, that's a great articulation.


:rofl:

I suspect the best thing to do re: software is to make software engineers directly responsible for product outcomes. Basically teach them the process control worldview, and then empower them to materially bend the numbers that capture some form of user happiness.

I know this is easy to say, hard to do (politics and org design etc etc) but this was basically the insight behind the argument here, where early Amazon's engineers knew they were responsible for attempting to bend either price, selection, or convenience on their core flywheel.

2 Likes

If you want a quick practical guide from a software engineer who has also put SPC ideas into practice, see:

Statistical Process Control: A Practitioner's Guide

It doesn't cover all the edge cases we've found from practice, but it definitely gives you more than what I've given you in this essay.

Edited to add two observations he makes that I quite like and didn't think to make:

  1. He uses it in a software engineering self improvement / team improvement context. Perhaps because I'm such a business nerd, I did not think to use these methods in this way!
  2. He points out that using these charts a lot and internalizing the worldview changes you a little: you become a lot more aware of, and accepting of, the role of randomness and luck in reality.

Perhaps not an accident I've been talking about luck a fair bit after using SPC methods in practice last year :thinking:

4 Likes

@colin.hahn thankfully more like the first. CSAT probably won't work (we run a content business, customers wouldn't distinguish between satisfaction with content and the software that delivered that content) but cycle time sounds difficult but possible. Stories are of varying size, to the point that I'd expect that to overwhelm any process variation. Not even touching how gameable that metric is (create a bunch of extremely small stories, cycle time goes to zero).

@cedric "hard to do" is understating it! I'm trying to implement SPC for myself and my team, not change the entire org chart. I'm also not 100% sure I think that's a good idea. Product development is a useful skill for engineers, but it's not the only skill. Should the junior engineer just getting up to speed on our tech stack also need to learn product development? What about the senior engineer who is really good at building things, but doesn't care what those things are? Those are both archetypes of engineers that I would expect to see on healthy engineering teams, but I feel like they'd be driven out by making them directly responsible for product outcomes.

Thanks for the practitioner's guide though - I'll read through that and see if anything clicks.

2 Likes

Highlighting the parts from the practitioner's guide that jump out to me:

The above also leads to something known as the report card effect: if you try to aggregate too many physical processes into one summary metric, that metric will always be a stable process, meaning it loses its power as an indicator of when something goes wrong. You must look into processes in reasonable detail in order to have meaningful metrics. If you summarise too many things into one number, you average out all the useful signals into noise.
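A toy illustration of the report card effect described in that excerpt, as a minimal sketch rather than anything taken from the guide. The two metric series and their names are invented for the example; the only thing assumed from SPC practice is the usual XmR limit formula (mean plus or minus 2.66 times the average moving range).

```python
from statistics import mean

def xmr_limits(values):
    """Natural process limits for an XmR chart: mean +/- 2.66 * average moving range."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    centre = mean(values)
    spread = 2.66 * mean(moving_ranges)
    return centre - spread, centre + spread

def is_signal(value, baseline):
    """True if value falls outside the limits computed from the baseline points."""
    lower, upper = xmr_limits(baseline)
    return value < lower or value > upper

# Hypothetical baseline data: one small, tight process and one large, noisy one.
small = [10, 12, 10, 12, 10, 12, 10, 12]
large = [100, 120, 95, 115, 105, 125, 90, 110]

# A new week where the small process clearly shifts and the large one does not.
small_new, large_new = 18, 105

# The "report card" metric simply adds the two processes together.
combined = [s + l for s, l in zip(small, large)]

print(is_signal(small_new, small))                 # the detailed metric flags the shift
print(is_signal(small_new + large_new, combined))  # the summary metric does not
```

With these made-up numbers the shift shows up on the small process's own chart but disappears once it is summed with the noisier series, because the combined moving ranges widen the limits.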

Process behaviour charts are not useful only for time series. They're very common for time series, but you can also apply them to other things. ... When dealing with data points attached to people, a common trick is to order the data points alphabetically by name. This is in practise the same as ordering them at random (since we can think of names as randomly assigned to people) but looks nicer in a report.

When you have a stable process such as this, you don't have to re-compute the process limits each week. One of the defining features of a stable process is that any given week, statistically, looks like any other week. Because of this, you can just extend the process limits you have already computed indefinitely into the future.

Goals are wishful thinking [aspirations?], and do not on their own improve things - they only make things worse.

This is classic Deming. But how does this square with Amazon's OP1/OP2 goal setting? @cedric

2 Likes

cool... I pulled that articulation basically from your essay and tweets in general :grinning:

Continuing on that: For me, one of the big insights from your writing on this topic is that XmR charts are a highly effective way to explain the idea of normal variation to people that don't intuitively think that way yet. When you just use an average, it's visually a single line, and it's easier to fall for the trap that "normal" means a single line as opposed to normal being a range between two lines.
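To make "normal is a range between two lines" concrete, here is a rough sketch of how the two lines on an XmR chart are usually computed. The weekly numbers are invented for illustration, and the function names are my own; the 2.66 scaling constant is the standard one for individuals charts.

```python
from statistics import mean

def xmr_limits(values):
    """Return (centre, lower, upper) for an XmR chart of individual values."""
    if len(values) < 2:
        raise ValueError("need at least two points to compute a moving range")
    # Moving range: absolute difference between each pair of consecutive points.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    centre = mean(values)
    spread = 2.66 * mean(moving_ranges)
    return centre, centre - spread, centre + spread

def signals(new_values, lower, upper):
    """Points outside the limits, i.e. candidates for exceptional variation."""
    return [v for v in new_values if v < lower or v > upper]

# Hypothetical baseline: eight weeks of some metric.
baseline = [52, 48, 50, 51, 49, 50, 47, 53]
centre, lower, upper = xmr_limits(baseline)

# Later weeks are judged against the same limits, without recomputing them.
next_weeks = [51, 46, 62]
print(round(lower, 2), centre, round(upper, 2))
print(signals(next_weeks, lower, upper))
```

In this made-up data the average is 50, but anything between roughly 43 and 57 is routine wiggle; of the later weeks only the 62 would be worth investigating.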

3 Likes

As food for thought, here are a couple of directions you might go based on that:

  • You could try instrumentation around the elements that go into customer satisfaction. Amazon did something similar by defining what customers are looking for (low price, available, etc) and then creating ways to track that (what percentage of key products are listed at a price equal to or lower than the X key competitors, what percentage of product pages can be delivered within two days?).
  • If planning your team's workload is an issue, you could measure the variation in story size, actual vs predicted: how much time did you think this story was going to take, and how much did it actually take? This could improve the team's estimation capabilities, or identify where the team is consistently missing information in order to accurately gauge the effort needed for specific features.
  • I wouldn't worry as much about story size being gameable if you can supplement it with customer-driven metrics. If your team is constantly delivering in fast cycles (because they've gamed that metric) but there's no impact on the customer metrics, that tells a story too.
2 Likes