Becoming Data Driven, From First Principles

Metadata

Author:: Cedric Chin, Commoncog
Full Title:: Becoming Data Driven, From First Principles
Category:: 🗞️Articles
URL:: https://news.dataelixir.com/t/t-l-vuuitc-yuuitdetk-k/
Finished date:: 2024-01-31

Highlights

But it’s really the principles of Statistical Process Control (and/or Continuous Improvement; pick your poison) that you want to internalise, because it is those principles that led to the WBR, and it is through those principles that you can come up with equivalently powerful mechanisms of your own (View Highlight)

Statistical Process Control (SPC) pioneer W. Edwards Deming has this thing where he says “He that would run his company on visible figures alone will soon have neither company nor visible figures to work with.” But Deming is also famous for saying “In God we trust. All others bring data.” (View Highlight)

New highlights added 2024-02-03

Knowledge is ‘theories or models that allow you to predict the outcomes of your business actions.’ (View Highlight)

Knowledge is more conservative than truth. It is therefore somewhat safer. Knowledge can change; truth is expected to be static. More importantly, knowledge is evaluated based on predictive validity alone. (View Highlight)

the purpose of data is to give you a causal model of your business in your head (View Highlight)

large change is not necessarily worth investigating, and a small change is not necessarily benign (View Highlight)

you’ve just invented the Amazon-style WBR. (View Highlight)

🗣️ A chart from the 40s is all you need

The PBC goes by many names: in the 1930s to 50s, it was called the ‘Shewhart chart’, named after its inventor — Deming’s mentor — the statistician Walter Shewhart. From the 50s onwards it was mostly called the ‘Process Control Chart’. We shall use the name ‘Process Behaviour Chart’ in this essay, which statistician Donald Wheeler proposed in his book Understanding Variation, out of frustration with decades of student confusion. (View Highlight)

Of all the PBCs, the most ubiquitous (and useful!) chart is the ‘XmR chart’. This is so named because it consists of two charts: an ‘X’ chart (where X is the metric you’re measuring), and a Moving Range chart. (View Highlight)
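A minimal sketch of the two series an XmR chart plots, assuming the usual definition of a moving range as the absolute difference between consecutive points. The `weekly_signups` numbers and the `moving_ranges` helper are made up for illustration and do not come from the article.

```python
from typing import List, Sequence


def moving_ranges(values: Sequence[float]) -> List[float]:
    """Absolute difference between each point and the point before it."""
    return [abs(b - a) for a, b in zip(values, values[1:])]


# Hypothetical metric: one value per week.
weekly_signups = [52, 48, 55, 50, 47, 53]

x_chart = list(weekly_signups)             # the 'X' chart plots the metric itself
mr_chart = moving_ranges(weekly_signups)   # the 'mR' chart plots point-to-point change

print("X :", x_chart)
print("mR:", mr_chart)
```

The mR series always has one point fewer than the X series, because the first observation has no predecessor to compare against.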

New highlights added 2024-02-03

I suspect that if you dig deeper into this, the problem is the delay between the action and seeing the effects.

Exceptional (or special) variation here means that something meaningful has changed. If a process shows exceptional variation in response to some change we’ve made, it means that our change has worked, and the process will perhaps shift to some new pattern of routine variation. But if we suddenly see some special variation that is unexpected (i.e. not the result of any change that we’ve made), then the process is unpredictable — there is something unknown that is going on, and we need to investigate. (View Highlight)

Well, an unpredictable process means that there is some exogenous factor that you’re not accounting for, that will interfere with your attempts to improve your process. Go figure out what that factor is first, and then change your process to account for it. (View Highlight)

all processes show some amount of routine variation, yes? We may characterise this variation as drawing from some kind of probability distribution. What kind of probability distribution? We do not care (View Highlight)

What the XmR chart does is to detect the presence of more than one probability distribution in the variation observed in a set of data (View Highlight)

Also: if some unexpected, external event impacts your process, we may also say that we’re now drawing from some other probability distribution at the same time. (View Highlight)

The XmR chart does this detection by estimating three sigma around a centre line. Shewhart chose these limits for pragmatic reasons: he thought that it was good enough to detect the presence of a second (or third, etc) probability distribution (View Highlight)
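A rough sketch of how the limit lines are usually computed. The highlight only says "three sigma around a centre line"; the scaling constants 2.66 and 3.268 below are the standard XmR constants from the process control literature, not something quoted in these highlights, and `xmr_limits` is a hypothetical helper name.

```python
from statistics import mean
from typing import Dict, Sequence


def xmr_limits(values: Sequence[float]) -> Dict[str, float]:
    """Estimate XmR limit lines from the average moving range."""
    if len(values) < 2:
        raise ValueError("need at least two points to form a moving range")

    ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    centre = mean(values)
    average_range = mean(ranges)

    return {
        "centre": centre,
        # 2.66 is 3 / 1.128: it converts the average moving range
        # into an estimate of three sigma for individual values.
        "upper": centre + 2.66 * average_range,
        "lower": centre - 2.66 * average_range,
        # The mR chart gets its own centre line and upper limit.
        "mr_centre": average_range,
        "mr_upper": 3.268 * average_range,
    }


# Made-up data, for illustration only.
limits = xmr_limits([52, 48, 55, 50, 47, 53])
for name, value in limits.items():
    print(f"{name:>9}: {value:.2f}")
```

For a metric that cannot go negative, a computed lower limit below zero would normally be ignored or clamped; that choice is left out of the sketch.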

The other two detection rules are run-based (they depend on sequential data points in a time series) and are designed to detect the presence of moderately different probability distributions. For the vast majority of real world distributions, XmR charts will have a ~3% false positive rate. This is more than good enough for business experimentation. (View Highlight)
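The highlight does not spell the rules out, so the sketch below assumes the three rules as they are commonly stated for XmR charts: a point outside the limits, three of four consecutive points closer to a limit than to the centre line, and eight consecutive points on one side of the centre line. `detect_signals` is a hypothetical helper, and the limits are passed in rather than computed.

```python
from typing import Dict, List, Sequence


def detect_signals(
    values: Sequence[float], centre: float, lower: float, upper: float
) -> Dict[str, List[int]]:
    """Return the indices at which each (assumed) detection rule fires."""
    signals: Dict[str, List[int]] = {
        "outside_limits": [],
        "three_of_four_near_limit": [],
        "eight_on_one_side": [],
    }

    # Rule 1: a single point beyond either limit line.
    for i, v in enumerate(values):
        if v > upper or v < lower:
            signals["outside_limits"].append(i)

    # Rule 2 (run-based): 3 of 4 consecutive points closer to the same
    # limit than to the centre line. Reported at the window's last index.
    upper_mid = (centre + upper) / 2
    lower_mid = (centre + lower) / 2
    for end in range(3, len(values)):
        window = values[end - 3 : end + 1]
        near_upper = sum(v > upper_mid for v in window)
        near_lower = sum(v < lower_mid for v in window)
        if near_upper >= 3 or near_lower >= 3:
            signals["three_of_four_near_limit"].append(end)

    # Rule 3 (run-based): 8 consecutive points on one side of the centre line.
    for end in range(7, len(values)):
        window = values[end - 7 : end + 1]
        if all(v > centre for v in window) or all(v < centre for v in window):
            signals["eight_on_one_side"].append(end)

    return signals


# Made-up series with made-up limits, for illustration only.
series = [50, 49, 51, 65, 50, 52, 51, 53, 52, 51, 54, 52, 51]
print(detect_signals(series, centre=50, lower=40, upper=60))
```

The run-based rules are what catch a small but sustained shift that never crosses a limit line, which is the "moderately different probability distributions" case the highlight describes.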

Sidenote: it is important to note that you should not use a standard deviation calculation to get your limit lines. A standard deviation calculation assumes that all the variation observed is drawn from one probability distribution … which defeats the whole point of an XmR chart! The XmR chart does not assume this; its whole purpose is to test for homogeneity. Wheeler takes great pains to say that you need to estimate your limits from a moving range (View Highlight)
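To see why this matters, here is a small sketch with made-up data in which the process clearly shifts halfway through. It compares limits built from a global standard deviation with limits built from the average moving range (using the standard 2.66 scaling constant, which is an assumption not stated in the highlight).

```python
from statistics import mean, stdev

# Made-up data: the process jumps from around 10 to around 20.
before = [10, 11, 9, 10, 11, 9, 10, 11, 9, 10]
after = [20, 21, 19, 20, 21, 19, 20, 21, 19, 20]
values = before + after

centre = mean(values)

# Approach A: three global standard deviations. The shift itself is
# counted as ordinary spread, so the limits end up very wide.
sd_half_width = 3 * stdev(values)

# Approach B: average moving range. Point-to-point differences are
# mostly small, so the single jump barely moves the estimate.
ranges = [abs(b - a) for a, b in zip(values, values[1:])]
mr_half_width = 2.66 * mean(ranges)


def points_outside(half_width: float) -> list:
    """Values further from the centre line than the given half-width."""
    return [v for v in values if abs(v - centre) > half_width]


outside_sd = points_outside(sd_half_width)
outside_mr = points_outside(mr_half_width)

print(f"std-dev limits:      +/-{sd_half_width:.2f}, flagged {len(outside_sd)} points")
print(f"moving-range limits: +/-{mr_half_width:.2f}, flagged {len(outside_mr)} points")
```

On this data the standard deviation limits flag nothing, while the moving range limits flag points on both sides of the shift, which is the homogeneity test doing its job.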

Toyota no longer uses them. Many divisions in Amazon do not use them. And in fact, as we’ll soon see, the Amazon WBR may draw from the process control worldview, but it does not use process behaviour charts of any type (View Highlight)

Once you understand variation, you’ve basically unlocked the process control worldview; you’re well on your way to becoming data driven (View Highlight)

Colin told me that if you looked at operational data often enough, and for long enough, you would be able to recognise the difference between routine and special variation. You would feel the seasonality in your bones. “Human beings are very good pattern matching machines,” he said. “You should let humans do what they’re good at.” (View Highlight)
