What is Rational Sampling?
In this article I will focus on Shewhart's Third Foundation, Rational Sampling. To keep the article short, I will leave rational subgrouping for a later date or cover it on request.
In my years in operations and consulting, I've seen varying degrees of success in implementing statistical process control. The Shewhart Chart, more commonly referred to as the "Process Control Chart", is a particular challenge. In fact, of the training material I've reviewed, 100% offers little to no practical "how-to" explanation of rational sampling, and about 90% offers little practical explanation of rational subgrouping.
That's ....
100%: No rational sampling practice guide or how-to.
90%: No rational subgrouping practice guide or how-to.
Example excerpts I’ve observed from training material for rational sampling and subgrouping:
.... it's the process of selecting a subgroup based upon “logical” grouping criteria or statistical considerations.
.... wherever possible, rational sampling should consider natural breakpoints and the central limit theorem in selecting a sample size.
.... it's a process sampling approach with an aim toward minimizing “within group” variation in order to maximize “between group” variation.
.... it's when all of the items in the subgroup are produced under conditions in which only random effects are responsible for the observed variation.
.... it's when data exhibits only common cause variation within subgroups and special cause variation (if it exists) between groups.
While there are great statistical courses out there, these materials are often delivered through a statistician's lens and not an organization's process lens. Below are two of my favorite approaches to staying rational in the sampling world of statistical process control and predictive analytics.
But before we jump in, I would like you to partake in a 15-second exercise. In the image below, imagine you are driving with your eyes closed towards your next destination. Life is good and it's a beautiful day for a drive in the country. What considerations would compel you to open your eyes?
If you said you would open your eyes before "a fork" in the road or before the road "changes" direction as in the image above, you're absolutely right. The key thought here is when things change. And ideally you want to open your eyes before "shift" happens.
Statistical process control charting is like an organizational process driving with its eyes closed. Sampling is the act of opening the eyes of the process and taking a look around with data. Ideally, this opening of the process's eyes should come before the bend in the road. In other words, knowing when to look, that is, when to collect the data, is more important than the data itself. Knowing the why behind the when in your sampling strategy is a critical success factor in statistical process control.
Let's return to the rational sampling argument above. As you'll probably agree, most of the training for statistical process control is limited to the statistical construct of the chart rather than the behavior that the chart is supposed to monitor. That's analogous to trying to learn how to play baseball by learning how to keep the box score.
Let's dig deeper by introducing the first part of Shewhart's Third Foundation, rational sampling, and see if we can unpack what Shewhart was thinking on this topic.
Walter Shewhart believed that sampling ….
…. is rational if it is frequent enough to actually monitor the changes in the process.
Let's go back to the driving exercise in Figure 1 above. If you're not monitoring the road for changes, you're going to get off track. Similarly, in manufacturing or back office processes, monitoring changes to the moving parts of the process is essential to assuring your process is staying in its lane. Let's break it down further.
Take a process of interest and create a fishbone like the image in figure 2 below. During a garden variety work period like day shift or an afternoon shift, document the process elements on a fishbone. Yes, actually use people's names, procedures, lot numbers, etc. This is not a brainstorming exercise. There should be no mystery of what is happening in the process.
Take the meaningful elements from the fishbone and insert them in the left column of table 1 below. This can be done virtually using a spreadsheet or on flip-chart paper in a meeting room. What goes in the table isn't theory. It's important to have knowledgeable resources when completing this table. The middle column should capture how the moving parts could change. The "when" column captures the common frequency of when this happens. When complete, this column should give you a sense of when to collect data based on key changes in the process.
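As a rough illustration, the table can be held as plain data and queried for the fastest-changing element. This is only a sketch of one possible reading of "frequent enough": the rows, the field names, and the rule "sample at least as often as the fastest change" are assumptions made for the example, not values or rules from Shewhart or from a real process.

```python
from dataclasses import dataclass


@dataclass
class ProcessElement:
    name: str                     # left column: element taken from the fishbone
    how_it_changes: str           # middle column: how this moving part could change
    hours_between_changes: float  # "when" column, expressed here in hours


# Hypothetical rows, for illustration only.
table = [
    ProcessElement("Operator", "shift handover", 8.0),
    ProcessElement("Raw material lot", "new lot loaded", 4.0),
    ProcessElement("Machine setup", "changeover to next job", 12.0),
]


def suggested_sampling_interval(rows):
    """Return the shortest change interval in hours.

    Assumed rule of thumb: sample at least as often as the
    fastest-changing element in the table.
    """
    return min(r.hours_between_changes for r in rows)


def elements_changing_within(rows, interval_hours):
    """Names of elements expected to change at least once in the interval."""
    return [r.name for r in rows if r.hours_between_changes <= interval_hours]


print(suggested_sampling_interval(table))    # 4.0
print(elements_changing_within(table, 8.0))  # ['Operator', 'Raw material lot']
```

If the team can only afford to sample every 8 hours, the second function lists which moving parts may have changed between two looks, which is exactly the ambiguity a tighter interval would remove.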
Of course this is the ideal case. Walter Shewhart famously distinguished two sources of variation in a process based on the changes in the middle column: chance cause and assignable cause, better known today by Deming's terms, common cause and special cause.
However, here's the reality. There is no such thing as common cause variation or special cause variation. If you're running a business or enterprise the two real sources of variation are:
Variation (process behaviors) you choose to control.
Variation (process behaviors) you choose not to control.
What's important to understand is the cost of appraisal versus the risk of not detecting a change or ignoring a change that could shift process performance negatively.
This short exercise shouldn't take long. As I said above, there should be no mystery to what is happening in the business. Let's take a look at the next statement.
Walter Shewhart believed that sampling ….
…. should be done in such a way as to preserve the information they contain about the process output.
Here's the key question you should be able to answer 100% of the time. What would change in the middle column of table 1 in between sampling intervals? This should not be a mystery. Remember, you're trying to monitor behavior changes through the process control chart. Preserving the information allows you to know exactly what behavior to address or take action on.
Let's do another exercise. In Figure 3 below, there are two identical processes producing the exact same output. The data for these parallel processes are being aggregated into the same data set. Is this rational sampling? Are we preserving the information about the process? Not really, unless you have an enterprise-wide SPC application in play. If both data sets were aggregated and plotted on an SPC chart and there was an out of control point on the chart, we wouldn't be able to distinguish the source of the special (assignable) cause. We would have to investigate both Team A and Team B process behaviors.
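A small sketch makes the loss of information concrete. The measurements, the team labels as tuple fields, and the fixed control limits below are all hypothetical; the only point is that a record which keeps its source can be traced, while a pooled value cannot.

```python
from collections import defaultdict

# Hypothetical measurements; each record keeps its source team.
records = [
    ("A", 10.1), ("B", 9.9), ("A", 10.0), ("B", 10.2),
    ("A", 9.8), ("B", 12.5), ("A", 10.1), ("B", 10.0),
]

# Assumed control limits, chosen only for this illustration.
LCL, UCL = 9.0, 11.0


def out_of_control_by_source(rows, lcl, ucl):
    """Group out-of-limit values by the team that produced them."""
    flagged = defaultdict(list)
    for source, value in rows:
        if value < lcl or value > ucl:
            flagged[source].append(value)
    return dict(flagged)


# Aggregated view: the signal is visible but its source is gone.
pooled = [value for _, value in records]
pooled_flags = [v for v in pooled if v < LCL or v > UCL]

# Stratified view: the same signal, traced to one team.
by_source = out_of_control_by_source(records, LCL, UCL)

print(pooled_flags)  # [12.5]
print(by_source)     # {'B': [12.5]}
```

Both views detect the same point, but only the stratified one says where to look.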
Let's summarize Rational Sampling. Figure 4 below shows two consecutive sampling intervals. According to the image, the calculated value for Subgroup 12 is outside of the upper control limit. In other words, there's a change in the process. If you're performing rational sampling you should know exactly what changed in the table above so you can take action on the "assignable causes".
I hope this article revealed some insights into the importance of rational sampling. It is foundational to almost everything we do - observing behaviors that matter. A key point to remember:
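For readers who want to see how a subgroup ends up outside the upper control limit, here is a minimal sketch of the usual X-bar chart calculation, with limits set at the grand mean plus or minus A2 times the average range. The data are invented for the example, and A2 = 0.577 is the standard chart constant for subgroups of five; nothing here is taken from Figure 4.

```python
from statistics import mean

A2 = 0.577  # standard X-bar chart constant for subgroups of size 5


def xbar_limits(subgroups):
    """Return (LCL, center line, UCL) from baseline subgroups of size 5."""
    xbars = [mean(s) for s in subgroups]
    ranges = [max(s) - min(s) for s in subgroups]
    center = mean(xbars)
    rbar = mean(ranges)
    return center - A2 * rbar, center, center + A2 * rbar


def is_out_of_control(subgroup, lcl, ucl):
    """True when the subgroup mean falls outside the control limits."""
    m = mean(subgroup)
    return m < lcl or m > ucl


# Hypothetical baseline subgroups, for illustration only.
baseline = [
    [10.0, 10.2, 9.9, 10.1, 9.8],
    [10.1, 9.9, 10.0, 10.2, 9.9],
    [9.8, 10.0, 10.1, 9.9, 10.2],
    [10.0, 10.1, 9.9, 10.0, 10.1],
]
lcl, center, ucl = xbar_limits(baseline)

# A hypothetical later subgroup, taken after something in the process changed.
new_subgroup = [10.4, 10.5, 10.3, 10.6, 10.4]
print(is_out_of_control(new_subgroup, lcl, ucl))  # True
```

The chart only says that something changed. Knowing what changed between the two samples still has to come from the table of process elements.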
SPC is a tool to monitor what you already know.
If you enjoyed this article, click the like button and share with your network. If you would like more information on Statistical Process Control or related initiatives, contact us at succeed@cikata.com. Don’t forget to click on our social links for the latest.