In this post, we will look at what probability is and how it applies in a Lean Six Sigma context. We will also talk about discrete and continuous probability distributions and look at their characteristics.
Before we delve into probability distributions, let's take a quick look at the basic concept of probability.
What is Probability?
You must have rolled a dice or flipped a coin many times. We know that the result of flipping a coin will be either heads or tails. The result of rolling a dice is any number from 1 to 6, both included.
Simply put, probability is the chance of getting heads (or tails) each time you flip a coin.
Generalizing the above example, probability is the likelihood that a particular outcome (event) will occur for a trial.
Here, ‘Trial’ is each instance when you flip the coin. The possible outcomes are Heads and Tails. ‘Event’ is the particular or specific outcome that we are looking for. Thus, probability is the likelihood that a specific event or outcome will occur in a trial. Simple.
When every outcome is equally likely, you can calculate the probability of a specific event using the formula below. You will need to know the total number of possible outcomes for each trial. (2 possible outcomes when you flip a coin and 6 possible outcomes when you roll a dice, right!)
Probability of an event = Number of outcomes that make up the event / Total number of possible outcomes
The probability of getting ‘Heads’ (event) in the next coin flip (trial) is 1/2, that is 0.5 or 50%, as there are only two possible outcomes and each is equally likely. Similarly, the probability of getting a 6 when you roll a dice is 1/6, that is 0.167 or 16.67%.
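As a quick check of the arithmetic above, here is a minimal Python sketch. The helper name `probability` is made up for this illustration, and it assumes every outcome is equally likely (a fair coin, a fair dice).

```python
from fractions import Fraction

def probability(favourable_outcomes, total_outcomes):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(favourable_outcomes, total_outcomes)

# Fair coin: 1 favourable outcome (Heads) out of 2 possible outcomes
p_heads = probability(1, 2)

# Fair dice: 1 favourable outcome (a 6) out of 6 possible outcomes
p_six = probability(1, 6)

print(float(p_heads))          # 0.5
print(round(float(p_six), 3))  # 0.167
```

Using `Fraction` keeps 1/6 exact until the final step, so the rounding only happens when the value is displayed.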
Probability in Six Sigma
We also need the basic probability concepts in Lean Six Sigma. We surely are not dealing with the process of flipping a coin or rolling a dice. However, if you think about it, the business processes are also similar.
Just replace the process of flipping a coin with a manufacturing process. Think of each unit manufactured as a trial and a defective unit as an event. Each unit has only two possible outcomes: it will either be defective or not. That does not mean the probability of a unit being defective is 1/2 or 50%, though. Unlike the two sides of a fair coin, these two outcomes are usually not equally likely, so we cannot simply count outcomes. Instead, we estimate the probability from process data.
The fun starts when you look at the units collectively, when you want to know the possible number of defective units in a given sample size. That is where probability distribution comes into the picture.
Instead of looking at just one unit independently, let's say you want to look at a sample of 100 units and calculate the probability of receiving a defective unit.
Six Sigma professionals will collect a defined sample from the manufactured units and check these units for defects. Let's say there were 2 defective units per 100 units produced. This tells us that, for every 100 units produced, we can expect about 2 defective units on average. So, the probability of receiving a defective unit from this process is estimated as 2/100, i.e. 0.02 or 2%.
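The same estimate can be sketched in a few lines of Python. The inspection results below are hypothetical and are only there to mirror the 2-in-100 example.

```python
# Hypothetical inspection results: True means the unit was found defective
inspected_units = [False] * 98 + [True] * 2

defective_count = sum(inspected_units)  # each True counts as 1
sample_size = len(inspected_units)

# Estimated probability that a unit from this process is defective
p_defective = defective_count / sample_size
print(p_defective)  # 0.02
```

Keep in mind this is an estimate from one sample. A different sample of 100 units could give a slightly different number.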
Now that we understand the basic concept of probability, let us look at Probability Distribution.
Probability Distribution
We now know that probability is the likelihood of a particular event occurring in a trial. A probability distribution, on the other hand, describes how likely each value (or range of values) of a random variable is. Confused? Let us elaborate further.
Let us assume you are in the business of packing food delivery orders in a restaurant, and you have been working there for the last 3 years. Will you know how much time it takes to pack one order? You will surely answer “Yes” to this question.
Will you also know the probability of the next order you pack taking more than 15 minutes? That would be tough to answer.
To answer this, you need to collect the packing time data for a good number of orders and build the probability distribution. Once done, it will help answer such questions quite easily.
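Here is a rough sketch of how collected data can answer the 15-minute question. The packing times are invented for illustration, and the approach simply counts what share of past orders took longer than the threshold.

```python
# Hypothetical packing times in minutes for 20 past orders
packing_times = [8.5, 12.0, 16.2, 9.8, 14.1, 18.4, 11.3, 10.7, 15.6, 13.2,
                 9.1, 12.8, 17.0, 10.2, 11.9, 13.7, 8.9, 14.8, 12.4, 10.0]

threshold = 15.0  # minutes

# Orders that took longer than the threshold
slow_orders = [t for t in packing_times if t > threshold]

# Share of past orders above the threshold, used as the probability estimate
p_over_15 = len(slow_orders) / len(packing_times)
print(p_over_15)  # 0.2
```

With real data you would want far more than 20 orders before trusting the estimate.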
We will talk about the types of probability distributions and their characteristics in the next section. Since a probability distribution describes the likelihood of the values a random variable can take, it depends on the type of random variable.
Types of Probability Distribution
In my previous post on data types for Lean Six Sigma projects, we talked about two types of data: discrete and continuous. Similarly, there are two broad types of probability distribution, depending on the data type of the random variable: discrete and continuous.
In the restaurant example, the time to pack an order is continuous data. Hence its probability distribution will be a continuous probability distribution. The results of flipping a coin or rolling a dice are discrete data. Hence the distribution you get is a discrete probability distribution.
Before we get into the details of continuous and discrete probability distributions, here are a few key points to remember.
The sum of the probabilities of all the values that a variable can take will always be equal to 1. The probability of a particular event will always be between 0 and 1, both included. It can't go above 1 (100%), and it can't go below 0 (0%), so it can never be negative.
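These two rules are easy to check in code. The sketch below uses a fair dice as the example. The function name `is_valid_distribution` is made up for this post and is not part of any library.

```python
from fractions import Fraction

# Fair dice: each of the 6 faces has probability 1/6
dice_distribution = {face: Fraction(1, 6) for face in range(1, 7)}

def is_valid_distribution(distribution):
    """Check both rules: each probability is in [0, 1] and they sum to 1."""
    probabilities = list(distribution.values())
    each_in_range = all(0 <= p <= 1 for p in probabilities)
    sums_to_one = sum(probabilities) == 1
    return each_in_range and sums_to_one

print(is_valid_distribution(dice_distribution))  # True
```

Fractions are used so that the sum is exactly 1. With ordinary decimals you would compare against 1 with a small tolerance instead.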
Continuous Probability Distribution
You get a continuous probability distribution when the random variable is continuous in nature. It is often represented by a frequency plot or histogram built from past data. Do give a read to my previous post on Histogram to understand it better.
Let us look at an example. We spoke about the time taken to pack food delivery orders. If you collect the time taken for the last 50 orders, you will get the frequency data for each interval or bin. Using this, you plot the histogram. When you connect the tops of all the bars on the histogram with a smooth curve, you get an approximation of the continuous probability distribution curve.
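To make the binning step concrete, here is a small text-only histogram. It uses a hypothetical sample of 20 packing times (fewer than the 50 in the example, to keep it short) and an assumed bin width of 2 minutes.

```python
from collections import Counter

# Hypothetical packing times in minutes
packing_times = [8.5, 12.0, 16.2, 9.8, 14.1, 18.4, 11.3, 10.7, 15.6, 13.2,
                 9.1, 12.8, 17.0, 10.2, 11.9, 13.7, 8.9, 14.8, 12.4, 10.0]

bin_width = 2  # minutes, an assumption for this sketch

# Map each time to the lower edge of its bin, e.g. 9.8 falls in the 8-10 bin
bin_counts = Counter(int(t // bin_width) * bin_width for t in packing_times)

# Print one row per bin, with one '#' per order in that bin
for lower_edge in sorted(bin_counts):
    count = bin_counts[lower_edge]
    print(f"{lower_edge:>2}-{lower_edge + bin_width:<2} min | {'#' * count} ({count})")
```

The row of `#` marks for each bin plays the role of a histogram bar. Dividing each count by the total number of orders would turn the frequencies into estimated probabilities.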
Continuous Probability Distribution Calculations
The total area under this curve is always equal to 1. In a perfect normal distribution, the mean is at the center of the curve, so the probability of an order taking more than the mean time for packing is 0.5 or 50%. The probability of the order taking less than the mean time is also 0.5 or 50%.
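The 50/50 split around the mean can be checked with Python's built-in `statistics.NormalDist`. The mean of 12 minutes and standard deviation of 3 minutes are assumed values, chosen only for illustration.

```python
from statistics import NormalDist

# Assumed model of packing time: mean 12 minutes, standard deviation 3 minutes
packing_time = NormalDist(mu=12, sigma=3)

# cdf(x) gives the probability of a value less than or equal to x
p_below_mean = packing_time.cdf(12)
p_above_mean = 1 - p_below_mean

print(p_below_mean)  # 0.5
print(p_above_mean)  # 0.5
```

The split stays at 0.5 on each side whatever mean and standard deviation you pick, because the normal curve is symmetric around its mean.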
However, for a continuous probability distribution, the probability of the random variable taking one specific value is effectively zero. This is because the random variable, being continuous in nature, can take any of infinitely many possible values. The time taken for one particular order can be any number of minutes above 0, measured as finely as you like. Hence you are looking at the probability of one exact outcome from infinitely many possible outcomes.
Hence statisticians always calculate the probability of a range of values for a continuous variable and not of a specific value. This means you can calculate the probability of the next order taking 3 to 4 minutes to pack, or more than 4 minutes, or less than 3 minutes, and so on. But asking for the probability of an order taking exactly 3 minutes is not useful, as the answer is effectively zero.
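Here is a minimal sketch of such range calculations, again with `statistics.NormalDist`. It assumes, purely for illustration, that packing time follows a normal distribution with a mean of 3.5 minutes and a standard deviation of 0.5 minutes.

```python
from statistics import NormalDist

# Assumed model, for illustration only
packing_time = NormalDist(mu=3.5, sigma=0.5)

# Probability of a range = area under the curve between the two limits
p_3_to_4 = packing_time.cdf(4) - packing_time.cdf(3)
p_more_than_4 = 1 - packing_time.cdf(4)
p_less_than_3 = packing_time.cdf(3)

# A "range" from 3 to 3 has no width, so it has no area
p_exactly_3 = packing_time.cdf(3) - packing_time.cdf(3)

print(round(p_3_to_4, 4))       # 0.6827
print(round(p_more_than_4, 4))  # 0.1587
print(round(p_less_than_3, 4))  # 0.1587
print(p_exactly_3)              # 0.0
```

The three ranges cover every possible packing time, so their probabilities add up to 1.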
There are multiple distributions under continuous probability distribution. The shape of the distribution curve indicates which type of distribution the data follows.
Some of the well-known and common distributions are the Normal, Weibull and Lognormal distributions. The Normal distribution is the most common and most important continuous distribution for Lean Six Sigma practitioners. It is discussed in detail in a separate post.
Discrete Probability Distribution
When you are dealing with a discrete random variable, you will get a discrete probability distribution.
The variable assigned to the result of flipping a coin or rolling a dice is discrete in nature. It can take only a fixed set of values. The same goes for the number of units produced every day. These will give you discrete probability distributions.
As we saw earlier, continuous probability distributions are always displayed in the form of a curve. This is because the variable can take any of infinitely many possible values. And the total area below this curve is equal to 1.
However, for discrete probability distributions, the values that the variable can take are distinct and countable. For rolling a dice, there can only be one of 6 values. And flipping a coin can only result in 2 outcomes.
Thus, discrete probability distributions are not represented in the form of a curve. They can simply be put in a tabular format. Each possible outcome will have a non-zero probability. And the sum of the probabilities of all the outcomes will be equal to 1.
Some of the well-known discrete probability distributions that you can use are the Poisson distribution, Binomial distribution and Uniform distribution.
The Poisson distribution is usually useful for discrete count data (such as the number of defects per unit), the Binomial distribution for discrete binary data (such as defective or not defective) and the discrete Uniform distribution for outcomes that are all equally likely (such as the faces of a dice).
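To see two of these in action, the sketch below reuses the earlier example of a 2% defect probability and a sample of 100 units. The function names `binomial_pmf` and `poisson_pmf` are written for this post using the standard textbook formulas. The Poisson rate of 2 is an assumption (100 units x 2% = 2 defective units expected on average).

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """Probability of exactly k events in n independent trials, each with probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, rate):
    """Probability of exactly k events when `rate` events are expected on average."""
    return exp(-rate) * rate**k / factorial(k)

n_units = 100       # sample size from the earlier example
p_defective = 0.02  # estimated defect probability from the earlier example
expected_defectives = n_units * p_defective  # 2.0, used as the Poisson rate

# Tabular view: probability of finding exactly k defective units in the sample
print("k  binomial  poisson")
for k in range(5):
    b = binomial_pmf(k, n_units, p_defective)
    q = poisson_pmf(k, expected_defectives)
    print(f"{k}  {b:.4f}    {q:.4f}")
```

The two columns come out close to each other here, which is why the Poisson distribution is often used as a shortcut when the sample is large and the event is rare.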
Here is another great resource to further understand statistical distributions for Lean Six Sigma from ISIXSIGMA. Do give it a read.
By the way, do check out the Certified Lean Six Sigma Black Belt Handbook. It is one of the most essential guides for anyone trying to get certified as an LSS Black Belt, or who in general wants to understand LSS and improve processes.
Thus probability plays an important role in the Lean Six Sigma context as well. I would like to know if you have encountered any instance where a probability distribution helped, or if you have any questions. Do share them in the comments below.
Sachin Naik
Passionate about improving processes and systems | Lean Six Sigma practitioner, trainer and coach for 14+ years consulting giant corporations and fortune 500 companies on Operational Excellence | Start-up enthusiast | Change Management and Design Thinking student | Love to ride and drive