About this lesson
The normal distribution charts the type of variability in a process parameter that is being measured when the only cause for variation is natural random physical effects. It's the desired distribution when improving a process since it delivers a predictable level of process performance.
Quick reference
Normal Distribution
The normal distribution charts the type of variability in a process parameter that is being measured when the only cause for variation is natural random physical effects. As such, this distribution is the desired distribution when improving a process since it delivers a predictable level of process performance.
When to use
The normal distribution is used to analyse data during the Analyse phase. It is also used to verify that the only variation remaining in the solution developed in the Improve phase is random variation, and to track process performance during the Control phase as part of statistical process control.
Instructions
The normal distribution, or Gaussian distribution, is the best representative of random variation in a process input or output. This distribution is often referred to as the bell-shaped curve. Its characteristics are that the distribution is symmetric, with a peaked center and the upper and lower tails approaching zero.
The curve can be described with a standard deviation scale. With the “0 value” of the scale set at the center of the curve, each span of standard deviations contains a known percentage of the data values: about 68% within ±1 standard deviation, 95.45% within ±2, and 99.73% within ±3.
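As a rough illustration of the standard deviation scale, the share of a normal distribution lying within plus or minus k standard deviations of the mean can be computed as erf(k/√2), where erf is the error function. The sketch below uses only Python's standard library; the function name `coverage_within` is a hypothetical helper introduced for this example, not part of the lesson.

```python
import math


def coverage_within(k: float) -> float:
    """Fraction of a normal distribution within +/- k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2.0))


if __name__ == "__main__":
    # Print the coverage for the 1 through 6 sigma spans.
    for k in range(1, 7):
        print(f"+/-{k} sigma: {coverage_within(k) * 100:.7f}%")
```

Running the loop should reproduce the percentages quoted for each sigma level, to within rounding.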
The statistical techniques associated with statistical process control and hypothesis testing will rely heavily on the use of normal curves. Both of those topics are covered in more depth in other GoSkills programs.
The formula for this curve is:
$$ f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}} $$
Where σ is the standard deviation and μ is the mean value.
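A minimal sketch of evaluating the curve numerically, assuming the usual normal density f(x) = exp(-(x - μ)² / (2σ²)) / (σ√(2π)). The helper name `normal_pdf` is hypothetical; the comparison uses `statistics.NormalDist` from Python's standard library (Python 3.8 or later).

```python
import math
from statistics import NormalDist


def normal_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Height of the normal curve at x for mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma  # distance from the mean in standard deviations
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


if __name__ == "__main__":
    # Compare the hand-written formula with the standard library's version.
    for x in (-2.0, 0.0, 1.5):
        print(x, normal_pdf(x), NormalDist(0.0, 1.0).pdf(x))
```

The two columns should agree, which is a quick check that the formula has been transcribed correctly.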
Hints & tips
- Your data will probably not be a perfect normal curve. However, the Hypothesis testing course will show how to determine if a normal curve is a good approximation.
- 00:05 Hi, I'm Ray Sheen, and I've already used the term normal distribution several
- 00:09 times in other modules, and I'll be using it a lot more throughout the course.
- 00:13 So let's explain what this really means.
- 00:16 The normal distribution is at the heart of statistical process control,
- 00:21 and therefore is of major importance to Lean Six Sigma.
- 00:24 The normal distribution is what you probably refer to as
- 00:28 the bell-shaped curve.
- 00:29 It is a distribution of data values for a process parameter.
- 00:33 The distribution is symmetrical with a peaked center.
- 00:36 This is also known as the Gaussian curve, named after Carl Gauss,
- 00:40 who was a prominent mathematician in the early 1800s.
- 00:44 To be a little more precise, since the normal curve is symmetric,
- 00:49 there are an equal number of data points both above and
- 00:53 below the center value or mean, which is peaked.
- 00:56 This would not be true of a skewed distribution.
- 01:00 There's only one peak, and it is at the center of the curve.
- 01:04 There are not multiple peaks, nor is the top of the curve flat.
- 01:08 And the ends of the curve approach the value of zero;
- 01:12 theoretically the curve never reaches zero, although from a practical standpoint it does.
- 01:17 The area under the curve represents all of the data points in the dataset.
- 01:21 This is what we mean by the form of a data distribution.
- 01:25 It is the shape of the curve that covers all the data points.
- 01:30 The reason this curve is so important is that it represents normal
- 01:35 random variation or uncertainty in a process.
- 01:38 All the special cause effects are removed, and
- 01:41 I'll talk about that more in another lesson.
- 01:43 What's left is the random uncertainty,
- 01:46 which will inevitably take on the shape of this normal curve.
- 01:49 Remember, in Lean Six Sigma, we want to remove variation.
- 01:53 So our goal is to remove the special cause variation and
- 01:57 limit the amount of the remaining normal random variation.
- 02:02 Let's look at some of the very important characteristics of this curve.
- 02:06 For this illustration, I'll set the mean or
- 02:09 center point to the 0 value on our horizontal scale.
- 02:12 That horizontal scale will use the units of sigma or standard deviation.
- 02:18 Based upon the shape of a normal curve,
- 02:21 if we were to draw lines at one standard deviation above and below the mean,
- 02:26 68% of all the data points in our data set would fall within those two lines.
- 02:31 So roughly two-thirds of the data values in the standard normal curve,
- 02:36 meaning random variation,
- 02:37 are within one standard deviation of the average value of the data set.
- 02:43 Let's jump out to two standard deviations.
- 02:46 Now, I can see that 95.45% of the data points in the data set will
- 02:51 fall between -2 standard deviations and +2 standard deviations.
- 02:56 The range from -3 standard deviations to +3 standard deviations is a significant point for
- 03:01 us to consider.
- 03:02 You can see that 99.73% of all the data points are within that range.
- 03:08 That means that from the standpoint of random variation,
- 03:11 we should only see about three data points out of a thousand that go beyond these lines.
- 03:15 The reason the three sigma limits are of interest to us is the process
- 03:20 capability ratios and statistical process control strategies.
- 03:24 These were developed by Walter Shewhart, and were based upon
- 03:28 achieving this plus or minus 3 standard deviation level of performance.
- 03:32 More about that in our course on statistical process control.
- 03:36 Moving on out to +4 sigma to -4 sigma level, as you can see,
- 03:41 we're at a point now where 99.9937% of the data values are between those points.
- 03:49 Continuing out to plus or minus 5 sigma,
- 03:52 we go up to 99.999943% of all the data values.
- 03:56 And finally, at plus or minus 6 sigma, 99.9999998% are under the curve.
- 04:03 For all intents and purposes in the real world, that means everything.
- 04:08 We'll come back to these numbers in the Statistical Process Control course, and
- 04:13 focus in on what they mean for process stability and process control.
- 04:17 One last point, if you need the equation for the curve, here it is.
- 04:22 Let me wrap this up with some comments about the characteristics of a normal
- 04:26 distribution.
- 04:27 The normal distribution is an excellent model for random variation in nature.
- 04:32 This could apply to random variation on the inputs to a process,
- 04:36 such as variation in materials or environmental conditions.
- 04:40 It could also represent the random variation in the output of the process.
- 04:44 That could be variation in timing, quality, or costs.
- 04:48 This variation has no clearly assignable cause, it's just part of the system.
- 04:54 Now, if you go back and follow the item through the process, you can find
- 04:58 several factors that lead to a particular final value of a process parameter, but
- 05:03 those factors can and do vary slightly.
- 05:05 And if the variation is within the normal range, it's called random variation.
- 05:10 We'll talk a lot more about this in the session on special cause and common cause.
- 05:15 The bottom line is that in order to change the level of variation,
- 05:19 there needs to be a fundamental change in the process, but
- 05:23 a great characteristic of this type of variation is that it is predictable.
- 05:28 We can calculate a mean and a standard deviation.
- 05:31 As I showed on the last slide, we can with confidence predict what percentages of
- 05:36 the process output will fall within the different standard deviations or
- 05:41 sigma levels.
- 05:42 This predictable center, spread, and shape will allow us then to set up
- 05:46 statistical process control charts to track our process performance and
- 05:50 be able to detect when abnormal conditions are occurring.
- 05:54 The normal distribution is used throughout the Lean Six Sigma process for
- 05:59 both analysis and control.
- 06:00 It is the most commonly found distribution,
- 06:06 and we will be referring to it often.
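The closing points of the transcript (calculate a mean and a standard deviation, then watch for values outside the expected range) can be sketched as follows. This is a simplified illustration of the plus or minus 3 sigma idea, not a full control chart: the measurements are made up, the function names `three_sigma_limits` and `flag_abnormal` are hypothetical, and the sample standard deviation is used here for simplicity, whereas production control charts often estimate sigma in other ways (for example from moving ranges).

```python
from statistics import mean, stdev


def three_sigma_limits(baseline):
    """Return (lower, center, upper) at mean +/- 3 sample standard deviations."""
    center = mean(baseline)
    sigma = stdev(baseline)
    return center - 3 * sigma, center, center + 3 * sigma


def flag_abnormal(values, lower, upper):
    """Return the values that fall outside the given limits."""
    return [v for v in values if v < lower or v > upper]


if __name__ == "__main__":
    # Made-up baseline measurements from a process assumed to be stable.
    baseline = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9, 10.0, 10.0]
    lower, center, upper = three_sigma_limits(baseline)
    # Made-up new measurements; 11.5 is deliberately far from the baseline.
    new_values = [10.05, 9.95, 11.5, 10.1]
    print(f"limits: {lower:.3f} to {upper:.3f} (center {center:.3f})")
    print("flagged:", flag_abnormal(new_values, lower, upper))
```

Under random variation alone, only about three points in a thousand would be expected to land outside such limits, so a flagged point is a prompt to look for a special cause rather than proof of one.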