About this lesson
One of the most important criteria for selecting a hypothesis test is whether the data being analyzed is normal or not normal. The normality question does not prove or disprove the hypothesis; rather, it determines the type of statistical test that should be performed. This lesson reviews the concept of normality and how to determine it.
Quick reference
Normal Distribution
Hypothesis tests can be done with either normal or non-normal data, but different tests are used for each. Therefore, a Lean Six Sigma team must be able to determine whether its data is normal or non-normal so that it can choose the correct hypothesis test.
When to use
Before conducting the hypothesis test, always check whether the data is normal or non-normal so that you can choose the correct test.
Instructions
The normal distribution, which is also called the Gaussian distribution or the bell-shaped curve, is characterized by a symmetric distribution. There are as many data points above the mean as below the mean. Also, there is a central tendency. The points are clustered near the mean. That is why when it is graphed, the center is high and the edges or tails are very small and approach zero.
A normal distribution often describes the random variation that occurs within a physical system.
Hypothesis testing can be done with either normal or non-normal data. There are different tests that are done depending on the type of data. That is why this is a key question that is asked in the Hypothesis Testing Decision Tree. Depending upon this answer, a completely different set of tests will be involved.
Normality can be assessed using basic descriptive statistics of the data sample. When doing that check, several statistics are calculated:
- Mean – the average of all the data points. This is often used in hypothesis tests with normal data.
- Median – the midpoint of the data points. This is often used in hypothesis tests with non-normal data.
- Standard Deviation – a measure of the spread or width of the distribution. This measure and Variance, which is the standard deviation squared, are often used in hypothesis testing.
- Skewness – a measure of symmetry. A symmetrical distribution will have a skewness value of zero. For this check, the distribution is considered normal as long as the skewness value is between -0.8 and +0.8.
- Kurtosis – a measure of whether the tails are “heavy” or “light.” When they are light, they taper down to near zero on the upper and lower edges of the distribution. Kurtosis can be measured in several ways. The method used in Excel is “Sample Excess Kurtosis.” This measure has the advantage that a normal curve scores zero, just as with skewness. As with skewness, values from -0.8 to +0.8 are considered normal.
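For readers who want to see the arithmetic behind these statistics, here is a minimal Python sketch. It assumes the bias-corrected sample formulas documented for Excel's SKEW and KURT worksheet functions, and it applies the -0.8 to +0.8 rule described above. The function names `descriptive_stats` and `looks_normal` and the sample values are made up for illustration; they are not part of the lesson's exercise files.

```python
import statistics


def descriptive_stats(data):
    """Summary statistics used for the normality check.

    Skewness and kurtosis follow the sample formulas documented for
    Excel's SKEW and KURT functions (an assumption of this sketch).
    """
    n = len(data)
    if n < 4:
        raise ValueError("sample excess kurtosis needs at least 4 data points")
    mean = statistics.mean(data)
    stdev = statistics.stdev(data)  # sample standard deviation (divides by n - 1)
    z = [(x - mean) / stdev for x in data]  # standardized values
    skewness = n / ((n - 1) * (n - 2)) * sum(v ** 3 for v in z)
    kurtosis = (
        n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * sum(v ** 4 for v in z)
        - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    )
    return {
        "mean": mean,
        "median": statistics.median(data),
        "stdev": stdev,
        "variance": stdev ** 2,
        "skewness": skewness,
        "kurtosis": kurtosis,
    }


def looks_normal(stats, limit=0.8):
    """Apply the lesson's rule: skewness and kurtosis both within +/- limit."""
    return abs(stats["skewness"]) <= limit and abs(stats["kurtosis"]) <= limit


# Made-up, roughly bell-shaped sample for illustration only.
sample = [2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 7, 7, 8]
stats = descriptive_stats(sample)
print(stats)
print("Treat as normal:", looks_normal(stats))
```

The same numbers should appear in the Excel Descriptive Statistics table for the same data, which makes this a convenient cross-check rather than a replacement for the Excel or Minitab steps.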
Normality can be checked in either Excel or Minitab.
- Excel:
  - Select “Data Analysis” on the “Data” ribbon.
  - Select “Descriptive Statistics” and click “OK.”
  - Enter the range for your data in “Input Range.”
  - Select where you want the results – in a new worksheet or in a location in the current worksheet.
  - Select “Summary Statistics.”
  - Click on “OK.”
  - View the results and analyze for normality.
- Minitab:
  - Open the “Stat” menu.
  - Select “Basic Statistics.”
  - Select “Normality Test.”
  - Enter the name of the column with your sample data in the “Variable” window.
  - Ensure the “Anderson-Darling” box is checked.
  - View the results and check for normality.
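Excel provides a table of the statistical values, which you then use to decide whether the data is normal or non-normal. Minitab provides a plot of the data against a normal line, along with a P value for the Anderson-Darling test. A P value above the chosen significance level (commonly 0.05) means there is no evidence against normality, so the data can be treated as normal; a P value at or below that level indicates non-normal data.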
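To make the Anderson-Darling check less of a black box, the following Python sketch computes the statistic and an approximate P value by hand. It is an illustration under stated assumptions, not a reproduction of Minitab's implementation: the P value comes from a commonly used piecewise approximation for the case where the mean and standard deviation are estimated from the sample, so results may differ slightly from Minitab's output. The function name `anderson_darling_normality` and both data sets are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def anderson_darling_normality(data):
    """Return (adjusted A-squared, approximate P value) for a normality test.

    Assumes mean and standard deviation are estimated from the sample.
    The P value uses a piecewise approximation intended for n >= 8.
    Very extreme outliers can push the fitted CDF to exactly 0 or 1,
    which this sketch does not handle.
    """
    n = len(data)
    if n < 8:
        raise ValueError("this approximation assumes at least 8 data points")
    x = sorted(data)
    fitted = NormalDist(mean(x), stdev(x))
    cdf = [fitted.cdf(v) for v in x]
    total = sum(
        (2 * i - 1) * (math.log(cdf[i - 1]) + math.log(1 - cdf[n - i]))
        for i in range(1, n + 1)
    )
    a2 = -n - total / n
    a2_adj = a2 * (1 + 0.75 / n + 2.25 / n ** 2)  # small-sample adjustment
    if a2_adj < 0.2:
        p = 1 - math.exp(-13.436 + 101.14 * a2_adj - 223.73 * a2_adj ** 2)
    elif a2_adj < 0.34:
        p = 1 - math.exp(-8.318 + 42.796 * a2_adj - 59.938 * a2_adj ** 2)
    elif a2_adj < 0.6:
        p = math.exp(0.9177 - 4.279 * a2_adj - 1.38 * a2_adj ** 2)
    else:
        a = min(a2_adj, 10.0)  # cap so the approximation stays tiny for huge values
        p = math.exp(1.2937 - 5.709 * a + 0.0186 * a ** 2)
    return a2_adj, p


# Illustrative data only: evenly spaced normal quantiles and a skewed series.
bell = [NormalDist().inv_cdf((i - 0.5) / 30) for i in range(1, 31)]
skewed = [1, 1, 1, 1, 2, 2, 2, 3, 3, 4, 5, 8, 13, 21, 34, 55]

for name, values in (("bell", bell), ("skewed", skewed)):
    a2_adj, p = anderson_darling_normality(values)
    verdict = "normal" if p > 0.05 else "non-normal"
    print(f"{name}: A2*={a2_adj:.3f}, P={p:.4f} -> treat as {verdict}")
```

The 0.05 cutoff in the usage lines is the conventional significance level, chosen here as an assumption; use whatever level your team has agreed on.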
Hints & tips
- If the Data Analysis menu does not show on your Data ribbon in Excel, you need to add the Analysis ToolPak add-in. Go to the File menu, select Options, then select Add-ins. Enable the Analysis ToolPak add-in. This is a free feature that is already in Excel; you just need to enable it. You may need to close and reopen Excel for the menu to appear.
- If you don’t have Minitab, consider downloading a free trial. Minitab normally has a 30-day free trial period. All of the hypothesis tests will be demonstrated in Minitab. Approximately half of the tests will also be demonstrated in Excel, but the other half are not available in Excel. If you want to practice doing all the tests, you will need Minitab. Be sure to complete the course within those 30 days, before your trial expires.
- When using Minitab, data must always be entered into columns, never into rows. Minitab uses column names to identify data sets.
- Data can be copied and pasted back and forth between Excel and Minitab. I often collect data in Excel because that is easier for data collection, and then copy it to Minitab for analysis.