Locked lesson.
About this lesson
A common investigation in Lean Six Sigma problem solving is to determine if two factors are correlated. This insight will often point to an underlying cause of the problem. This lesson explains how to do correlation analysis using both Excel and Minitab. It also includes a discussion of the Pearson correlation coefficient.
Exercise files
Download this lesson’s related exercise files.
Correlation Exercise.xlsx10.6 KB Correlation Exercise Solution.docx
220.2 KB
Quick reference
Correlation
Correlation is the condition when two factors are related. When the value of one changes, the value of the other will also change.
When to use
Use correlation with continuous data. It is used to determine if the value of two factors is related.
Instructions
There are two types of correlation, positive and negative. Positive correlation is the condition where as one factor increases, the other factor increases. Negative correlation is the condition where as one factor increases the other factor decreases. In perfect correlation, the rate of increase or decrease is always the same. In real-world applications, the rate will often vary.
A hypothesis test for correlation is often used in the Analysis phase of a project to determine which factors are related. The Null hypothesis states that the factors are not related. The Alternative states they are related and sometimes will even indicate whether that relationship is positive or negative. Be careful assuming too much with correlation. Correlation does not mean causation. The factors may be related because they are moving the same way, but that movement may not be caused by one of those factors but rather by another factor that you have not included in your analysis.
The degree of correlation is measured with the Pearson correlation coefficient. The Pearson will vary from -1 to +1. Minus 1 is a perfect negative correlation. Plus 1 is perfect positive correlation. Zero is no correlation. A Pearson value above 0.8 or below -0.8 is always statistically significant. A Pearson value between -0.2 and 0.2 is not statistically significant. If the Pearson value is between -0.8 and -0.2 or 0.2 and 0.8 you should also check the P value to determine significance.
Correlation is often illustrated by using a scatterplot or scatter diagram of the two factors. Positive correlation will have an upward slope. Negative correlation will have a downward slope. No correlation will either be a scatter with no obvious relationship or a horizontal or vertical line, showing that one factor is not impacting the other.
Both Excel and Minitab will check correlation. There must be the same number of values in each data column or row. The columns or rows should be aligned so that it is clear which values go with which point.
- Excel:
- Data Analysis
- Correlation
- Enter the range of the independent and dependent variables.
- Minitab:
- Stat
- Basic Statistics
- Correlation
- Select the data columns that you want to check for correlation.
Hints & tips
- Use the scatter diagram or scatter plot graphing function to illustrate the correlation.
- Minitab provides a P value in addition to the Pearson value. Excel does not.
- When working with Minitab, data must be in columns, not rows.
Lesson notes are only available for subscribers.
PMI, PMP, CAPM and PMBOK are registered marks of the Project Management Institute, Inc.