About this lesson
Simple linear regression analysis creates an equation that correlates two factors. This equation assists in understanding problems, and it can also be used to manage the problem or process going forward. This lesson shows how to calculate this line with the help of either Excel or Minitab.
Quick reference
Simple Linear Regression
Simple linear regression is the creation of a formula that shows the straight-line relationship between two correlated variables. One variable is the independent variable and the other is the dependent variable.
When to use
Once correlation is established between two factors, and those factors are continuous variables, a simple linear regression line formula can be created. This formula can be used to determine the effect of the independent variable on the dependent variable during the Analyse phase, and to predict process performance when designing a solution during the Improve phase.
Instructions
Once correlation has been established between two continuous variables, then a simple linear regression line can be determined. This line is in the format of:
y = a + bx + ε
In this equation, “x” is the independent variable and “y” is the dependent variable. “b” is the slope of the line: the amount by which “y” changes for each one-unit change in “x”. “a” is the y-intercept, the value of “y” when “x” is zero, and it is needed to position the line correctly so that it can be used for prediction. “ε” is the error term, or residual: the difference between an actual data point and the value the line predicts for it. For a “best fit” line, the residuals sum to zero.
In real life, the actual data points are seldom exactly on the line, but when there is high correlation, the data points will be close to the line.
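As a minimal sketch of where “a” and “b” come from, the standard least-squares formulas can be written out in a few lines of Python. The lesson itself relies on Excel or Minitab for this calculation; the function name `fit_line` and the study-time data below are hypothetical and used only for illustration.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # y-intercept
    return a, b


# Hypothetical illustration data: hours studied vs. test score (%)
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 61.0, 68.0, 79.0, 85.0]

a, b = fit_line(hours, scores)

# Residual = actual y minus the y predicted by the line
residuals = [y - (a + b * x) for x, y in zip(hours, scores)]

print(f"a = {a:.2f}, b = {b:.2f}")
print(f"sum of residuals = {sum(residuals):.6f}")
```

The individual residuals are not zero, because the points do not sit exactly on the line, but they cancel out when added together.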
While this line is valuable for both investigating root causes and predicting performance when designing a solution, there are some limitations. Mathematically, the line extends to infinity in both directions. In reality, there are almost always limits. For instance, an analysis found that the amount of time a student spent studying was correlated with the student’s score on the test, so a simple linear regression line was created. However, a student cannot study for negative time, so the lower limit on the independent variable was zero. Furthermore, a student could not get better than a perfect score on the test, so the upper limit on the dependent variable was 100%.
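One possible way to respect these limits when using the line for prediction is sketched below. The function `predict_score`, the coefficient values, and the choice to cap the prediction rather than reject it are all assumptions made for illustration, not part of the lesson.

```python
def predict_score(hours, a, b, max_score=100.0):
    """Predict a test score from study hours, respecting real-world limits."""
    if hours < 0:
        # Lower limit on the independent variable: no negative study time
        raise ValueError("study time cannot be negative")
    # Upper limit on the dependent variable: no score above a perfect score
    return min(a + b * hours, max_score)


# Hypothetical coefficients for the line y = a + b*x
A, B = 43.8, 8.4

print(predict_score(3, A, B))   # inside the range where the line is meaningful
print(predict_score(10, A, B))  # the raw line exceeds 100, so the cap applies
```

A stricter alternative would be to refuse any prediction outside the range of x values that were actually observed, since the data say nothing about how the relationship behaves beyond them.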
Both Excel and Minitab will determine the coefficients needed to create a simple linear regression line.
- Excel:
  - Data Analysis
  - Regression
  - Enter the range of the data, similar to what was done to check correlation
- Minitab:
  - Stat
  - Regression
  - Fitted Line Plot
  - Enter the column for the Y variable and the X variable
  - Ensure “Linear” is selected
Hints & tips
- Always check correlation first. Excel and Minitab can calculate a line even when there is no correlation, but that line is meaningless. Without correlation, the next y value is not truly related to the x value, so the regression line cannot predict it.
- The Excel regression function provides more information than just the linear regression line coefficients. We will discuss the other information with the appropriate hypothesis test.
- This solution is a linear regression line. The best fit may instead be a curve (a non-linear regression), which will be discussed in another lesson. Check the P value and the graph of your data to determine if a linear regression is the best fit.
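The first and last tips can be illustrated together with a small sketch. It computes the Pearson correlation coefficient before fitting, then looks at the residuals as a numeric stand-in for inspecting the graph. The helper names and the deliberately curved data (y = x²) are hypothetical. The sketch does not compute the P value, which is left to Excel or Minitab.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


# Hypothetical data that follows a curve (y = x squared), not a straight line
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 4.0, 9.0, 16.0, 25.0]

r = pearson_r(x, y)      # step 1: check correlation before fitting
a, b = fit_line(x, y)    # step 2: fit the straight line
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

print(f"r = {r:.3f}")
print("residuals:", [round(e, 3) for e in residuals])
```

Here the correlation is strong, yet the residuals are positive at both ends and negative in the middle. A systematic pattern like this, rather than random scatter around zero, suggests that a curve would fit the data better than a straight line.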