About this lesson
Many of the relationships encountered in Lean Six Sigma projects do not follow the straight-line pattern that we discussed with simple linear regression and multiple regression analysis. They are better modeled by an exponential curve, a parabola, or another non-linear relationship. This lesson uses Minitab to help determine the best model for predicting performance.
Exercise files
Download this lesson’s related exercise files.
- Non-linear Regression Exercise.xlsx (10.7 KB)
- Non-linear Regression Exercise Solution.docx (88.4 KB)
Quick reference
Non-linear Regression
Non-linear regression analysis is the creation of a regression equation with higher-order terms or multi-variate terms. These are often better representations of real-world effects than linear regression models.
When to use
Start your regression analysis with either a simple linear regression or a multiple linear regression, depending on the number of independent variables. If the residual analysis is unacceptable (for example, the residuals follow a curved pattern instead of scattering randomly around zero), switch to a non-linear analysis and iterate until the residuals are acceptable.
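As a rough illustration of this workflow outside Minitab, the sketch below fits a straight line first, inspects the residuals, and then adds a squared term. The data set is invented for the example, and the helper names (`fit_polynomial`, `predict`, `residuals`, `mse`) are hypothetical, not part of the lesson or of any library.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit; returns [b0, b1, ..., b_degree]."""
    size = degree + 1
    # Normal equations (X^T X) b = X^T y, built from power sums of x.
    a = [[sum(x ** (r + c) for x in xs) for c in range(size)] for r in range(size)]
    b = [sum(y * x ** r for x, y in zip(xs, ys)) for r in range(size)]
    # Gaussian elimination with partial pivoting.
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, size):
            factor = a[row][col] / a[col][col]
            for k in range(col, size):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * size
    for row in reversed(range(size)):
        known = sum(a[row][k] * coeffs[k] for k in range(row + 1, size))
        coeffs[row] = (b[row] - known) / a[row][row]
    return coeffs


def predict(coeffs, x):
    return sum(c * x ** power for power, c in enumerate(coeffs))


def residuals(coeffs, xs, ys):
    return [y - predict(coeffs, x) for x, y in zip(xs, ys)]


def mse(res):
    return sum(r * r for r in res) / len(res)


# Invented data with a curved relationship: y = 2 + 0.5 * x^2.
xs = list(range(10))
ys = [2 + 0.5 * x ** 2 for x in xs]

# Step 1: start with a straight line and look at the residuals.
linear_res = residuals(fit_polynomial(xs, ys, 1), xs, ys)
print("linear residuals:", [round(r, 2) for r in linear_res])

# Step 2: the residuals are positive at both ends and negative in the
# middle (a curved pattern), so add a squared term and check again.
quad_res = residuals(fit_polynomial(xs, ys, 2), xs, ys)
print("linear MSE:", round(mse(linear_res), 3))
print("quadratic MSE:", round(mse(quad_res), 6))
```

The U-shaped residuals from the straight-line fit are the signal to move on; once the squared term is included, the residuals collapse to essentially zero for this artificial data set.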
Instructions
Linear regression assumes that a change in the independent variable produces a change in the response variable at a constant rate. This can be represented by a straight-line relationship. However, in many cases the real-world rate of change is not constant; it varies. This is represented by a curved line when the relationship between the variables is plotted. Lean Six Sigma teams encounter many real-world effects with varying rates of change. Examples include system degradation due to wear and tear, which often accelerates near the end of life; pressure and temperature effects on materials, which vary especially when the material is close to changing state; and electronics, which often saturate at the low or high end of their performance range, causing the performance line to flatten.
When non-linear effects are present, they can usually be modeled with a higher-order term (squared or cubic), an exponential term, a logarithmic term, or a mixed-variable term, meaning a term that tracks the interaction effect of two otherwise independent variables.
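For orientation, the term types listed above correspond to model forms like the following. The symbols are assumptions made for this illustration: Y is the response, X (or X_1, X_2) the independent variables, and the β values the coefficients to be estimated.

```latex
\begin{aligned}
\text{Squared term:}     \quad & Y = \beta_0 + \beta_1 X + \beta_2 X^2 \\
\text{Cubic term:}       \quad & Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \beta_3 X^3 \\
\text{Exponential term:} \quad & Y = \beta_0 \, e^{\beta_1 X} \\
\text{Logarithmic term:} \quad & Y = \beta_0 + \beta_1 \ln X \\
\text{Interaction term:} \quad & Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2
\end{aligned}
```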
Excel will not create a non-linear regression, but Minitab can do it in several ways.
- If the nature of the non-linear relationship is already known, you can select “Non-Linear” in the Regression menu and enter the relationship directly.
- If there is only one variable, you can select the “Fitted Line Plot” option in the Regression menu and then select the level of the higher-order term you want to include.
- If there are multiple terms, you can select “Fit Regression Model” in the Regression submenu. In this case, you can use the Model button to select the interaction terms and higher-order terms to include. Or you can select the Options button to enable a Box-Cox transformation, which searches for the power transformation of the response variable that provides the best fit.
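To give a feel for what a Box-Cox search does, here is a minimal sketch in plain Python. It is not Minitab's implementation: it assumes a single predictor, a small grid of candidate powers, and invented data, and all function names are hypothetical. For each candidate power λ it transforms the response, fits a straight line, and scores the fit with the Box-Cox profile log-likelihood.

```python
import math


def simple_linear_sse(xs, zs):
    """Sum of squared residuals from a straight-line fit of zs on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_z = sum(zs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxz = sum((x - mean_x) * (z - mean_z) for x, z in zip(xs, zs))
    slope = sxz / sxx
    intercept = mean_z - slope * mean_x
    return sum((z - (intercept + slope * x)) ** 2 for x, z in zip(xs, zs))


def box_cox(y, lam):
    """Box-Cox power transformation; requires y > 0."""
    return math.log(y) if lam == 0 else (y ** lam - 1) / lam


def box_cox_log_likelihood(xs, ys, lam):
    """Profile log-likelihood (up to a constant) for a given lambda."""
    n = len(ys)
    zs = [box_cox(y, lam) for y in ys]
    sse = simple_linear_sse(xs, zs)
    return -0.5 * n * math.log(sse / n) + (lam - 1) * sum(math.log(y) for y in ys)


# Invented data: the square root of y is roughly linear in x.
xs = list(range(1, 11))
ys = [((1 + x) ** 2) * (1.01 if x % 2 else 0.99) for x in xs]

# A grid of commonly used "rounded" powers (an assumption for this sketch).
candidate_lambdas = [-2, -1, -0.5, 0, 0.5, 1, 2]
best_lambda = max(
    candidate_lambdas, key=lambda lam: box_cox_log_likelihood(xs, ys, lam)
)
print("best lambda:", best_lambda)
```

A value of 0.5 corresponds to a square-root transformation of the response, 0 to a natural log, and 1 to no transformation. The response values must be positive for the transformation to be defined.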
Regardless of the method selected, check the residuals to be certain the solution is appropriate. When doing non-linear regression, the P value and R-squared value are not appropriate measures of goodness of fit. The appropriate measure is the Mean Squared Error (MSE). The formula for MSE is:
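$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(Y_i - \hat{Y}_i\right)^2$$
where $Y_i$ is the observed value, $\hat{Y}_i$ is the value predicted by the regression equation, and $n$ is the number of data points.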
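The calculation itself is short. The sketch below applies the MSE formula directly; the function name and the observed and predicted values are invented for illustration.

```python
def mean_squared_error(observed, predicted):
    """Average of the squared differences between observed and predicted values."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equal in length")
    squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(observed, predicted)]
    return sum(squared_errors) / len(observed)


# Invented example values.
observed = [3.0, 5.0, 7.5, 11.0]
predicted = [2.5, 5.5, 7.0, 12.0]

example_mse = mean_squared_error(observed, predicted)
print(example_mse)  # 0.4375
```

When comparing candidate models fitted to the same data, the one with the lower MSE tracks the observations more closely, provided its residuals also look acceptable.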
Hints & tips
- Remember that correlation is not causation. You may have missed a term in your analysis, so be sure to consider all possible terms.
- However, adding terms adds uncertainty to the analysis. You need at least ten data points for every term in the equation, and that includes higher-order terms and mixed interaction terms.
- You may need to iterate through several combinations to find a regression that has acceptable residuals.
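The ten-points-per-term guideline above is easy to check before running an analysis. This small sketch counts the terms in a proposed model and reports the minimum sample size; the term list and function names are hypothetical, and it assumes the constant (intercept) is not counted as a term.

```python
def minimum_points(term_names, points_per_term=10):
    """Minimum data points suggested by the ten-points-per-term guideline."""
    return points_per_term * len(term_names)


def has_enough_data(n_observations, term_names, points_per_term=10):
    return n_observations >= minimum_points(term_names, points_per_term)


# Two variables, both squared terms, and their interaction: five terms.
terms = ["x1", "x2", "x1^2", "x2^2", "x1*x2"]

required = minimum_points(terms)
print(required)                    # 50
print(has_enough_data(35, terms))  # False
```

With only 35 observations, this five-term model would be under-supported by the guideline, so either more data or fewer terms would be needed.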