About this lesson
Data transformation can convert a Lean Six Sigma problem from a non-linear regression analysis into a linear regression, which is often easier to understand and explain to stakeholders. The most common transformation approach is the Box-Cox transformation. In this lesson, we demonstrate how this transformation works and discuss when to use it.
Quick reference
Box-Cox and Transformations
Transformations modify non-linear terms in a prescribed manner so that they can be treated as linear terms in a regression analysis. The most common transformation is the Box-Cox transformation.
When to use
Transformations are used when working with higher-order terms, especially when working with multiple regression analyses. By applying a transformation, a linear analysis can be done instead of a non-linear analysis.
Instructions
Many real-world effects that Lean Six Sigma teams encounter are characterized by non-linear behavior. It is more difficult to create a non-linear regression analysis than a linear one when doing the analysis by hand. Fortunately, there is statistical software that can assist with non-linear models. However, if your tool of choice is Excel, its built-in regression tool handles only linear models. In that case, a transformation of one or more factors can change the problem into a linear relationship, which can then be solved.
The most common transformation used in Lean Six Sigma regression analysis is the Box-Cox transformation. Box-Cox organizes the transformation approach into a logical sequence that can be tried by hand to see if there is a suitable linear relationship. Box-Cox uses a numeric value, which need not be a whole number, to designate the exponent to be used in the transformation. A “2” designates that the factor should be squared and a “3” indicates it should be cubed. A “0.5” is a square root, and a Box-Cox value of “0” is the natural log of the factor. Negative Box-Cox values have the same effects, only with the factor in the denominator of a fraction. Therefore, a “-2” is 1/x². Box-Cox values can go as high as 5, but in practice they seldom exceed 2.
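The mapping from a Box-Cox value to an operation can be sketched in a few lines of Python. This is a minimal illustration of the simple power form described in this lesson, not the scaled form (x^λ − 1)/λ that some textbooks and software use. The function name `power_transform` is a hypothetical helper chosen for this example.

```python
import math

def power_transform(x, lam):
    """Apply the Box-Cox value `lam` to a positive number `x` (simple power form)."""
    if x <= 0:
        # Logs and fractional or negative powers are only handled here for positive data.
        raise ValueError("x must be positive")
    if lam == 0:
        return math.log(x)  # a value of 0 means the natural log
    return x ** lam         # a negative value puts x in the denominator

# Illustrative input only: apply several Box-Cox values to x = 4
for lam in (2, 0.5, 0, -2):
    print(lam, power_transform(4.0, lam))
```

With x = 4, the values 2, 0.5, and -2 give 16, 2, and 1/16 respectively, and 0 gives ln(4).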
To use Box-Cox in Minitab, load your data in the normal manner. Then select
Stat → Regression → Regression → Fit Regression Model.
Set up the regression analysis with the response factor and the control factors. Then select the “Option” button. On the Option panel, there are radio buttons to select various Box-Cox values. This transformation will be applied to your response factor. To apply a transformation to the control factors, select Non-linear analysis from the Regression menu and enter the selected exponent or function for each control factor. If you use Box-Cox to transform your response factor (Y), the model predicts the transformed value, so you need to apply the inverse transformation to get back to the “real-world” values.
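Transforming back can be illustrated outside Minitab with a small Python sketch. It assumes a Box-Cox value of 0 (natural log) on the response, fits a straight line to the transformed response by ordinary least squares, and then exponentiates the prediction to return to the original units. The data are made up for illustration, and `fit_line` and `inverse_power_transform` are hypothetical helper names.

```python
import math

def fit_line(xs, ts):
    """Ordinary least-squares fit of t = b0 + b1 * x; returns (b0, b1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_t = sum(ts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (t - mean_t) for x, t in zip(xs, ts))
    b1 = sxy / sxx
    return mean_t - b1 * mean_x, b1

def inverse_power_transform(t, lam):
    """Undo the simple power form: exp for lam = 0, otherwise the 1/lam power."""
    if lam == 0:
        return math.exp(t)
    return t ** (1.0 / lam)  # assumes t is positive

# Made-up data that follow y = 2 * exp(0.5 * x), so ln(y) is linear in x
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]

lam = 0                                # Box-Cox value applied to the response
ts = [math.log(y) for y in ys]         # transformed response
b0, b1 = fit_line(xs, ts)              # linear fit on the transformed scale

pred_t = b0 + b1 * 6.0                 # prediction in transformed units
pred_y = inverse_power_transform(pred_t, lam)  # back to "real-world" units
print(pred_t, pred_y)
```

The fitted line lives on the log scale, so `pred_t` is not directly meaningful to stakeholders. Only `pred_y` is in the units of the original response.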
Multiple non-linear regression analyses often will include transformations to simplify them, resulting in either a non-linear analysis with one factor or a multiple linear analysis. If you have process knowledge that allows for a transformation to one of those methods, it will likely give you a better fit for the result. Whenever working with multiple non-linear analyses, the standard error is the preferred measure for determining whether the fit is adequate.
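$$SE = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n - k - 1}}$$
where “n” is the sample size, “k” is the number of independent variables, y_i is the observed response, and ŷ_i is the value predicted by the model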
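As a rough sketch, the standard error can be computed in Python as follows. It assumes the usual form of the standard error of the regression: the square root of the sum of squared residuals divided by n − k − 1. The function name and the data are illustrative only.

```python
import math

def standard_error(actual, predicted, k):
    """Standard error of the regression for n observations and k independent variables."""
    n = len(actual)
    dof = n - k - 1  # degrees of freedom left after fitting k slopes and an intercept
    if dof <= 0:
        raise ValueError("need more observations than fitted parameters")
    sse = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted))
    return math.sqrt(sse / dof)

# Made-up observed and fitted values for a model with one independent variable
actual = [3.0, 5.0, 7.0, 10.0]
predicted = [2.0, 5.0, 8.0, 10.0]
se = standard_error(actual, predicted, k=1)
print(se)
```

Here the squared residuals sum to 2 and n − k − 1 = 2, so the standard error is 1.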
Hints & tips
- Box-Cox is a framework for cataloging the transformation operators to be used. It is not a stand-alone mathematical operation.
- It may take several tries to find the best transformation.
- When reviewing the residual plots, if the “Versus Fit” plot has a definite shape to it, then try different Box-Cox transformations until the pattern becomes random.
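The trial-and-error search mentioned in these tips can be sketched as a loop over candidate Box-Cox values. As a numeric stand-in for judging the residual plot by eye, this example compares the standard error in the original units after transforming back, which is an assumption of this sketch and not the lesson's procedure. The data are made up so that the square root of the response is exactly linear, and all helper names are hypothetical.

```python
import math

def fit_line(xs, ts):
    """Ordinary least-squares fit of t = b0 + b1 * x; returns (b0, b1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_t = sum(ts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (t - mean_t) for x, t in zip(xs, ts))
    b1 = sxy / sxx
    return mean_t - b1 * mean_x, b1

def forward(y, lam):
    """Simple power form: natural log for lam = 0, otherwise y ** lam."""
    return math.log(y) if lam == 0 else y ** lam

def inverse(t, lam):
    """Undo `forward`; assumes t stays in the valid range for the chosen lam."""
    return math.exp(t) if lam == 0 else t ** (1.0 / lam)

def se_original_units(xs, ys, lam, k=1):
    """Fit on the transformed response, back-transform, and return the standard error."""
    ts = [forward(y, lam) for y in ys]
    b0, b1 = fit_line(xs, ts)
    fitted = [inverse(b0 + b1 * x, lam) for x in xs]
    sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    return math.sqrt(sse / (len(ys) - k - 1))

# Made-up data following y = (1 + 2x)^2, so sqrt(y) is linear in x
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [(1.0 + 2.0 * x) ** 2 for x in xs]

# Try a few Box-Cox values and keep the one with the smallest standard error
results = {lam: se_original_units(xs, ys, lam) for lam in (1, 0.5, 0)}
best = min(results, key=results.get)
print(results, best)
```

For this data set the search settles on 0.5, the square root. With real data the differences are usually less clear-cut, so the residual plots should still be checked.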