About this lesson
Quick reference
Full Factorial DOE Study Design
The design of a full factorial DOE is based upon the number of runs that must be done. The number of runs is based upon the control factors and any special considerations for improving the statistical analysis and accommodating real-world constraints.
When to use
Once the decision has been made to conduct a full factorial DOE study, creating the study design is one of the first two steps; it is done in parallel with selecting the factors. Once the study design is complete, sample preparation, procedure creation, test execution, and analysis can proceed.
Instructions
The critical element of the full factorial DOE study design is identifying all the run configurations that must be done. Typically the study is done with two-level factors, although we will discuss higher numbers of levels in a later lesson. The two levels are designated plus one (+1) and minus one (-1). Normally the plus one is the high level and the minus one is the low level. The levels are chosen to represent real-world variation. For instance, the levels might be the upper and lower tolerance limits for a parameter that is a control factor, or they might be the upper and lower adjustment levels on a piece of processing equipment. If the control factor is a qualitative factor, the two levels indicate whether the factor is present or not.
Run Configuration
In a full factorial DOE, a run must be done for every combination of high-level and low-level factor settings, so a study with k two-level factors needs 2^k runs (16 runs for four factors). The easy way to define these configurations, and to be certain that you don’t accidentally test one configuration twice and another not at all, is to create a Yates matrix. This matrix defines the run configurations. Establish a column in the matrix for each control factor; the rows represent the runs, and the entries in each row show that run's factor configuration. In the first column, alternate between +1 and -1 on every row. In the second column, alternate every two rows between +1 and -1. In the third column, alternate every four rows, and continue the pattern, doubling the interval each time, for as many factors as you have.
Run | F#1 | F#2 | F#3 | F#4
1 | +1 | +1 | +1 | +1
2 | -1 | +1 | +1 | +1
3 | +1 | -1 | +1 | +1
4 | -1 | -1 | +1 | +1
5 | +1 | +1 | -1 | +1
6 | -1 | +1 | -1 | +1
7 | +1 | -1 | -1 | +1
8 | -1 | -1 | -1 | +1
9 | +1 | +1 | +1 | -1
10 | -1 | +1 | +1 | -1
11 | +1 | -1 | +1 | -1
12 | -1 | -1 | +1 | -1
13 | +1 | +1 | -1 | -1
14 | -1 | +1 | -1 | -1
15 | +1 | -1 | -1 | -1
16 | -1 | -1 | -1 | -1
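The alternating pattern in the table above can also be generated rather than typed by hand. The following is a minimal sketch, not part of the lesson materials; the function name `yates_matrix` is a hypothetical choice, and it follows this lesson's convention of starting each column at +1.

```python
def yates_matrix(num_factors):
    """Return one row per run; each row lists the +1/-1 level of every factor."""
    rows = []
    for run in range(2 ** num_factors):
        # Factor j (0-based) keeps the same level for 2**j consecutive rows,
        # starting at +1, which matches the table in this lesson.
        row = [+1 if (run // 2 ** j) % 2 == 0 else -1 for j in range(num_factors)]
        rows.append(row)
    return rows


design = yates_matrix(4)
for run_number, levels in enumerate(design, start=1):
    print(run_number, levels)
```

Because every row comes from a different run index, no configuration can appear twice and none can be skipped, which is the point of using the matrix.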
Replicates
A technique to improve the statistical accuracy of the solution is to add replicates. This means that a second run is done for every configuration. A replicate is not a repeat of the first run; it is a separate sample or setup that uses the same control factor configuration. This increases the number of data points and reduces the effect of noise or an outlier. Adding one replicate of every configuration doubles the number of runs.
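The effect of replicates on the size of the study can be sketched as a simple run list. This is an illustrative sketch only; the function name `replicated_runs` and the (configuration, replicate) pairing are assumptions made for the example, not something the lesson prescribes.

```python
def replicated_runs(num_factors, replicates=2):
    """List (configuration number, replicate number) pairs for the whole study."""
    num_configs = 2 ** num_factors
    # Each replicate is a separate sample or setup of the same configuration,
    # so every configuration number appears once per replicate.
    return [(config, rep)
            for rep in range(1, replicates + 1)
            for config in range(1, num_configs + 1)]


runs = replicated_runs(num_factors=4, replicates=2)
print(len(runs))  # 32: the 16 configurations, each run twice
```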
Center Points
Center points are another technique to improve statistical accuracy. A center point is a run in which all the quantitative control factors are set at the midpoint between their high and low levels. One caution: there is no middle level for a qualitative control factor, so if the study includes that type of factor, the number of center points must be doubled. One run will have the qualitative factor at the high level and the quantitative factors at their center points, and the second run will have the qualitative factor at the low level and the quantitative factors at their center points. The center points are used to determine the “Curvature” of the model. The curvature is calculated by subtracting the average of all the center points from the average of all the DOE trial runs. It is then used to determine the statistical significance of the analysis, which is measured with a “P” value. These center point values do not provide enough data to create a quadratic formula; for that, a matrix with three-level factors would need to be created. Center points are normally spread evenly throughout the runs and are not randomized.
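The curvature calculation described above is a difference of two averages. The sketch below uses made-up response values purely for illustration, and the function name `curvature` is hypothetical. It covers only the subtraction; the significance test that yields the “P” value would normally come from statistical software and is not shown.

```python
from statistics import mean


def curvature(factorial_responses, center_responses):
    """Average of the DOE trial runs minus the average of the center points."""
    return mean(factorial_responses) - mean(center_responses)


# Hypothetical measurements, not data from any real study.
factorial_responses = [12.0, 14.0, 11.0, 15.0]
center_responses = [14.0, 15.0]

print(curvature(factorial_responses, center_responses))  # -1.5
```

A value near zero suggests the response is roughly linear between the low and high levels; a large value in either direction suggests it is not.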
Blocking
Blocking is a technique for accommodating “real world” constraints. Sometimes there is an element or factor that determines how the test runs are carried out but is not intended to be a control factor in the study. An example would be testing that can only be done on weekends and takes multiple weekends to complete all the runs. In a case like this, each weekend is set up as a block. The statistical analysis will take the blocking into consideration when analysing for effects and attempt to cancel out any noise due to the blocking.
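The lesson does not say how runs should be divided among blocks. One common textbook approach, shown here only as an assumption and not as this lesson's method, is to split a two-level design into two blocks by the sign of the product of all factor levels, so that each main effect is balanced within each block. A minimal sketch for the four-factor, two-weekend case:

```python
from math import prod

NUM_FACTORS = 4

# Rebuild the 16-run matrix in the same order as the table in this lesson.
design = [[+1 if (run // 2 ** j) % 2 == 0 else -1 for j in range(NUM_FACTORS)]
          for run in range(2 ** NUM_FACTORS)]

# Assumption: two blocks (for example, two weekends), assigned by the sign of
# the product of all factor levels. This split rule is not from the lesson.
blocks = {1: [], 2: []}
for run_number, levels in enumerate(design, start=1):
    block = 1 if prod(levels) > 0 else 2
    blocks[block].append(run_number)

print(blocks[1])  # run numbers for the first weekend
print(blocks[2])  # run numbers for the second weekend
```

The price of this split is that the four-factor interaction can no longer be separated from the weekend-to-weekend difference, which is one reason to use blocking only when a constraint requires it.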
Hints & tips
- When determining what levels to use, make sure the levels are easy to set during the runs.
- Number your runs using the Yates method. When we get to test execution we will randomize the runs, but the Yates number will be the run identifier and can easily be used to check that the test is properly configured.
- Even though each test sample or run has a configuration number, the test operator should not know that number so that they are less likely to bias the outcome. That means the identifier must be hidden or detached when the actual experimental run occurs.
- Replicates add a lot more testing, but give much better analysis.
- Center points are useful if the testing will occur over a long time period and require many setups.
- Use blocking only when necessary, because it tends to decrease the quality of the statistical analysis. If it is not needed, do not use it.
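The tip about keeping the Yates number as the run identifier while randomizing the execution order can be sketched as follows. This is an illustrative sketch; the function name `randomized_order` and the optional `seed` parameter are assumptions added so that the order can be reproduced and audited later.

```python
import random


def randomized_order(num_runs, seed=None):
    """Return the Yates run numbers 1..num_runs in a shuffled execution order."""
    order = list(range(1, num_runs + 1))
    # A dedicated Random instance keeps the shuffle reproducible for a given seed
    # without disturbing the global random state.
    random.Random(seed).shuffle(order)
    return order


order = randomized_order(16, seed=42)
for position, yates_number in enumerate(order, start=1):
    print(f"execute in position {position}: Yates run {yates_number}")
```

The printed list is for the person planning the study; as noted above, the operator performing each run should not see the Yates number.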