Quick reference
Full Factorial DOE Study Design
The design of a full factorial DOE is based upon the number of runs that must be done. The number of runs is based upon the control factors and any special considerations for improving the statistical analysis and accommodating real-world constraints.
When to use
Once you have decided to conduct a full factorial DOE study, creating the study design is one of the first two steps (it is done in parallel with selecting factors). Once the study design is complete, you can prepare the samples, create the procedures, conduct the tests, and perform the analysis.
Instructions
The critical element of the full factorial DOE study design is the identification of all the run configurations that must be done. Typically the study is done with two-level factors, although we will discuss higher numbers of levels in a later lesson. The two levels are designated plus one (+1) and minus one (-1). Normally the plus one is the high level and the minus one is the low level. The levels are chosen to represent real-world variation. For instance, the levels might be the upper and lower tolerance limits for a parameter that is a control factor, or they might be the upper and lower adjustment levels on a piece of processing equipment. If the control factor is a qualitative factor, the two levels indicate whether the factor is present or not.
Run Configuration
In a full factorial DOE, a run must be done for every combination of high-level and low-level factor settings. The easy way to define these configurations, and to be certain that you don't accidentally test one configuration twice and another not at all, is to create a Yates matrix. This matrix defines the run configurations. Establish a column in the matrix for each control factor. The rows represent the runs, and the columns show the level of each factor in that run. In the first column, alternate between +1 and -1 on every row. In the second column, alternate every two rows between +1 and -1. In the third column, alternate every four rows, and continue the pattern for as many factors as you have. With four two-level factors this gives 16 runs:
| Run | F#1 | F#2 | F#3 | F#4 |
|-----|-----|-----|-----|-----|
| 1 | +1 | +1 | +1 | +1 |
| 2 | -1 | +1 | +1 | +1 |
| 3 | +1 | -1 | +1 | +1 |
| 4 | -1 | -1 | +1 | +1 |
| 5 | +1 | +1 | -1 | +1 |
| 6 | -1 | +1 | -1 | +1 |
| 7 | +1 | -1 | -1 | +1 |
| 8 | -1 | -1 | -1 | +1 |
| 9 | +1 | +1 | +1 | -1 |
| 10 | -1 | +1 | +1 | -1 |
| 11 | +1 | -1 | +1 | -1 |
| 12 | -1 | -1 | +1 | -1 |
| 13 | +1 | +1 | -1 | -1 |
| 14 | -1 | +1 | -1 | -1 |
| 15 | +1 | -1 | -1 | -1 |
| 16 | -1 | -1 | -1 | -1 |
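The alternation pattern described above can be generated programmatically. The following is a minimal Python sketch, not part of the lesson; the function name `yates_matrix` is an assumption chosen for illustration. It starts every column at +1 so that run 1 has all factors at the high level, matching the matrix shown here.

```python
def yates_matrix(n_factors):
    """Build a two-level full factorial run matrix in the order used above.

    Column k (0-based) switches level every 2**k rows, starting at +1.
    Returns a list of rows; each row is a list of +1/-1 levels.
    """
    runs = []
    for run in range(2 ** n_factors):
        # Bit k of the run index decides the level of factor k:
        # bit 0 -> +1 (high), bit 1 -> -1 (low).
        row = [+1 if (run >> k) % 2 == 0 else -1 for k in range(n_factors)]
        runs.append(row)
    return runs


if __name__ == "__main__":
    # Print the 16-run matrix for four factors, numbered from 1.
    for number, levels in enumerate(yates_matrix(4), start=1):
        print(number, levels)
```

Because every row comes from a distinct run index, no configuration can be listed twice or left out, which is the purpose of the matrix.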
Replicates
A technique to improve the statistical accuracy of the solution is to add replicates. This means that a second run is done for every configuration. This is not a repeat of the first run on the same sample. It is a separate sample or setup that uses the same control factor configuration. Replication increases the number of data points and reduces the effect of noise or an outlier. It also doubles the number of runs.
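To make the effect on the run count concrete, here is a small Python sketch under stated assumptions: the helper `with_replicates` and the three-factor example design are hypothetical, and the row order produced by `itertools.product` is used only for counting, not as the Yates order.

```python
from itertools import product


def with_replicates(design, n_replicates=2):
    """Return one (run_id, replicate, levels) entry per physical test run.

    Each replicate is a separate sample or setup that shares the
    control factor configuration identified by run_id.
    """
    return [
        (run_id, rep, levels)
        for rep in range(1, n_replicates + 1)
        for run_id, levels in enumerate(design, start=1)
    ]


# Hypothetical three-factor, two-level design: 2**3 = 8 configurations.
design = list(product((+1, -1), repeat=3))
runs = with_replicates(design)
print(len(design), "configurations ->", len(runs), "runs")
```

With one replicate added per configuration, the eight configurations become sixteen test runs.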
Center Points
Center points are another technique to improve statistical accuracy. A center point is a run where all the quantitative control factors are set at the midpoint between their high and low levels. One caution applies: there is no middle level for a qualitative control factor, so if the study includes that type of factor, the number of center points must be doubled. One run will have the qualitative factor at the high level and the quantitative factors at their center points. The second run will have the qualitative factor at the low level and the quantitative factors at their center points.

The center points are used to determine the "curvature" of the model. The curvature is calculated by subtracting the average of all the center points from the average of all the DOE trial runs. It is then used in assessing the statistical significance of the analysis, which is measured with a p-value. Center points do not provide enough data to create a quadratic formula. For that, a matrix with three-level factors would need to be created. Center points are normally spread evenly throughout the runs and are not randomized.
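The curvature calculation described here is simple arithmetic and can be sketched in Python. The function name `curvature` and the response values are made up for illustration, and the sketch stops at the difference of averages; it does not compute the p-value.

```python
from statistics import mean


def curvature(factorial_responses, center_responses):
    """Average of the factorial (DOE trial) runs minus average of the center points."""
    return mean(factorial_responses) - mean(center_responses)


# Hypothetical measured responses, for illustration only.
factorial_responses = [10.0, 12.0, 14.0, 16.0]  # average 13.0
center_responses = [13.5, 14.5]                 # average 14.0

print(curvature(factorial_responses, center_responses))  # -1.0
```

A value near zero means the center-point runs agree with what a straight-line model between the high and low levels would predict.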
Blocking
Blocking is a technique for accommodating "real world" constraints. Sometimes when doing the test runs, there is an element or factor that determines how the tests are accomplished but is not intended to be a control factor in the study. An example would be testing that can only be done on weekends when it takes multiple weekends to complete all the runs. In a case like this, each weekend is set up as a block. The statistical analysis will take the blocking into consideration when analysing for effects and attempt to cancel out any noise due to the blocking.
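As a rough illustration of the weekend example, the Python sketch below splits 16 run identifiers into three blocks. This is an assumption-laden sketch: the lesson does not say how runs are assigned to blocks, so the random assignment, the function name `assign_blocks`, and the fixed seed are all hypothetical, and DOE software may assign blocks differently.

```python
import random


def assign_blocks(run_ids, n_blocks, seed=0):
    """Randomly split run identifiers into n_blocks groups of near-equal size.

    Returns a dict mapping block number (1-based) to a sorted list of run IDs.
    """
    rng = random.Random(seed)  # fixed seed so the plan is reproducible
    shuffled = list(run_ids)
    rng.shuffle(shuffled)
    # Deal the shuffled runs round-robin so block sizes differ by at most one.
    return {b + 1: sorted(shuffled[b::n_blocks]) for b in range(n_blocks)}


# Hypothetical case: 16 runs spread over three weekends.
blocks = assign_blocks(range(1, 17), n_blocks=3)
for weekend, run_ids in blocks.items():
    print("Weekend", weekend, "runs:", run_ids)
```

Recording the block number alongside each result lets the analysis account for weekend-to-weekend differences.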
Hints & tips
- When determining what levels to use, make sure the levels are easy to set during the runs.
- Number your runs using the Yates method. When we get to test execution we will randomize the runs, but the Yates number will be the run identifier and can easily be used to check that the test is properly configured.
- Even though each test sample or run has a configuration number, the test operator should not know that number so that they are less likely to bias the outcome. That means the identifier must be hidden or detached when the actual experimental run occurs.
- Replicates add a lot more testing, but give much better analysis.
- Center points are useful if the testing will be occurring over a long time period and requiring many setups.
- Use blocking only when necessary. It has a tendency to decrease the quality of the statistical analysis. So if it is not needed, do not use it.
- 00:04 Hello, I'm Ray Sheen.
- 00:06 Well it's now time to talk about the design elements
- 00:10 of a full factorial DOE study.
- 00:12 In designing the study,
- 00:14 one must determine the configuration of each of the samples to be tested.
- 00:18 Typically in a full factorial DOE, every control factor has predetermined high and
- 00:23 low settings, and there's at least one sample that is tested with every possible
- 00:28 combination of the high and low control factor settings.
- 00:31 In full factorial DOE terminology,
- 00:34 the high setting is designated as the plus one state for the factor and
- 00:38 the low setting is designated as the minus one state for the factor.
- 00:43 As part of the design, it's very helpful to set meaningful factor levels.
- 00:47 By this I mean, levels that correspond to normal operations for
- 00:51 the product, process, or system.
- 00:53 Typical levels for a normally controlled factor would be the minimum and
- 00:57 maximum tolerance points.
- 00:59 If it is more of an environmental factor, use the typical extremes.
- 01:03 Check with operators to find out the upper and lower levels that have been observed.
- 01:07 For attribute factors, it's usually easy.
- 01:10 The plus one level is one state for the factor such as on and
- 01:13 the minus one level is the other state for the factor such as off.
- 01:18 In addition to the combination of all the factor levels, the study will often
- 01:21 include center points or replicates to improve the statistical analysis.
- 01:25 And more about that in just a few slides.
- 01:28 Based upon the study design, samples for testing must be created.
- 01:33 This is often the most expensive aspect of the DOE study.
- 01:36 Most of the samples that are created will not be usable for anything else.
- 01:40 And in fact, the testing may even be destructive testing.
- 01:44 Even if the DOE study is a process,
- 01:46 the process output often does not meet the specification for acceptable results.
- 01:52 By the way, that is expected in a DOE study,
- 01:54 we will have both good and bad results and
- 01:57 the statistical analysis of those results will help us to define the design space.
- 02:02 In order to do the statistical analysis, the results data must be paired with
- 02:06 a precise configuration of the control factors.
- 02:09 And if your process is currently under statistical control,
- 02:13 you may need to violate that statistical control in order to sort the data points
- 02:18 into a specific sequence, and you may want to minimize operator bias,
- 02:22 so to the degree possible, each test sample should appear identical.
- 02:26 That means that you will need some either detachable or
- 02:30 remote identification process to ensure that the data is paired correctly.
- 02:35 Let's take a minute to discuss the test sample configuration.
- 02:38 As stated before, the DOE statistical analysis relies on
- 02:41 precise test configurations and pairing of the data points to that configuration.
- 02:46 Every sample in the study must be tested for the statistical analysis to be valid.
- 02:51 Each run has this unique configuration of the control factor levels.
- 02:56 And you can quickly see that the more factors you have, the more experimental
- 03:00 test runs that you will need in order to test each combination of factor levels.
- 03:04 A common technique for creating the configuration is the Yates method.
- 03:09 In this method, the first factor is listed in the first column of a matrix with every
- 03:14 other line being either a plus or minus level for that factor.
- 03:17 The second factor is in the next column, and in this case, there are two lines with
- 03:22 the plus level and then switch to two lines with the minus level.
- 03:25 Switch back and forth until you complete the number of runs.
- 03:29 The third factor switches in blocks of four.
- 03:32 The fourth factor will switch in blocks of eight, and so on and so on.
- 03:36 This diagram illustrates this approach when there are three factors.
- 03:40 Let's now take a moment to discuss replication.
- 03:43 Replicates occur when a second sample is created for
- 03:46 every configuration and then tested.
- 03:48 It will double the number of test samples and test runs.
- 03:51 So it increases time and money, but there are some benefits.
- 03:55 Doubling the number of data points reduces the impact of any noise effects on a run.
- 04:01 Plus the additional data points will increase the statistical accuracy since
- 04:05 there is more data to work with.
- 04:07 Note it is not repetition using the same sample twice.
- 04:11 It is a unique sample or test setup.
- 04:13 Repetition or
- 04:14 repeating a run with the same sample does not count as a replicate run.
- 04:19 Next is center points.
- 04:21 Center points are used to control for environmental factors and linearity.
- 04:25 Center points are runs where all of the quantitative
- 04:28 control factors are set at their midpoint between the high and low values.
- 04:33 Centering only applies to quantitative control factors.
- 04:36 If you have a qualitative control factor, you will need to do two center points.
- 04:40 One with the qualitative factor at the plus one setting and
- 04:44 one at the minus one setting.
- 04:46 The quantitative factors are the same setting for
- 04:49 both of those center points.
- 04:51 Center points are used to determine the precision of the result by calculating
- 04:55 the curvature of the model.
- 04:57 The curvature is determined by taking the average
- 05:00 of all of the Yates method runs minus the average of the center point runs.
- 05:04 If the results are virtually the same, the model is linear at that point and
- 05:09 the p value is very low.
- 05:11 The center point helps us to understand the accuracy of the linear model,
- 05:15 but there's not enough data to create a full quadratic model.
- 05:18 Center points typically are not randomized in the study sequence.
- 05:22 Rather they occur at set points such as beginning, middle, and end,
- 05:26 to ensure that the testing is not impacted by environmental effects.
- 05:31 The final point to mention in this lesson of DOE study design is blocking.
- 05:36 Blocking separates runs into blocks based upon an environmental factor or
- 05:41 a nuisance factor.
- 05:42 To the extent that there is noise due to that factor,
- 05:45 the blocking minimizes its effect.
- 05:47 In my experience, blocking is often done for convenience.
- 05:51 Let's say I'm doing a DOE study on a process and I can only get time to do my
- 05:56 study runs on the weekend when the process is not engaged for normal operations.
- 06:01 But based upon the number of runs I have to do, it will take me three weekends.
- 06:06 Well, I'll do my testing then in three blocks, one for each weekend.
- 06:10 And if there's any variation that is due to the set up and
- 06:13 operation on different weekends, this will reduce that effect.
- 06:17 In some cases, we use blocking upon a qualitative factor.
- 06:21 But realize that that will remove that factor from the statistical analysis.
- 06:25 An example might be if we were doing testing in two different locations.
- 06:30 I view blocking as a means to accommodate real world factors that constrain
- 06:35 the ability of the DOE team to create and execute a perfect study.
- 06:40 Well given all these elements, the combination of highs and lows,
- 06:44 replication, center points, blocking, you can see that the decisions
- 06:49 about the design of the DOE study are not a trivial question.