About this lesson
These three tests are for multiple samples of non-normal data. Each test has its strengths and weaknesses. The appropriate test will depend upon what is known, or not known, about the data in the samples. The Minitab interface to accomplish each of these tests is similar. This lesson will explain the differences and show how to conduct the test and read the results.
Quick reference
Mood’s Median, Kruskal-Wallis, Friedman Tests
When multiple non-normal data samples are compared in a hypothesis test, there are several potential tests that can be used. The Mood’s Median, Kruskal-Wallis, and Friedman tests are typical tests used and each is best suited to different characteristics of the data.
When to use
Many Lean Six Sigma projects requiring hypothesis tests are based on non-normal data sets. Mood’s Median Test, Kruskal-Wallis Test, and Friedman Test are used with multiple data sets. The specific test to be used will depend upon the characteristics of the data.
Instructions
The form of the hypotheses for all three of these tests is the same:
H0: median1 = median2 = median3
Ha: at least one median differs from the others
Mood’s Median Test
The Mood’s Median Test is appropriate for use with multiple independent data samples whose non-normal data sets have a similar shape, such as skewed left, skewed right, or bathtub. This test is particularly robust with respect to outliers. The test cannot be accomplished with Excel.
- Minitab:
- All the data must be combined into one column. Use the Data > Stack > Column command to merge multiple data sets into one column.
- Stat > Nonparametrics > Mood’s Median Test
- Select the data column for the Response field
- Select the data identifier column for the Factor field
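For readers without Minitab, the same kind of test can be sketched in Python. This is a minimal illustration, assuming SciPy is available and using `scipy.stats.median_test`, which implements Mood's median test. The three samples are hypothetical, randomly generated right-skewed data, not the lesson's exercise file.

```python
import numpy as np
from scipy import stats

# Hypothetical right-skewed samples (assumed data, not the lesson's exercise file).
rng = np.random.default_rng(seed=1)
sample_a = rng.exponential(scale=10.0, size=40)
sample_b = rng.exponential(scale=10.0, size=40)
sample_c = rng.exponential(scale=25.0, size=40)

# median_test returns the chi-square statistic, the p-value, the grand median,
# and a 2 x k table counting values above and below the grand median per sample.
stat, p_value, grand_median, table = stats.median_test(sample_a, sample_b, sample_c)

alpha = 0.05  # assumed significance level
print(f"grand median = {grand_median:.2f}, chi-square = {stat:.2f}, p = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

Unlike Minitab, SciPy takes each sample as a separate argument, so no stacking step is needed here.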
Kruskal-Wallis Test
The Kruskal-Wallis Test is appropriate for use with multiple non-normal data samples. This test is essentially an ANOVA test for non-normal data. The data items should be continuous (not discrete). The data samples do not need to have similar shapes as with the Mood’s Median Test. This test is sensitive to outliers. This test cannot be accomplished with Excel.
- Minitab:
- All the data must be combined into one column. Use the Data > Stack > Column command to merge multiple data samples into one column.
- Stat > Nonparametrics > Kruskal-Wallis
- Select the data column for the Response field
- Select the data identifier column for the Factor field
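The stacked layout described above (one response column plus one factor column) can be mirrored in Python. The sketch below assumes NumPy and SciPy, uses hypothetical lognormal samples with made-up names (`line_1`, `line_2`, `line_3`), and runs the test with `scipy.stats.kruskal`.

```python
import numpy as np
from scipy import stats

# Hypothetical skewed samples, one per production line (assumed names and data).
rng = np.random.default_rng(seed=2)
samples = {
    "line_1": rng.lognormal(mean=1.0, sigma=0.5, size=30),
    "line_2": rng.lognormal(mean=1.0, sigma=0.5, size=30),
    "line_3": rng.lognormal(mean=1.6, sigma=0.5, size=30),
}

# Stack the samples: one response column and one factor (identifier) column.
response = np.concatenate(list(samples.values()))
factor = np.repeat(list(samples.keys()), [len(v) for v in samples.values()])

# Split the stacked column back into groups by factor level and run the test.
groups = [response[factor == level] for level in samples]
h_stat, p_value = stats.kruskal(*groups)

print(f"H = {h_stat:.2f}, p = {p_value:.4f}")
```

The stacking step is only there to mirror the Minitab worksheet; `kruskal` itself accepts the separate samples directly.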
Friedman Test
The Friedman Test is the most complex of the non-normal data hypothesis tests that we use with multiple data samples. The Friedman Test works with large blocks of data. It essentially compares the data within the blocks and then between the blocks. In this regard, it is a hybrid of the Paired T Test and an ANOVA or Kruskal-Wallis Test. The minimum sample size you should use in the Friedman Test is 30 data items. Be careful how you choose your blocks when setting up the test. Each data point has two identifiers (treatment and block), and switching the two tests a different hypothesis and may create a different result. This test cannot be accomplished with Excel.
- Minitab:
- All the data must be combined into one column. Use the Data > Stack > Column command to merge multiple data samples into one column.
- Stat > Nonparametrics > Friedman
- Select the data column for the Response field
- Select the treatment identifier column for the Treatment field
- Select the block identifier column for the Block field
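To illustrate how the choice of treatment and block changes the question being asked, here is a minimal sketch assuming SciPy's `scipy.stats.friedmanchisquare`. The data are hypothetical timing results (30 competitors by 4 trials) generated for illustration only. SciPy expects one argument per treatment, with each argument holding that treatment's values across the blocks.

```python
import numpy as np
from scipy import stats

# Hypothetical times: rows are competitors, columns are trials (assumed data).
rng = np.random.default_rng(seed=3)
n_competitors, n_trials = 30, 4
competitor_effect = rng.normal(0.0, 0.3, size=(n_competitors, 1))
trial_effect = np.array([0.4, 0.3, 0.0, 0.0])  # trials 1 and 2 are slower
noise = rng.exponential(0.1, size=(n_competitors, n_trials))
times = 50.0 + competitor_effect + trial_effect + noise

# Treatment = trial, block = competitor: one argument per trial (column).
stat_trial, p_trial = stats.friedmanchisquare(*times.T)

# Roles switched. Treatment = competitor, block = trial: one argument per row.
# Each competitor now contributes only four values, far below the 30 recommended.
stat_comp, p_comp = stats.friedmanchisquare(*times)

print(f"treatment = trial:      chi-square = {stat_trial:.2f}, p = {p_trial:.4g}")
print(f"treatment = competitor: chi-square = {stat_comp:.2f}, p = {p_comp:.4g}")
```

The two calls use the same numbers but test different null hypotheses: equal medians across trials in the first, equal medians across competitors in the second.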
Hints & tips
- Stacking data in one column is very easy in Minitab using the stack command. I load my data into Minitab with a separate column for each sample then stack once everything is in. With the Friedman test, the stacked column can easily have hundreds of entries. Loading the data first by sample columns allows me to easily find and fix data problems.
- The Friedman test has two identifier columns for the data. One is called the Treatment and is similar to the Factor column in Kruskal-Wallis; the second is the Block identifier.
- Run a normality check to see if your data is non-normal.
- Create a histogram of the data in each sample to see the shape of the data set.
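The last two hints can also be tried outside Minitab. The sketch below assumes SciPy and uses the Shapiro-Wilk test (`scipy.stats.shapiro`) as the normality check, which is a substitution: Minitab's default normality test is Anderson-Darling. The samples are hypothetical, and the histogram is a simple text rendering built from `numpy.histogram`.

```python
import numpy as np
from scipy import stats

# Hypothetical right-skewed samples (assumed data for illustration).
rng = np.random.default_rng(seed=4)
samples = {
    "sample_1": rng.exponential(scale=5.0, size=50),
    "sample_2": rng.exponential(scale=5.0, size=50),
}

alpha = 0.05  # assumed significance level
for name, values in samples.items():
    # Shapiro-Wilk: a small p-value is evidence that the data is non-normal.
    w_stat, p_value = stats.shapiro(values)
    verdict = "non-normal" if p_value < alpha else "no evidence against normality"
    print(f"{name}: W = {w_stat:.3f}, p = {p_value:.4f} ({verdict})")

    # Text histogram to eyeball the shape (skewed left, skewed right, bathtub).
    counts, edges = np.histogram(values, bins=8)
    for count, left_edge in zip(counts, edges[:-1]):
        print(f"  {left_edge:6.2f} | {'#' * int(count)}")
```

Comparing the printed shapes across samples gives a rough sense of whether the similar-shape assumption behind Mood's median is plausible.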
- 00:04 Hello, I'm Ray Sheen.
- 00:05 Non-normal data sets can sometimes get really squirrely.
- 00:09 Fortunately, we have three tests for use with multiple non-normal data sets,
- 00:15 and different ones apply to different squirrely conditions.
- 00:19 Those tests are the Mood's Median test, Kruskal-Wallis test and Friedman's test.
- 00:27 >> Once more for the hypothesis test decision tree,
- 00:30 we are still working with non-normal data.
- 00:33 And now we want to consider the tests for cases when there are two or more samples.
- 00:39 The first test to discuss is Mood's median test.
- 00:42 Mood's median considers the equality of the medians from two or more data samples.
- 00:48 The Mood's median is a quick and simple test, but
- 00:51 it also has a few assumptions that must be met.
- 00:55 Mood's median is most effective when data samples are independent and
- 00:58 the distributions are roughly the same shape.
- 01:02 One distinct advantage of the Mood's median test is that it is robust with
- 01:06 respect to outliers in the data.
- 01:09 The hypotheses are straightforward.
- 01:11 The null hypothesis states that the medians of all
- 01:14 the sample sets are statistically equal.
- 01:18 The alternative states that one or more of the medians are not equal.
- 01:22 Since there are multiple data samples, there's no designation of larger than or
- 01:26 smaller than that we use when there are only two data samples.
- 01:29 So let's look at setting this up on our Minitab.
- 01:32 Use the stat pull-down menu, select Non-parametric and then Mood's median, and
- 01:37 this panel will appear.
- 01:39 Mood's median requires that the data from all the samples be in the same column.
- 01:44 However, there is a second column that has the sample designator that is
- 01:49 normally adjacent to the data column.
- 01:52 The easy way to create this column is to just stack your data from one sample below
- 01:57 the previous sample.
- 01:59 Minitab will actually do this quickly for
- 02:02 you using the stack command under the data pulldown menu.
- 02:06 So enter the column name with the data in the Response field and
- 02:10 the column name with the data sample identifier in the Factor entry box.
- 02:16 Now let's look at Kruskal-Wallis.
- 02:18 Think of Kruskal-Wallis as a version of
- 02:21 ANOVA that works well with non-normal data.
- 02:24 There are some key data sample set characteristics that must be met when
- 02:27 using Kruskal-Wallis.
- 02:29 Again the data samples should be independent,
- 02:32 the data should be continuous data as compared to discrete data.
- 02:36 However, the data sample sets do not need to have the same shape or distribution.
- 02:41 So in that regard, it has an advantage over Mood's median.
- 02:45 But it is sensitive to outliers, so in that way Mood's median has the advantage.
- 02:51 The hypothesis statements are the same as with Mood's median test.
- 02:55 The null hypothesis states that the medians are statistically equal and
- 02:59 the alternative hypothesis states that they are not equal to each other.
- 03:04 Doing the Kruskal-Wallis test in Minitab is very similar to Mood's median test.
- 03:08 Go to the stat pulldown menu, select Nonparametric, and
- 03:12 then select Kruskal-Wallis, and this panel will pop up.
- 03:15 Just like with the Mood's median, all the data must be in the same column and
- 03:20 an adjacent column should have the factor that is being used to designate the sample
- 03:25 subsets.
- 03:26 Each of these columns is entered into the appropriate field in Minitab.
- 03:31 And now for the Friedman test.
- 03:33 The Friedman test is a bit weird.
- 03:35 It works with blocks of paired non-normal data groups.
- 03:39 Think of it as a hybrid of ANOVA, paired t-test and Kruskal-Wallis.
- 03:44 With all that it's doing, Friedman needs samples with many data points,
- 03:49 a minimum of 30 and more is better.
- 03:51 The null hypothesis and
- 03:53 alternative hypothesis are the same as with Mood's median and Kruskal-Wallis.
- 03:58 The null is that the medians are equal, and
- 04:00 the alternative is that they are statistically not equal.
- 04:04 Minitab is a bit different.
- 04:06 You start the same way, go to the stat pull down menu, select Nonparametric,
- 04:10 and then select Friedman, and this is the panel you get.
- 04:13 Once again, all the data is in the same column, but
- 04:16 now you have one adjacent column with the sample category designator or
- 04:21 treatment, and one with the block designator.
- 04:23 You can probably guess this test is often used with clinical trials of medical
- 04:28 treatments and pharmaceuticals, but you may have a need for
- 04:32 it in Lean Six Sigma projects if working across multiple locations or
- 04:36 product lines on your project.
- 04:38 Just designate all the appropriate columns.
- 04:42 So let's consider the result of these three tests.
- 04:46 In this case,
- 04:47 I'm going to be using time trials from the luge event at the 2014 Winter Olympics.
- 04:52 There are four trials and the data is not normally distributed.
- 04:56 Mood's median shows the median of each of the four trials along with a confidence
- 05:01 interval on a small scale in the session window.
- 05:04 The P value was 0.001, so reject the null hypothesis.
- 05:09 At least one trial was definitely different.
- 05:11 Kruskal-Wallis also shows the median values and an average rank.
- 05:16 The ranking clearly shows that the times in samples 1 and
- 05:20 2 are higher than with samples 3 and 4.
- 05:23 Again, P value is very low, in this case it's 0.
- 05:26 So reject the null hypothesis.
- 05:29 The Friedman's test also shows the median value, and
- 05:32 I don't know why it has an additional significant digit showing.
- 05:36 In this test result, the sum of ranks, which is a ranks measure adjusted for
- 05:41 the Friedman block characteristics, still shows the time trials 1 and
- 05:46 2 were much different from 3 and 4, and again, the P value is 0.
- 05:50 So reject the null hypothesis.
- 05:53 But let me point out that this Friedman test analyzed time versus trial and
- 05:57 blocked by competitor.
- 05:59 In other words, the dominant characteristic is the trial and competitor
- 06:05 blocks were used to minimize the effects of good teams versus bad teams.
- 06:10 In this result, I switched trial and competitor between the treatment and
- 06:14 block parameters.
- 06:15 We have a very different result.
- 06:18 In this case, there will be a median for each competitor.
- 06:22 I just left some off the sheet for readability, and there is a rank for
- 06:26 each competitor.
- 06:28 Notice that the P value is 0.572.
- 06:31 So we would fail to reject the null hypothesis in this case.
- 06:34 But that's because the hypotheses for these two runs of the Friedman test are
- 06:39 different.
- 06:41 In the first one,
- 06:42 the null hypothesis is that the median time in each trial is the same.
- 06:47 In the second case, the null hypothesis is that the median time for
- 06:52 each competitor is the same.
- 06:55 There's another concern with the second Friedman test.
- 06:58 There are only four data points for each competitor, and
- 07:01 we said that we needed at least 30 points.
- 07:04 So this version of the test should be considered invalid until more time
- 07:09 trials are conducted and the times recorded.
- 07:13 >> Regardless of the nature of your non-normality,
- 07:16 there will be a hypothesis test that will work.
- 07:19 Look at your data, and then select either Mood's median, Kruskal-Wallis, or
- 07:23 Friedman's tests.