Retired course
This course has been retired and is no longer supported.
Quick reference
Mood's Median, Kruskal-Wallis, and Friedman
When multiple non-normal data samples are compared in a hypothesis test, several tests can be used. The Mood’s Median, Kruskal-Wallis, and Friedman tests are the typical choices, and each is best suited to different characteristics of the data.
When to use
Many Lean Six Sigma projects requiring hypothesis tests are based upon non-normal data sets. The Mood’s Median Test, Kruskal-Wallis Test, and Friedman Test are used with multiple data sets. The specific test to be used will depend upon the characteristics of the data.
Instructions
Mood’s Median Test
The Mood’s Median Test is appropriate for use with multiple data samples whose non-normal data sets have a similar shape, such as skewed left, skewed right, or bathtub. This test is particularly robust with respect to outliers. The test cannot be accomplished with Excel.
- Minitab:
- All the data must be combined into one column. Use the Data > Stack > Column command to merge multiple data samples into one column.
- Stat > Nonparametrics > Mood’s Median Test
- Select the data column for the Response field
- Select the sample identifier column for the Factor field
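For readers without Minitab, the arithmetic behind Mood's Median Test can be sketched in plain Python. This is a minimal illustration under stated assumptions, not Minitab's implementation: values equal to the overall median are counted with the "at or below" group, the helper names `chi2_sf` and `moods_median` are made up for this example, and the two samples are invented data.

```python
import math
from statistics import median


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    z = x / 2.0
    # Start from the closed form for df = 2 (even) or df = 1 (odd) ...
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))
    # ... then step up two degrees of freedom at a time.
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def moods_median(response, factor):
    """Chi-square test on counts above / at-or-below the overall median."""
    grand = median(response)
    groups = sorted(set(factor))
    above = {g: 0 for g in groups}
    total = {g: 0 for g in groups}
    for value, g in zip(response, factor):
        total[g] += 1
        if value > grand:
            above[g] += 1
    n = len(response)
    n_above = sum(above.values())
    n_below = n - n_above
    if n_above == 0:
        raise ValueError("no values above the overall median")
    chi2 = 0.0
    for g in groups:
        cells = ((above[g], n_above), (total[g] - above[g], n_below))
        for observed, column_total in cells:
            expected = total[g] * column_total / n
            chi2 += (observed - expected) ** 2 / expected
    return grand, chi2, chi2_sf(chi2, len(groups) - 1)


# Hypothetical stacked data: one response column, one sample-identifier column
response = [12, 15, 11, 14, 21, 19, 25, 23]
factor = ["A", "A", "A", "A", "B", "B", "B", "B"]

grand, chi2, p = moods_median(response, factor)
print(f"overall median = {grand}, chi-square = {chi2:.2f}, p = {p:.4f}")
```

The two input lists mirror the stacked Response and Factor columns described above. A small p-value would lead to rejecting the null hypothesis that the sample medians are equal.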
Kruskal-Wallis Test
The Kruskal-Wallis Test is appropriate for use with multiple non-normal data samples. This test is essentially an ANOVA test for non-normal data. The data items should be continuous (not discrete). The data samples do not need to have similar shapes as with the Mood’s Median Test. This test is sensitive to outliers. This test cannot be accomplished with Excel.
- Minitab:
- All the data must be combined into one column. Use the Data > Stack > Column command to merge multiple data samples into one column.
- Stat > Nonparametrics > Kruskal-Wallis
- Select the data column for the Response field
- Select the sample identifier column for the Factor field
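The Kruskal-Wallis calculation can also be sketched without Minitab. The example below is a rough illustration, assuming the standard rank-based H statistic with a tie correction and a chi-square approximation for the p-value; the function names and the three samples are hypothetical.

```python
import math
from collections import Counter


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def kruskal_wallis(response, factor):
    """Return H (tie-corrected), its p-value, and the average rank per sample."""
    n = len(response)
    rank_sum, size = {}, {}
    for r, g in zip(average_ranks(response), factor):
        rank_sum[g] = rank_sum.get(g, 0.0) + r
        size[g] = size.get(g, 0) + 1
    h = 12.0 / (n * (n + 1)) * sum(
        rank_sum[g] ** 2 / size[g] for g in rank_sum
    ) - 3.0 * (n + 1)
    # Tie correction; fails with a zero division if every value is identical.
    ties = sum(t ** 3 - t for t in Counter(response).values())
    h /= 1.0 - ties / (n ** 3 - n)
    avg_rank = {g: rank_sum[g] / size[g] for g in rank_sum}
    return h, chi2_sf(h, len(rank_sum) - 1), avg_rank


# Hypothetical stacked data for three samples
response = [27, 31, 29, 35, 38, 33, 41, 44, 47]
factor = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]

h, p, avg_rank = kruskal_wallis(response, factor)
print(f"H = {h:.2f}, p = {p:.4f}, average ranks = {avg_rank}")
```

The average rank per sample is the same kind of summary the Minitab output shows, and it indicates which samples tend to hold the larger values.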
Friedman Test
The Friedman Test is the most complex of the non-normal data hypothesis tests that we use with multiple data samples. The Friedman Test works with large blocks of data. It essentially compares the data within the blocks and then between the blocks. In this regard it is a hybrid of the Paired T Test and an ANOVA or Kruskal-Wallis Test. The minimum sample size you should use in the Friedman Test is 30 data items. This test cannot be accomplished with Excel.
- Minitab:
- All the data must be combined into one column. Use the Data > Stack > Column command to merge multiple data samples into one column.
- Stat > Nonparametrics > Friedman
- Select the data column for the Response field
- Select the sample identifier column for the Treatment field and the block identifier column for the Blocks field
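The Friedman Test ranks the data within each block and then compares the rank sums across treatments. The sketch below illustrates that idea in plain Python. It assumes every block contains exactly one value per treatment, it computes the statistic without the adjustment for ties, and the function names and timing data are invented for the example (they are not the lesson's data).

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def friedman(response, treatment, block):
    """Return the Friedman statistic (not tie-adjusted), p-value, and rank sums."""
    by_block = {}
    for y, t, b in zip(response, treatment, block):
        by_block.setdefault(b, {})[t] = y
    treatments = sorted(set(treatment))
    k, n = len(treatments), len(by_block)
    rank_sum = {t: 0.0 for t in treatments}
    for row in by_block.values():
        # Rank the treatments inside this block only.
        values = [row[t] for t in treatments]
        for t, r in zip(treatments, average_ranks(values)):
            rank_sum[t] += r
    s = 12.0 / (n * k * (k + 1)) * sum(
        r * r for r in rank_sum.values()
    ) - 3.0 * n * (k + 1)
    return s, chi2_sf(s, k - 1), rank_sum


# Hypothetical times: four competitors, three trials each
times = {
    "C1": [50.1, 50.4, 50.9],
    "C2": [50.3, 50.5, 50.8],
    "C3": [50.2, 50.6, 51.0],
    "C4": [50.0, 50.7, 51.1],
}
response, trial, competitor = [], [], []
for name, runs in times.items():
    for i, seconds in enumerate(runs, start=1):
        response.append(seconds)
        trial.append(f"T{i}")
        competitor.append(name)

# Trial as treatment, competitor as block
s_trial, p_trial, _ = friedman(response, trial, competitor)
# Roles swapped: competitor as treatment, trial as block
s_comp, p_comp, _ = friedman(response, competitor, trial)
print(f"by trial:      S = {s_trial:.2f}, p = {p_trial:.4f}")
print(f"by competitor: S = {s_comp:.2f}, p = {p_comp:.4f}")
```

Swapping the treatment and block columns tests a different null hypothesis on the same data, so the two calls can give very different p-values.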
Hints & tips
- Stacking data in one column is very easy in Minitab using the stack command. I load my data into Minitab with a separate column for each sample then stack once everything is in. With the Friedman test, the stacked column can easily have hundreds of entries. Loading the data first by sample columns allows me to easily find and fix data problems.
- Run a normality check to see if your data is non-normal.
- Create a histogram of the data in each sample to see the shape of the data set.
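Outside Minitab, the stacking step amounts to turning one column per sample into a response column plus a sample-identifier column. A minimal sketch, assuming the samples are held in a dictionary keyed by a made-up sample name (the function name `stack_columns` and the values are hypothetical):

```python
def stack_columns(samples):
    """Turn {sample_name: [values, ...]} into two parallel columns."""
    response, factor = [], []
    for name, values in samples.items():
        for value in values:
            response.append(value)  # stacked data column
            factor.append(name)     # which sample the value came from
    return response, factor


# Hypothetical data: three samples of unequal length
samples = {
    "A": [4.1, 5.3, 4.8],
    "B": [6.0, 5.9],
    "C": [3.7, 4.4, 4.0, 4.2],
}
response, factor = stack_columns(samples)
for value, name in zip(response, factor):
    print(f"{value:>5}  {name}")
```

Keeping the data in per-sample columns until the last moment, as suggested above, makes it easier to spot a bad entry before everything is merged.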
- 00:04 Hello, I'm Ray Sheen.
- 00:06 Non-normal data can sometimes get really.
- 00:10 Fortunately, we have three different tests that we can use when we have various
- 00:15 non-normal data.
- 00:16 Those are the Mood's Median Test, the Kruskal-Wallis Test, and
- 00:20 the Friedman Test.
- 00:23 Once more for the Hypothesis Test Decision Tree.
- 00:26 We're still working with non-normal data, but now we want to consider the test for
- 00:30 the cases where there are two or more samples.
- 00:33 The first test is Mood's Median Test.
- 00:37 Mood's Median considers the equality of the medians from two or more data samples.
- 00:41 The Mood's Median is a quick and simple test, but
- 00:44 also has a few assumptions that must be met.
- 00:46 Mood's Median is most effective when data samples are independent, and
- 00:50 their distributions are roughly the same shape.
- 00:54 One distinctive advantage of Mood's Median Test is that it is robust with respect to
- 00:58 outliers in the data.
- 00:59 The hypotheses are straightforward.
- 01:01 The null hypothesis states that
- 01:03 the medians of all the sample sets of data are statistically equal.
- 01:07 The alternative hypothesis says that the medians are not equal.
- 01:11 Since there are multiple data sample sets, there's no designation for larger than or
- 01:15 smaller than, that we would use when there are only two data samples.
- 01:20 So let's look at setting this up on Minitab.
- 01:22 Use the Stat pulldown menu.
- 01:24 Select Nonparametrics, and then Mood's Median, and this panel will appear.
- 01:29 Mood's Median requires that all data from all the samples be in the same column.
- 01:34 However, there is a second column that has the sample designator
- 01:37 that is normally adjacent to the data column.
- 01:40 The easy way to create this column is just to stack the data from each of the samples
- 01:44 one above the other.
- 01:45 Minitab will actually do this quickly for
- 01:47 you using the Stack command under the Data pulldown menu.
- 01:51 So enter the column name with the data in the response field and
- 01:55 the column name with data sample identifier in the factor entry box.
- 02:01 Now, let's look at the Kruskal-Wallis.
- 02:03 Think of Kruskal-Wallis like it is an advanced version of ANOVA
- 02:07 that works well with non-normal data.
- 02:10 There are some key data sample set characteristics
- 02:13 that must be met when using Kruskal-Wallis.
- 02:15 Again, the data samples should be independent.
- 02:18 The data should be continuous data as compared with discrete data.
- 02:21 However, the data sample sets do not need to have the same shape or distribution.
- 02:26 So in that regard, it has an advantage over Mood's median, but it is sensitive to
- 02:31 outliers so in that characteristic, Mood's median has the advantage.
- 02:35 The hypothesis statements are the same as we had with Mood's median test.
- 02:39 The null hypothesis states that the medians are statistically equal and
- 02:42 the alternative hypothesis states that they are not equal to each other.
- 02:46 Doing the Kruskal-Wallis test in Minitab is very similar to that of
- 02:50 the Mood's Median Test.
- 02:51 Go to the Stat pulldown menu, select Nonparametrics, and
- 02:54 then select Kruskal-Wallis and this panel will pop up.
- 02:58 And just like with Mood's median, all the data must be in the same column and
- 03:02 the adjacent column should have the factor
- 03:05 that is being used to designate the sample subsets.
- 03:08 Each of these columns is entered into its appropriate field in Minitab.
- 03:12 And now for the Friedman test.
- 03:14 The Friedman test is a bit weird;
- 03:16 it works with blocks of paired non-normal data groups.
- 03:19 Think of it as a hybrid of ANOVA, paired T-Test and Kruskal-Wallis.
- 03:24 With all that it's doing, Friedman needs samples with many data points,
- 03:29 a minimum of 30 and more is better.
- 03:31 The Null Hypothesis and
- 03:33 Alternative Hypothesis are the same as with Mood's Median and Kruskal-Wallis.
- 03:37 The null is that the medians are equal, and
- 03:39 the alternative is that they are not statistically equal.
- 03:43 Minitab is a little bit different.
- 03:45 You still start the same way, go to the Stat pulldown menu, select Nonparametrics,
- 03:49 and then select Friedman, and this is the panel you get.
- 03:53 Once again, all the data is in the same column, but
- 03:56 now you have one column with the sample category data designator or
- 04:01 treatment, and one with the block data designator.
- 04:04 You can probably guess that this test is used primarily with
- 04:07 clinical tests of medical treatments and pharmaceuticals.
- 04:10 But you may have a need for it in a Lean Six Sigma project.
- 04:14 If working across multiple locations or
- 04:16 product lines on your project, just designate the appropriate columns.
- 04:21 So let's compare the results of these three tests.
- 04:23 In this case,
- 04:24 I'm using the time trials from the luge event in the 2014 Winter Olympics.
- 04:29 There are four trials and the data is not normally distributed.
- 04:33 Mood's Median shows the median of each of the four trials
- 04:37 along with the confidence interval on a small scale in the session window.
- 04:41 The P value was 0.001.
- 04:44 Reject the null hypothesis.
- 04:46 Each trial was definitely different.
- 04:47 Kruskal-Wallis shows the median values and average rank.
- 04:52 The ranking clearly shows that the times in samples one and
- 04:55 two are higher than with samples three and four.
- 04:59 Again, the P value is very low, in this case it's zero, so
- 05:02 reject the null hypothesis.
- 05:04 The Friedman Test also shows the median.
- 05:07 Interestingly, and I don't know why, it has an additional significant digit
- 05:11 showing. In this test result, the sum of ranks, which is a rank measure adjusted
- 05:16 for the Friedman block characteristic, still shows that time trials one and
- 05:20 two are much different than three and four.
- 05:23 And again our P value is 0.
- 05:25 So reject the null, but let me point out
- 05:28 the Friedman Test did the time versus trial and block by the competitor.
- 05:32 In other words, the dominant characteristic is the trial and
- 05:35 competitor blocks were used to minimize the effect of good teams versus bad teams.
- 05:40 But now in this result, I switched the trial and
- 05:43 the competitor between the treatment and block parameters.
- 05:46 We have a very different result.
- 05:47 The data is the same, but in this case there will be a median for
- 05:52 each competitor.
- 05:53 I just left some off the sheet for readability, and there's a rank for
- 05:57 each competitor, notice that the P value is 0.572, so
- 06:01 we will fail to reject the null hypothesis in this case, but that's
- 06:05 because the hypothesis between these two runs of the Friedman Test is different.
- 06:10 In the first case the null hypothesis is that the median time in each trial is
- 06:15 the same.
- 06:16 In the second case the null hypothesis is that the median time for
- 06:20 each competitor was the same.
- 06:22 Well, there is another concern with the competitor test: there are only four data
- 06:26 points for each competitor, and we did say we needed at least 30 points.
- 06:30 So the test should be considered invalid until more trials are conducted, and
- 06:34 the times recorded.
- 06:36 Regardless of the nature of your non-normality,
- 06:39 there will be a hypothesis test that will work for you.
- 06:43 Look at your data, and then decide whether to use the Mood's Median, Kruskal-Wallis,
- 06:48 or the Friedman test.