
Mood's Median, Kruskal-Wallis, and Friedman

Retired course

This course has been retired and is no longer supported.

About this lesson

These three tests are for multiple samples of non-normal data. Each test has its strengths and weaknesses. The appropriate test will depend upon what is known, or not known, about the data in the samples. The Minitab interface to accomplish each of these tests is similar. This lesson will explain the differences and show how to conduct the test and read the results.

Mood's Median, Kruskal-Wallis, and Friedman

When several non-normal data samples are compared in a hypothesis test, there is more than one candidate test. The Mood’s Median, Kruskal-Wallis, and Friedman tests are the ones typically used, and each is best suited to different characteristics of the data.

When to use

Many Lean Six Sigma projects requiring hypothesis tests are based upon non-normal data sets. The Mood’s Median Test, Kruskal-Wallis Test, and Friedman Test are used with multiple data sets. The specific test to be used will depend upon the characteristics of the data.

Instructions

Mood’s Median Test

The Mood’s Median Test is appropriate for multiple independent data samples whose non-normal distributions have a similar shape, such as skewed left, skewed right, or bathtub. It tests whether the medians of the samples are equal, and it is particularly robust with respect to outliers. The test cannot be accomplished with Excel.

Minitab:
1. All the data must be combined into one column. Use the Data > Stack > Column command to merge multiple data samples into one column.
2. Stat > Nonparametrics > Mood’s Median Test
3. Select the data column for the Response field
4. Select the sample identifier column for the Factor field
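
For readers without Minitab, the same kind of test can be sketched in Python. This is only an illustration, assuming SciPy is installed; the three samples are made-up, right-skewed numbers, not the lesson's exercise data. SciPy's `median_test` takes each sample as a separate argument, so the stacking step is not needed here.

```python
from scipy.stats import median_test

# Hypothetical right-skewed samples (illustrative values only).
sample_a = [12, 14, 15, 15, 17, 19, 24, 31]
sample_b = [13, 15, 16, 18, 18, 21, 26, 40]
sample_c = [22, 25, 27, 28, 30, 33, 41, 58]

# The result unpacks into the chi-square statistic, the p-value,
# the grand median of the pooled data, and a 2 x k contingency table
# (row 0: counts above the grand median, row 1: counts at or below it).
stat, p_value, grand_median, table = median_test(sample_a, sample_b, sample_c)

print(f"grand median = {grand_median}")
print(f"chi-square = {stat:.3f}, p = {p_value:.4f}")
print(table)

# Null hypothesis: all sample medians are equal.
if p_value < 0.05:
    print("Reject the null hypothesis: at least one median differs.")
else:
    print("Fail to reject the null hypothesis.")
```

The 0.05 threshold is an assumed significance level; use whatever alpha your project has agreed on.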

Kruskal-Wallis Test

The Kruskal-Wallis Test is appropriate for use with multiple non-normal data samples. This test is essentially an ANOVA test for non-normal data. The data items should be continuous (not discrete). The data samples do not need to have similar shapes as with the Mood’s Median Test. This test is sensitive to outliers. This test cannot be accomplished with Excel.

Minitab:
1. All the data must be combined into one column. Use the Data > Stack > Column command to merge multiple data samples into one column.
2. Stat > Nonparametrics > Kruskal-Wallis
3. Select the data column for the Response field
4. Select the sample identifier column for the Factor field
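
A comparable check outside Minitab can be sketched with SciPy's `kruskal` function. The data below are made-up values for illustration, and the 0.05 significance level is an assumption.

```python
from scipy.stats import kruskal

# Hypothetical continuous samples (illustrative values only).
sample_a = [12, 14, 15, 15, 17, 19, 24, 31]
sample_b = [13, 15, 16, 18, 18, 21, 26, 40]
sample_c = [22, 25, 27, 28, 30, 33, 41, 58]

# kruskal ranks the pooled data and compares the rank sums per sample.
# It returns the H statistic (corrected for ties) and the p-value.
h_stat, p_value = kruskal(sample_a, sample_b, sample_c)

print(f"H = {h_stat:.3f}, p = {p_value:.4f}")

if p_value < 0.05:
    print("Reject the null hypothesis: the samples are not all alike.")
else:
    print("Fail to reject the null hypothesis.")
```

Because the test works on ranks of the pooled data, a sample whose values sit consistently higher than the others (sample_c here) drives the H statistic up.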

Friedman Test

The Friedman Test is the most complex of the non-normal data hypothesis tests that we use with multiple data samples. The Friedman Test works with large blocks of data. It essentially compares the data within the blocks and then between the blocks. In this regard it is a hybrid of the Paired T Test and an ANOVA or Kruskal-Wallis Test. The minimum sample size you should use in the Friedman Test is 30 data items. This test cannot be accomplished with Excel.

Minitab:
1. All the data must be combined into one column. Use the Data > Stack > Column command to merge multiple data samples into one column.
2. Stat > Nonparametrics > Friedman
3. Select the data column for the Response field
4. Select the sample (treatment) identifier column for the Treatment field
5. Select the block identifier column for the Blocks field
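
The Friedman test needs to know both the treatment and the block for every value. In SciPy's `friedmanchisquare`, that is expressed by passing one list per treatment, with the blocks in the same order in every list. The sketch below uses made-up times for four trials (treatments) and six competitors (blocks); the sample is deliberately tiny to keep the example readable and is well below the 30 data items recommended above.

```python
from scipy.stats import friedmanchisquare

# Hypothetical times: one list per trial (treatment).
# Position i in every list belongs to the same competitor (block).
trial_1 = [52.1, 52.4, 52.3, 52.8, 53.0, 52.6]
trial_2 = [52.0, 52.3, 52.4, 52.7, 52.9, 52.5]
trial_3 = [51.6, 51.9, 51.8, 52.2, 52.5, 52.0]
trial_4 = [51.5, 51.8, 51.9, 52.1, 52.4, 52.1]

# The values are ranked within each block, then the rank sums of the
# treatments are compared. Returns the statistic and the p-value.
stat, p_value = friedmanchisquare(trial_1, trial_2, trial_3, trial_4)

print(f"Friedman chi-square = {stat:.3f}, p = {p_value:.4f}")

if p_value < 0.05:
    print("Reject the null hypothesis: the trials differ.")
else:
    print("Fail to reject the null hypothesis.")
```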

Hints & tips

Stacking data in one column is very easy in Minitab using the stack command. I load my data into Minitab with a separate column for each sample then stack once everything is in. With the Friedman test, the stacked column can easily have hundreds of entries. Loading the data first by sample columns allows me to easily find and fix data problems.
Run a normality check to see if your data is non-normal.
Create a histogram of the data in each sample to see the shape of the data set.
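
The same workflow (load by sample, stack, check normality, look at the shape) can be sketched in Python, assuming NumPy and SciPy are available. The sample names and values are hypothetical, and the Shapiro-Wilk test is used here as one common normality check, not necessarily the one Minitab runs.

```python
import numpy as np
from scipy.stats import shapiro

# Hypothetical data, loaded one "column" per sample.
samples = {
    "A": [12, 14, 15, 15, 17, 19, 24, 31],
    "B": [13, 15, 16, 18, 18, 21, 26, 40],
    "C": [22, 25, 27, 28, 30, 33, 41, 58],
}

# Stack: one response column plus one factor column naming the sample.
response = np.concatenate([np.asarray(v, dtype=float) for v in samples.values()])
factor = np.repeat(list(samples.keys()), [len(v) for v in samples.values()])

for name in samples:
    group = response[factor == name]
    # Shapiro-Wilk: a small p-value suggests the sample is not normal.
    w_stat, p_value = shapiro(group)
    # Text histogram counts give a rough view of the sample's shape.
    counts, edges = np.histogram(group, bins=4)
    print(f"{name}: W = {w_stat:.3f}, p = {p_value:.3f}, bin counts = {counts.tolist()}")
```

Keeping the per-sample dictionary around alongside the stacked columns makes it easy to spot and fix a bad value in one sample before running a test.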

00:04 Hello, I'm Ray Sheen.
00:06 Non-normal data can sometimes get really.
00:10 Fortunately, we have three different tests that we can use when we have various
00:15 non-normal data.
00:16 Those are the Mood's Median Test, the Kruskal-Wallis Test, and
00:20 the Friedman Test.
00:23 Once more for the Hypothesis Test Decision Tree.
00:26 We're still working with non-normal data, but now we want to consider the test for
00:30 the cases where there are two or more samples.
00:33 The first test is Mood's Median Test.
00:37 Mood's Median considers the equality of the medians from two or more data samples.
00:41 The Mood's Median is a quick and simple test, but
00:44 also has a few assumptions that must be met.
00:46 Mood's Median is most effective when data samples are independent, and
00:50 their distributions are roughly the same shape.
00:54 One distinctive advantage of Mood's Median Test is that it is robust with respect to
00:58 outliers in the data.
00:59 The hypotheses are straightforward.
01:01 The null hypothesis states that
01:03 the medians of all the sample sets of data are statistically equal.
01:07 The alternative hypothesis says that the medians are not equal.
01:11 Since there are multiple data sample sets, there's no designation for larger than or
01:15 smaller than, that we would use when there are only two data samples.
01:20 So let's look at setting this up on Minitab.
01:22 Use the Stat pulldown menu.
01:24 Select Nonparametrics, and then Mood's Median, and this panel will appear.
01:29 Mood's Median requires that all data from all the samples be in the same column.
01:34 However, there is a second column that has the sample designator
01:37 that is normally adjacent to the data column.
01:40 The easy way to create this column is just to stack the data from each of the samples
01:44 one above the other.
01:45 Minitab can actually do this quickly for
01:47 you using the Stack command under the Data pulldown menu.
01:51 So enter the column name with the data in the response field and
01:55 the column name with data sample identifier in the factor entry box.
02:01 Now, let's look at the Kruskal-Wallis.
02:03 Think of Kruskal-Wallis like it is an advanced version of ANOVA
02:07 that works well with non-normal data.
02:10 There are some key data sample set characteristics
02:13 that must be met when using Kruskal-Wallis.
02:15 Again, the data samples should be independent.
02:18 The data should be continuous data as compared with discrete data.
02:21 However, the data sample sets do not need to have the same shape or distribution.
02:26 So in that regard, it has an advantage over Mood's median, but it is sensitive to
02:31 outliers so in that characteristic, Mood's median has the advantage.
02:35 The hypothesis statements are the same as we had with Mood's median test.
02:39 The null hypothesis states that the medians are statistically equal and
02:42 the alternative hypothesis states that they are not equal to each other.
02:46 Doing the Kruskal-Wallis test in Minitab is very similar to that of
02:50 the Mood's Median Test.
02:51 Go to the Stat pulldown menu, select Nonparametrics, and
02:54 then select Kruskal-Wallis and this panel will pop up.
02:58 And just like with Mood's median, all the data must be in the same column and
03:02 the adjacent column should have the factor
03:05 that is being used to designate the sample subsets.
03:08 Each of these columns is entered into the appropriate field in Minitab.
03:12 And now for the Friedman test.
03:14 The Friedman test is a bit weird;
03:16 it works with blocks of paired non-normal data groups.
03:19 Think of it as a hybrid of ANOVA, paired T-Test and Kruskal-Wallis.
03:24 With all that it's doing, Friedman needs samples with many data points,
03:29 a minimum of 30 and more is better.
03:31 The Null Hypothesis and
03:33 Alternative Hypothesis are the same as with Mood's Median and Kruskal-Wallis.
03:37 The null is that the medians are equal, and
03:39 the alternative is that they are not statistically equal.
03:43 Minitab is a little bit different.
03:45 You still start the same way, go to the Stat pulldown menu, select Nonparametrics,
03:49 and then select Friedman, and this is the panel you get.
03:53 Once again, all the data is in the same column, but
03:56 now you have one column with the sample category data designator or
04:01 treatment, and one with the block data designator.
04:04 You can probably guess that this test is used primarily with
04:07 clinical tests of medical treatments and pharmaceuticals.
04:10 But you may have a need for it in a Lean Six Sigma project
04:14 if working across multiple locations or
04:16 product lines on your project. Just designate the appropriate columns.
04:21 So let's compare the results of these three tests.
04:23 In this case,
04:24 I'm using the time trials from the luge event in the 2014 Winter Olympics.
04:29 There are four trials and the data is not normally distributed.
04:33 Mood's Median shows the median of each of the four trials
04:37 along with the confidence interval on a small scale in the session window.
04:41 The P value was 0.001.
04:44 Reject the null hypothesis.
04:46 Each trial was definitely different.
04:47 Kruskal-Wallis shows the median values and average rank.
04:52 The ranking clearly shows that the times in samples one and
04:55 two are higher than with samples three and four.
04:59 Again, the P value is very low, in this case it's zero, so
05:02 reject the null hypothesis.
05:04 The Friedman Test also shows the median.
05:07 Interestingly, and I don't know why, it has an additional significant digit
05:11 showing. In this test result, the sum of ranks, which is a rank measure adjusted
05:16 for the Friedman block characteristic, still shows that time trials one and
05:20 two are much different than three and four.
05:23 And again our P value is 0.
05:25 So reject the null, but let me point out that
05:28 the Friedman Test did time versus trial, blocked by competitor.
05:32 In other words, the dominant characteristic is the trial and
05:35 competitor blocks were used to minimize the effect of good teams versus bad teams.
05:40 But now in this result, I switched the trial and
05:43 the competitor between the treatment and block parameters.
05:46 We have a very different result.
05:47 The data is the same, but in this case there will be a median for
05:52 each competitor.
05:53 I just left some off the sheet for readability, and there's a rank for
05:57 each competitor, notice that the P value is 0.572, so
06:01 we will fail to reject the null hypothesis in this case, but that's
06:05 because the hypothesis between these two runs of the Friedman Test is different.
06:10 In the first case the null hypothesis is that the median time in each trial is
06:15 the same.
06:16 In the second case the null hypothesis is that the median time for
06:20 each competitor was the same.
06:22 Well, there is another concern with this second test: there are only four data
06:26 points for each competitor, and we did say we needed at least 30 points.
06:30 So the test should be considered invalid until more trials are conducted, and
06:34 the times recorded.
06:36 Regardless of the nature of your non-normality,
06:39 there will be a hypothesis test that will work for you.
06:43 Look at your data, and then decide whether to use the Mood's Median, Kruskal-Wallis,
06:48 or the Friedman test.
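
The treatment and block swap described in the transcript can be illustrated with a short Python sketch, assuming NumPy and SciPy. The times are made-up values, not the 2014 luge results, so the p-values will not match the ones quoted in the video. The point is only that the same data matrix answers two different questions depending on which axis is treated as the treatment.

```python
import numpy as np
from scipy.stats import friedmanchisquare

# Hypothetical times: rows are competitors, columns are trials 1 to 4.
times = np.array([
    [52.1, 52.0, 51.6, 51.5],
    [52.4, 52.3, 51.9, 51.8],
    [52.3, 52.4, 51.8, 51.9],
    [52.8, 52.7, 52.2, 52.1],
    [53.0, 52.9, 52.5, 52.4],
    [52.6, 52.5, 52.0, 52.1],
])

# Run 1: trials are the treatments, competitors are the blocks.
# Null hypothesis: the median time in each trial is the same.
stat_trial, p_trial = friedmanchisquare(*times.T)

# Run 2: competitors are the treatments, trials are the blocks.
# Null hypothesis: the median time for each competitor is the same.
stat_comp, p_comp = friedmanchisquare(*times)

print(f"treatment = trial:      chi-square = {stat_trial:.3f}, p = {p_trial:.4f}")
print(f"treatment = competitor: chi-square = {stat_comp:.3f}, p = {p_comp:.4f}")
```

In the second run each competitor contributes only four values, which is the same small-sample concern raised in the transcript.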


PMI, PMP, CAPM and PMBOK are registered marks of the Project Management Institute, Inc.
