About this lesson
The One-Sample and Two-Sample Tests of Proportions are used with discrete data. These tests determine whether the percentage of a particular attribute being studied differs from a selected target value or from the percentage found in another sample. These tests are illustrated using both Excel and Minitab.
Exercise files
Download this lesson’s related exercise files.
Test of Proportions Exercise.xlsx (11.2 KB)
Test of Proportions Exercise Solution.docx (221.2 KB)
Quick reference
Test of Proportions
The Test of Proportions is for data sets with discrete data. The tests compare the percentage of a particular attribute found in the data against either a known target or the percentage of that attribute in another data set.
When to use
Use the Test of Proportions with discrete data, such as yes/no, true/false, or on/off. It is often used to determine whether two data sets differ, either to discover an underlying root cause or as a before-and-after test during the Improve phase.
Instructions
The Test of Proportions is a simple test to determine if the percentage of an attribute in a data set is statistically different from a target percentage or from another data set. It is used both when determining cause and effect relationships and for determining the benefit of a solution implementation during the Improve phase.
Normally the Null hypothesis is:
P = Target for the One-Sample Test of Proportions and P1 = P2 for the Two-Sample Test of Proportions, where “P” is the proportion of the attribute found in the sample.
Normally the Alternative hypothesis is:
P ≠ Target for the One-Sample Test of Proportions and P1 ≠ P2 for the Two-Sample Test of Proportions. In some cases, a greater than or less than operator is used in the Alternative hypothesis.
The One-Sample Test of Proportions tests the data set percentage against a known or target percentage. The Two-Sample Test of Proportions tests the percentages of each sample against each other.
Excel:
- Excel has no built-in function for the One-Sample Test of Proportions.
- Excel requires several steps to perform the Two-Sample Test of Proportions:
- Ensure your discrete data is converted to integers (on=1, off=0).
- Use the VAR function to find the variance of each data set.
- In the Data Analysis menu, use the “z-Test: Two Sample for Means” tool.
- Enter the data ranges and the variance values.
- Excel will calculate both a one-tail and a two-tail P Value. The one-tail value is for a hypothesis test of greater than or less than. The two-tail value is for a hypothesis test of equal to or not equal to.
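The arithmetic behind the Excel steps above can be sketched in Python. This is only an illustration under stated assumptions: the function name `z_test_two_sample_for_means` and the before/after data are hypothetical, and the one-tail value is taken as the area beyond |z|, so the numbers are not guaranteed to match Excel's output digit for digit.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def z_test_two_sample_for_means(sample1, sample2, hypothesized_diff=0.0):
    """Sketch of the Excel steps: variance per sample, then a z-test on the means."""
    n1, n2 = len(sample1), len(sample2)
    # The mean of 0/1 data is the proportion of 1s in the sample.
    p1, p2 = mean(sample1), mean(sample2)
    # Sample variance (n - 1 divisor), the same quantity the VAR function returns.
    var1, var2 = variance(sample1), variance(sample2)
    z = (p1 - p2 - hypothesized_diff) / sqrt(var1 / n1 + var2 / n2)
    one_tail = 1.0 - NormalDist().cdf(abs(z))  # area beyond |z| in one tail
    return {
        "p1": p1,
        "p2": p2,
        "z": z,
        "p_one_tail": one_tail,
        "p_two_tail": 2.0 * one_tail,
    }


# Hypothetical before/after data, already converted to 1 (on time) and 0 (late).
before = [1] * 70 + [0] * 30
after = [1] * 85 + [0] * 15

result = z_test_two_sample_for_means(before, after)
print(result)
```

Use the one-tail value when the Alternative hypothesis is greater than or less than, and the two-tail value when it is not equal to.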
Minitab:
- Minitab calculates the One-Sample Test of Proportions
- Stat > Basic Statistics > 1 Proportion
- Enter the column with your data and enter your target percentage.
- Click on the Hypothesis test box
- Select the Options button to change the Alternative Hypothesis to a greater than or less than condition.
- Minitab calculates the Two-Sample Test of Proportions
- Stat > Basic Statistics > 2 Proportions
- Select the format of your data (all data in one column or in two columns)
- Select your data columns
- Select the Options button to change the Alternative Hypothesis to a greater than or less than condition.
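For readers without Minitab, both tests can be approximated with a normal (Z) approximation in Python. This is a minimal sketch, not a reproduction of Minitab's calculation: the function names, the sample sizes, and the counts below are hypothetical, and Minitab's dialogs offer their own method options, so its P Values may differ somewhat.

```python
from math import sqrt
from statistics import NormalDist


def p_value_from_z(z, alternative="not_equal"):
    """Convert a z statistic to a P value for the chosen Alternative hypothesis."""
    cdf = NormalDist().cdf(z)
    if alternative == "greater":
        return 1.0 - cdf
    if alternative == "less":
        return cdf
    if alternative == "not_equal":
        return 2.0 * min(cdf, 1.0 - cdf)
    raise ValueError("alternative must be 'not_equal', 'greater', or 'less'")


def one_proportion_z(successes, n, target, alternative="not_equal"):
    """One-sample test: sample proportion against a target proportion."""
    p_hat = successes / n
    z = (p_hat - target) / sqrt(target * (1.0 - target) / n)
    return z, p_value_from_z(z, alternative)


def two_proportion_z(x1, n1, x2, n2, alternative="not_equal"):
    """Two-sample test using a pooled estimate of the common proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    z = (p1 - p2) / sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    return z, p_value_from_z(z, alternative)


# Hypothetical: 228 of 400 applicants accepted (57%) against a 52% target.
z_one, p_one = one_proportion_z(228, 400, 0.52, alternative="greater")

# Hypothetical: 450 of 500 on-time returns in one year, 470 of 500 in the next.
z_two, p_two = two_proportion_z(450, 500, 470, 500)

print(f"one-sample: z={z_one:.3f}, P={p_one:.4f}")
print(f"two-sample: z={z_two:.3f}, P={p_two:.4f}")
```

Working from counts rather than raw rows keeps the sketch short; with raw 1/0 data, the count is the sum of the column and n is its length.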
Hints & tips
- Minitab and Excel calculate slightly different P Values – but the difference is very small.
- The data values must be in integer format for Excel (change True/False to 1/0), but the data can still be text data in Minitab.
- The difference between one-sided tail and two-sided tail is based on the Bell-shaped Curve. When the Hypothesis test is “equal to” or “not equal to” the test must consider both the upper portion of the curve and the lower portion of the curve. When the Hypothesis test is “greater than” or “less than,” only one side of the Bell-shaped curve must be checked.
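The one-sided versus two-sided distinction can be illustrated with critical values from the standard normal curve. This is a small sketch assuming an alpha of 0.05 and a hypothetical z statistic of 1.8; neither value comes from the lesson.

```python
from statistics import NormalDist

ALPHA = 0.05  # assumed significance level
std_normal = NormalDist()

# One-sided test: all of alpha sits in a single tail of the curve.
one_sided_critical = std_normal.inv_cdf(1.0 - ALPHA)
# Two-sided test: alpha is split evenly between the upper and lower tails.
two_sided_critical = std_normal.inv_cdf(1.0 - ALPHA / 2.0)

z = 1.8  # hypothetical test statistic
reject_one_sided = z > one_sided_critical        # "greater than" alternative
reject_two_sided = abs(z) > two_sided_critical   # "not equal to" alternative

print(f"one-sided critical z: {one_sided_critical:.3f}, reject: {reject_one_sided}")
print(f"two-sided critical z: {two_sided_critical:.3f}, reject: {reject_two_sided}")
```

Because the two-sided test splits alpha across both tails, its critical value is larger, so the same z statistic can be significant in a one-sided test but not in a two-sided one.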
- 00:04 Hi, I'm Ray Sheen.
- 00:05 We've discussed hypothesis testing when both the independent and
- 00:09 dependent variables are continuous.
- 00:11 Now, it's time to look at the case when both are discrete.
- 00:15 We'll start with the test of proportions.
- 00:19 >> So let's look at our hypothesis testing decision tree.
- 00:22 We are considering the case when the data is discrete, both x and y values.
- 00:27 That means that we will be working with counts and instances, not measurements.
- 00:32 And in this lesson, we'll discuss the one-sample test of proportions and
- 00:36 the two-sample test of proportions.
- 00:39 Before we go into how to run these tests,
- 00:42 let's first explain what a test of proportions does.
- 00:45 Test of proportions is used with discrete or attribute data.
- 00:49 So the data will be counts of occurrences or true/false and on/off types of data.
- 00:56 The test of proportions will compare the percentage of items in a sample
- 01:00 that contains a particular attribute to another percentage,
- 01:05 either from another sample or from a baseline value.
- 01:08 The question is to determine if the proportions or percentages are the same.
- 01:13 Or more precisely, if there are differences in the proportion,
- 01:17 whether those differences are statistically significant.
- 01:20 This test is not testing means or standard deviation,
- 01:24 it's testing the percentage of counts in the category of interest.
- 01:29 The one sample test will compare the proportions to a target value that has
- 01:34 been set based upon historic measurements or
- 01:37 the known value of another large population.
- 01:40 The null hypothesis will be that the sample proportion equals the target value,
- 01:45 there is nothing unusual in the sample.
- 01:48 The alternative hypothesis will normally be that the sample
- 01:52 proportion value is larger or smaller than the target value.
- 01:57 The two sample test of proportion is similar in the comparison,
- 02:01 except that the target value is replaced with the proportion or
- 02:05 percentage in the second sample.
- 02:07 This test is normally used to determine if two samples are different or
- 02:11 if they're from the same population.
- 02:13 It's very common to use the tests with complaint or
- 02:17 defect data in a before and after comparison.
- 02:20 The two samples being before an improvement is introduced and
- 02:24 after it has been introduced.
- 02:26 The null hypothesis is always that the proportions of the two samples are equal,
- 02:30 which means that subtracting one from the other results in zero.
- 02:34 The alternative hypothesis is that the proportions are not equal.
- 02:38 This set of hypotheses may be used in the analysis phase to discover differences
- 02:43 that will lead to an understanding of the problem.
- 02:45 If doing the before and after type of comparison,
- 02:48 this is normally done in the improve phase.
- 02:50 And we often want to show that the improved proportion is higher or
- 02:54 lower than the original proportion.
- 02:57 In that case, the null hypothesis is that the new proportion is equal to or
- 03:02 less than (or equal to or greater than) the baseline proportion.
- 03:06 And the alternative hypothesis is that the new proportion is higher or
- 03:10 lower than the baseline proportion.
- 03:12 And of course, the test determines whether these differences are statistically significant.
- 03:18 Let's consider how we do a one-sample test of proportions.
- 03:22 For our example, we will consider the percentage of applicants to college for
- 03:26 the current college year that are accepted.
- 03:28 Historically, the college has been accepting 52% of applicants,
- 03:33 this year, it was 57%, is the change significant?
- 03:36 The null hypothesis would be that this year's percentage of applicants that
- 03:40 were accepted is equal to the historical percentage.
- 03:43 The alternative would be that this year's percentage is higher than
- 03:48 the historical percentage.
- 03:50 Excel does not perform this test.
- 03:52 In Minitab, select the Stat pulldown menu,
- 03:55 then select the Basic Statistics, and next, select 1 Proportion.
- 03:59 That will bring up this panel.
- 04:02 Now, select the column where the data is located.
- 04:05 We need to have the raw data so
- 04:06 that Minitab can also have a count of the sample size.
- 04:10 Recall that the sample size will impact the confidence interval.
- 04:14 Enter your target statistic in the window labeled Hypothesized proportion.
- 04:18 If you select the Options button, you can determine whether you want the test to be
- 04:23 for a relationship of not equal, greater than, or less than.
- 04:27 Then click OK and the results will be found in the session window of Minitab,
- 04:32 with a P Value that you can use to decide whether to reject or
- 04:36 fail to reject the null hypothesis.
- 04:38 Now, let's look at the two sample test of proportions.
- 04:42 We're comparing the proportions of an attribute between two samples of data.
- 04:46 In this illustration, the comparison is the rate of on-time
- 04:51 submittals of tax returns between 2016 and 2017.
- 04:55 Excel does not provide this as a standalone function, but
- 04:59 we can still do this analysis in Excel, it just will take a few steps.
- 05:03 So there are several functions in Excel that when we string them together,
- 05:08 can do this analysis.
- 05:10 One caution, make sure your data is in integer format because Excel will
- 05:14 be doing calculations with it.
- 05:16 Start with the standard VAR or variance function for each data set.
- 05:20 Be sure to record those values for future use.
- 05:24 Next, go to the data analysis pull down menu and
- 05:27 select the Z-test two samples for means function.
- 05:31 Enter the data range and the previously calculated variance for
- 05:34 each sample in the form that's displayed.
- 05:37 Also, enter any difference if there is one that you're expecting.
- 05:42 I normally set this to zero.
- 05:44 Excel will calculate a P value for both the one-tail and
- 05:48 the two-tail conditions.
- 05:50 Minitab is simpler, select Stat, then Basic Statistics, and then 2 Proportions.
- 05:56 Select the format of your data, either all data is in one column with an adjacent
- 06:01 column that specifies which sample is associated with the data value, or
- 06:06 the data is in different columns with the sample as the column title.
- 06:11 Then select the appropriate data columns.
- 06:13 And finally, go to the Options button to change the confidence level or
- 06:17 to specify a relationship.
- 06:19 The default is to check if they are equal,
- 06:21 but you can check if one is greater than or less than the other.
- 06:24 The result is found in the session window with an associated P value.
- 06:29 Excel and Minitab will give slightly different results for P value.
- 06:34 But I find that the differences are out at the third or
- 06:37 fourth significant digit and are unlikely to impact the decision of whether or
- 06:42 not to reject the null hypothesis.
- 06:44 The last thing to discuss is the formulas involved.
- 06:47 This is for your reference if you're taking the IASSC exam.
- 06:51 Since Excel does not provide the test of proportions function,
- 06:55 you can do the analysis manually.
- 06:57 To do this, you will need the percentage in each sample and
- 07:01 the number of items in the sample.
- 07:03 These are the formulas for the one sample and two sample tests of proportions.
- 07:07 The p terms are the percentages,
- 07:09 and the n terms are the number of points in the sample.
- 07:12 The formula yields a Z value, and that can be compared with
- 07:17 the critical value for the value of alpha selected.
- 07:22 >> The one sample and two sample test of proportions are quick and
- 07:26 easy tests that help us to understand the characteristics of the data sets that
- 07:30 are made up of discrete data.
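The formulas shown on the slide are not reproduced in the transcript. For reference, the commonly used textbook forms are sketched below; the slide may use a variant (for example, an unpooled standard error in the two-sample case). The symbols are assumptions chosen to match the description above: the p terms are proportions, the n terms are sample sizes, and p_0 is the target proportion.

```latex
% One-sample test: sample proportion \hat{p} from n items, target proportion p_0
z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0\,(1 - p_0)}{n}}}

% Two-sample test with a pooled proportion \hat{p}
\hat{p} = \frac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2},
\qquad
z = \frac{\hat{p}_1 - \hat{p}_2}
         {\sqrt{\hat{p}\,(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}
```

In both cases the resulting Z value is compared with the critical value for the selected alpha, or converted to a P Value.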