About this lesson
The Z score locates a data point within its distribution by expressing it in standard deviations from the mean. It is often used when comparing data sets and in some hypothesis test calculations.
Quick reference
Z Transformation
A data point within a distribution can be transformed from its physical units to a Z score. The Z score expresses the data point as a number of standard deviations above or below the mean.
When to use
The Z score is used when considering confidence intervals, confidence levels, sample size, and alpha risk on a project. In these cases, it is normally used to determine a percentage of the distribution that is within a range around the mean. It can also be useful for comparing points found in two different distributions.
Instructions
The “Z value” or “Z score” is the transformation of a data point from real-world units into units of standard deviations. The score is the number of standard deviations above or below the mean. If the data point is the mean value, the Z score is 0. If the data point is one and one-half standard deviations above the mean, the Z score is 1.5. If the data point is two-thirds of a standard deviation below the mean, the Z score is -0.667.
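The formula is: Z = (x − x̄) / σ, where x is the data point, x̄ is the mean of the distribution, and σ is its standard deviation.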
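The formula can be sketched in a few lines of Python. The function name `z_score` and the process values (mean 50.0, standard deviation 4.0) are hypothetical and chosen only for illustration.

```python
def z_score(x, mu, sigma):
    """Return how many standard deviations x lies above (+) or below (-) mu."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# Hypothetical process values, for illustration only
mu, sigma = 50.0, 4.0

z_at_mean = z_score(50.0, mu, sigma)  # data point equals the mean
z_above = z_score(56.0, mu, sigma)    # 6 units above the mean
z_below = z_score(47.0, mu, sigma)    # 3 units below the mean

print(z_at_mean, z_above, z_below)    # 0.0 1.5 -0.75
```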
The Z score is often used to determine the percentage of the distribution that is above or below the real-world value represented by a particular Z score. Some programs will calculate this percentage. However, on the IASSC exam, you can anticipate that you may need to determine it through a lookup table. Typically a table is provided for one side of the distribution curve, and it transforms the Z score into a percentage of the distribution.
When working with a positive Z score (right side of the bell-shaped curve), the table provides the percentage of the total distribution that lies between the mean and that value. Add 50% to represent the left side of the curve and you have the percentage of the total distribution that is below that value.
When working with a negative Z score (left side of the bell-shaped curve), use the absolute value of the Z score to enter the table and find the percentage of the total distribution that lies between that value and the mean. Add 50% to represent the right side of the distribution and you have the total percentage of the distribution that is above the value represented by the Z score.
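As a rough software equivalent of the table lookup described above, the sketch below uses `NormalDist` from Python's standard `statistics` module. The helper names `table_value` and `fraction_below` are hypothetical, and the one-sided table is assumed to list the area between the mean and |Z|.

```python
from statistics import NormalDist

def table_value(z):
    """One-sided table entry: fraction of the distribution between the mean and |z|."""
    return NormalDist().cdf(abs(z)) - 0.5

def fraction_below(z):
    """Fraction of a normal distribution below z, built from the add-50% rule."""
    if z >= 0:
        # Positive Z: table entry + 50% is the fraction BELOW the value
        return 0.5 + table_value(z)
    # Negative Z: table entry + 50% is the fraction ABOVE the value
    return 1.0 - (0.5 + table_value(z))

below_pos = fraction_below(1.5)       # about 0.9332 of the distribution is below Z = 1.5
above_neg = 0.5 + table_value(-1.5)   # about 0.9332 of the distribution is above Z = -1.5

print(round(below_pos, 4), round(above_neg, 4))
```

This only applies when the underlying data are normally distributed.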
Z scores can be calculated with a non-normal distribution. Use the same formula to determine the score. However, you cannot use the table to find percentages since the distribution is not normal and all percentages are based upon a normal curve. An interesting note is that if a distribution is skewed, a plot of the Z scores for that distribution will also be skewed.
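A small sketch of that last point, using a made-up right-skewed data set: converting to Z scores shifts the mean to 0 and the standard deviation to 1, but the skewness is unchanged. The data values and the `skewness` helper are assumptions for illustration.

```python
from statistics import mean, median, pstdev

def skewness(values):
    """Population skewness: average cubed deviation in standard-deviation units."""
    mu, sigma = mean(values), pstdev(values)
    return sum(((v - mu) / sigma) ** 3 for v in values) / len(values)

# Made-up right-skewed data, for illustration only
data = [1, 2, 2, 3, 3, 3, 4, 5, 9, 15]
mu, sigma = mean(data), pstdev(data)

# Same formula as for normal data: Z = (x - mean) / standard deviation
z_scores = [(v - mu) / sigma for v in data]

skew_raw = skewness(data)
skew_z = skewness(z_scores)

# For skewed data the median does not sit at Z = 0
z_of_median = (median(data) - mu) / sigma

print(skew_raw, skew_z, z_of_median)
```

Here the two skewness values agree up to floating-point rounding, and the median falls at a negative Z score because the long right tail pulls the mean upward.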
Finally, Z scores can also be used to compare distributions. Since the Z score normalizes all data values into a common set of units (standard deviations), the transformed data values can be compared across distributions for similarity or difference in terms of the impact of the score in the distribution.
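The comparison can be illustrated with two hypothetical weight distributions. The means and standard deviations below are invented for the example and are not real population data.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from mu."""
    return (x - mu) / sigma

def value_at(z, mu, sigma):
    """Inverse transformation: the physical value located at a given Z score."""
    return mu + z * sigma

# Hypothetical (mean, standard deviation) in pounds; not real data
group_a = (130.0, 20.0)
group_b = (180.0, 30.0)

# The same physical value has a different Z score in each distribution
z_a = z_score(150.0, *group_a)   # 1.0  (above the mean of group A)
z_b = z_score(150.0, *group_b)   # -1.0 (below the mean of group B)

# The same Z score corresponds to different physical values
x_a = value_at(2.0, *group_a)    # 170.0
x_b = value_at(2.0, *group_b)    # 240.0

print(z_a, z_b, x_a, x_b)
```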
Hints & tips
- The Z score is used in many hypothesis testing formulas to determine the center portion of the bell-shaped curve. The Z score calculation and Z score table can identify what distribution value to use as the boundaries for 90%, 95%, or 99% of the distribution.
- The same real-world physical value that is found within two different distributions could have different Z scores.
- Two very different real-world physical values may have identical Z scores if the values are from different distributions.
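For the first hint, the Z boundaries that enclose the central 90%, 95%, or 99% of a normal distribution can be found with the inverse of the table lookup. This is a minimal sketch using `NormalDist.inv_cdf` from Python's standard library; the helper name `central_boundary` is hypothetical.

```python
from statistics import NormalDist

def central_boundary(coverage):
    """Z value such that the range -Z..+Z holds `coverage` of a normal distribution."""
    # Half of the excluded area sits in each tail, so look up 0.5 + coverage / 2
    return NormalDist().inv_cdf(0.5 + coverage / 2)

bounds = {c: central_boundary(c) for c in (0.90, 0.95, 0.99)}

for coverage, z in bounds.items():
    print(f"{coverage:.0%} of the distribution lies within +/-{z:.3f} standard deviations")
```

To convert a boundary back to physical units, multiply it by the standard deviation and add the mean.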
- 00:04 Hi, I'm Ray Sheen.
- 00:05 I want to introduce another concept we'll be using and
- 00:08 that's the Z value and the Z transformation.
- 00:11 So, what is the Z value?
- 00:14 The Z value is a number associated with a particular data point.
- 00:18 But rather than giving you the absolute value of that data point and
- 00:22 the units of the physical system,
- 00:24 the Z value is the place in the process distribution where that data point occurs.
- 00:29 The Z value is measured in standard deviations.
- 00:33 It is the number of standard deviations above or
- 00:35 below the mean value where that point occurs.
- 00:38 In this illustration, the Z value is 1.5 because it is the point that it's
- 00:43 occurring halfway between one standard deviation and
- 00:46 two standard deviations above the mean.
- 00:50 Calculating the Z value is actually quite easy.
- 00:53 Start with the actual value in physical units,
- 00:56 then subtract the value of the mean from that.
- 00:59 Divide the difference by the value of the standard deviation.
- 01:03 So, if the value is greater than the mean, it will be a positive Z value and
- 01:06 be on the right side of the distribution.
- 01:09 If the value is less than the mean, it will be a negative number and
- 01:13 be on the left side of the distribution.
- 01:15 The Z transformation uses the Z value to determine what percentage of
- 01:20 the data set is above that particular point represented by the Z value and
- 01:25 what percentage is below that point.
- 01:28 When doing that transformation by hand,
- 01:30 I recommend using the Z table shown on this slide.
- 01:34 To use this table, take the absolute value of the Z value that you calculated.
- 01:38 And then using the portion of that Z value that includes the first significant
- 01:43 digit to the right of the decimal point, find the correct row.
- 01:48 Then using the second significant digit to the right of the decimal point,
- 01:52 find the column within that row.
- 01:54 If the Z value is positive, add 50% to that number in the table.
- 01:59 This is the percentage of the data points that are below the Z value you
- 02:03 started with.
- 02:05 And when the Z value is negative, add 50% to the number in the table, and
- 02:09 that is the percentage of the data points that are above that negative Z value.
- 02:14 Of course, if you have access to statistical software or
- 02:17 even an Excel spreadsheet, the computer can calculate this for you.
- 02:20 I've illustrated Z values with a normal curve,
- 02:24 but you can calculate it with non-normal data also.
- 02:28 However, be careful when using the transformation table.
- 02:31 The table is only appropriate for a normal curve, since with a normal curve,
- 02:35 there are the same number of points above and below the mean value.
- 02:39 While on a non-normal curve, that is not the case.
- 02:42 You calculate the Z value for non-normal data the exact same way using the same
- 02:47 mean value and same standard deviation.
- 02:49 What you cannot easily do is convert that to percentages.
- 02:53 If you think about it,
- 02:54 the mean of a skewed distribution is not near the center of that curve.
- 02:59 The mean will be biased towards the high side of the curve.
- 03:02 So a Z score of 0 for skewed data will not be at the midpoint,
- 03:05 that will be represented by the median value.
- 03:09 One other interesting point, a distribution of Z scores for
- 03:12 skewed data will also be skewed.
- 03:14 The only thing that changed is the unit of measure,
- 03:17 the shape of the curve is still the same.
- 03:20 Finally, there are some interesting things we can do with Z scores.
- 03:24 The obvious is to determine the percentage of scores above or below a certain point.
- 03:29 But there are some other things we can do when comparing two datasets.
- 03:34 The Z value allows us to compare two otherwise dissimilar datasets,
- 03:38 like the old proverb of comparing apples and oranges.
- 03:42 For one thing, the Z value is always dependent upon the distribution or
- 03:46 dataset from which the instance was drawn.
- 03:49 Remember to get the Z value, we must have the dataset mean and
- 03:53 the dataset standard deviation.
- 03:55 For this reason, two instances with an identical physical data value could have
- 04:00 a very different Z score if they were drawn from two different distributions.
- 04:06 For instance, a weight of 150 pounds would have a different Z score
- 04:10 if the dataset was 15 year old boys or if the dataset was 25 year old men.
- 04:15 And by the same token, two different physical values may have the same Z score
- 04:20 if they were pulled from two different datasets.
- 04:23 A Z score of 2.0 may be 150 pounds for a junior high football team, but
- 04:28 would be closer to 250 pounds for a college or professional football team.
- 04:34 Since Z scores normalize the distribution and put everything into the same units,
- 04:39 which is standard deviations above or below the mean, it can provide a way to
- 04:43 do comparison between two otherwise very different datasets.
- 04:47 Z score and Z transformation provide a means for
- 04:51 us to normalize our data with a statistical set of units.