About this lesson
Various graphical analysis techniques can help illustrate comparisons, relationships, distributions, or compositions. Selecting the correct graphical technique can illustrate what is significant, and the wrong technique can lead to confusion.
Quick reference
Advanced Visual Analysis
Various graphical analysis techniques are particularly good for illustrating comparison, relationships, distribution, or composition. Selecting the correct graphical technique will illustrate what is significant, and the wrong technique can lead to confusion.
When to use
Graphical techniques are used during the Analyze phase. They are often used to uncover potential problems that are then confirmed with statistical analysis.
Instructions
Different types of graphs can be used for visual analysis depending on the question being asked or the hypothesis being tested. Using a graphical analysis technique provides a visualization of the data that can quickly lead to conclusions about the problem.
When comparing datasets to find the significance of a parameter, use the vertical bar chart, horizontal bar chart, pie chart, line graph, or area chart. The Pareto chart, which is a special case of the vertical bar chart, is particularly useful. The pie chart is normally used in families of pie charts, so the charts will show either that an item is significant in some instances (pies) and not in others, or that it is significant in many different instances (pies). Either result may be very important information for proving or disproving a hypothesis.
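As a rough illustration of what a Pareto chart plots, the sketch below sorts categories from largest to smallest and adds a running cumulative percentage. The function name `pareto_table` and the defect counts are hypothetical, chosen only for this example; the actual bars and line would then be drawn in Excel, Minitab, or another charting tool.

```python
from itertools import accumulate


def pareto_table(counts):
    """Return (category, count, cumulative_percent) rows, largest count first."""
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    total = sum(count for _, count in ordered)
    running = accumulate(count for _, count in ordered)
    return [
        (name, count, round(100 * cum / total, 1))
        for (name, count), cum in zip(ordered, running)
    ]


# Hypothetical defect counts by cause (illustrative data only).
defects = {
    "scratches": 12,
    "misalignment": 45,
    "wrong label": 8,
    "cracks": 30,
    "other": 5,
}

for name, count, cum_pct in pareto_table(defects):
    print(f"{name:<14}{count:>4}{cum_pct:>8}%")
```

The categories at the top of the table, where the cumulative percentage climbs fastest, are the ones a team would normally investigate first.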
Another set of graphical techniques exposes relationships. The scatter diagram shows the relationship between two parameters of a data point, and the bubble chart shows the relationship between three parameters of a data point. The flow chart shows the relationship between activities rather than between data points and is the most common graphical technique used with Lean analysis.
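A scatter diagram is often followed by a statistical check of the relationship it suggests. The sketch below computes the Pearson correlation coefficient, which is one common way (an assumption here, not something the lesson prescribes) to put a number on a linear relationship. The function name `pearson_r` and the sample values are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one factor is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical paired measurements (illustrative data only).
oven_temp = [180, 190, 200, 210, 220]
cure_minutes = [42, 39, 35, 33, 30]

print(f"r = {pearson_r(oven_temp, cure_minutes):.3f}")
```

A value near +1 or -1 suggests a strong positive or negative linear relationship and a value near 0 suggests none. The coefficient says nothing about cause and effect, so an apparent relationship may still come from a factor that is not being graphed.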
The graphical techniques that illustrate distribution will provide insight into the underlying characteristics of the dataset. The probability density function and box plot show both the central tendency and the extremes within a dataset. The heat map shows the distribution of a factor across a range of conditions and the shaded map shows the distribution of a parameter across a geographical region. In both of those maps, hotspots are of great interest.
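To make the box plot concrete, the sketch below computes the values a box plot typically draws: the minimum, quartiles, median, and maximum, plus any points outside the 1.5 × IQR fences. The fence rule is a common convention rather than something stated in this lesson, and the function name `box_plot_summary` and the sample data are hypothetical.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Five-number summary plus points outside the 1.5 * IQR fences."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    # method="inclusive" treats the data as the whole population of interest.
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
        "outliers": [v for v in values if v < low_fence or v > high_fence],
    }


# Hypothetical cycle times in minutes (illustrative data only).
cycle_times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 40]

for key, value in box_plot_summary(cycle_times).items():
    print(f"{key:<9}{value}")
```

The box and median describe the central tendency, while the minimum, maximum, and flagged outliers describe the extremes.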
There are also several charts that indicate composition. The most common in data analysis is the stacked bar chart that shows the magnitude of an effect as it accumulates within a category. The waterfall chart can be used to show accumulation when some of the items are positive and some are negative. The Gantt chart shows the accumulation of effort over time and is the most common chart for tracking the effort on a Lean Six Sigma project.
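The waterfall chart's handling of negative components can be sketched as a small calculation: each bar starts at the level where the previous bar ended, so a negative item pushes the next bar lower. The function name `waterfall_bars` and the cost figures are hypothetical and serve only as an illustration.

```python
def waterfall_bars(steps, start=0):
    """Return (label, start_level, end_level) for each step of a waterfall chart.

    Each bar begins where the previous bar ended, so negative
    contributions lower the starting point of the next bar.
    """
    bars = []
    level = start
    for label, delta in steps:
        bars.append((label, level, level + delta))
        level += delta
    return bars


# Hypothetical contributions to a monthly cost variance (illustrative data only).
steps = [
    ("material", 50),
    ("labor", 30),
    ("rebate", -20),
    ("scrap", 15),
]

for label, bar_start, bar_end in waterfall_bars(steps):
    print(f"{label:<10}{bar_start:>5} -> {bar_end:>5}")
```

The end level of the final bar is the net accumulated result, which is what the last column of a waterfall chart usually shows.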
Hints & tips
- Once the data is captured in a database or table, it is easy to create charts and graphs in both Excel and Minitab. It only takes a few mouse clicks, so visualize the data first before jumping into statistical analysis.
- Graphical or visual analysis is particularly well suited to single-factor problems and special cause problems. That is why some of the basic problem-solving methodologies rely solely on visual analysis.
- First, determine the question you are trying to answer, then pick a chart from the appropriate category. For instance, if you want to judge significance, you need a comparison chart. If you want to understand relationships, you would use different charts than if you wanted to understand the elements that compose a category.
- When you first introduce a new category of charts to an organization, you may need to spend a few minutes explaining what is being displayed and how to use the information. If an organization has only used bar charts, a box plot can be very confusing.
- 00:04 Hello, I'm Ray Sheen.
- 00:06 There are many different visual analysis techniques.
- 00:08 Let's look at the main categories and some examples for each.
- 00:12 When doing a visual or graphical analysis,
- 00:14 you first need to decide what question you're trying to answer.
- 00:18 Are you doing a comparison between factors to find out which is important?
- 00:23 Are you looking for a relationship between factors?
- 00:26 Are you investigating the composition of a process result?
- 00:30 Are you looking at a distribution within the result?
- 00:34 In each case, there are multiple visualizations to select from.
- 00:38 Let's consider each category.
- 00:40 We'll start with the comparison.
- 00:42 Normally, what you're looking to find out is, which factor is most important?
- 00:47 These graphs usually show all the factors measured in common units and
- 00:52 plotted to show what is the biggest and therefore the most significant.
- 00:56 This helps the team to focus in depth on the major factors so
- 01:00 as to find a solution for that factor.
- 01:03 Typical charts for this type of analysis are the vertical and
- 01:07 horizontal bar chart, the pie chart, and the Pareto diagram,
- 01:11 which is a special case of the histogram or vertical bar chart.
- 01:14 These charts can also be used to show when something is outside the normal condition.
- 01:19 By showing all the instances of the factor, the unusual one jumps out.
- 01:23 Outliers can be very important instances to understand special cause problems.
- 01:28 If you can find the special cause and find a way to correct or
- 01:32 prevent it from occurring, you can reanalyze the data without that outlier to
- 01:37 determine if the dataset is now acceptable to the customer.
- 01:40 Run charts are a great tool for finding when something would happen, and
- 01:44 the scatter diagram and box plots can also be used for this purpose.
- 01:48 Although, frankly, we primarily use them for another purpose.
- 01:52 Next, let's look at the category of correlation.
- 01:56 In this category, the charts will show us if two or
- 01:59 more factors are related to each other.
- 02:01 The relationship could be a positive one, meaning as one set gets larger,
- 02:05 the other set gets larger, as we see here on the left, or a negative one,
- 02:10 which would be if one set gets larger and the other set gets smaller.
- 02:14 In fact, it can also show that there's no correlation as we see on the slide on
- 02:18 the right.
- 02:18 These relationships are often due to cause and effect between the two factors.
- 02:23 However, a caution here,
- 02:24 sometimes the factors have an apparent relationship to each other, but
- 02:29 actually, both are related to a different factor that is not being graphed.
- 02:33 The most common chart in this category is the scatter diagram, as we're showing.
- 02:38 A variation of the scatter diagram is the bubble chart, which adds a third
- 02:43 dimension to the scatter diagram by changing the size of the dot to a bubble.
- 02:48 And letting the bubble diameter represent another factor,
- 02:52 in particular, it's usually the Y factor in the Y = f(x) equation.
- 02:57 The third chart in this category is the process flow chart,
- 03:01 that shows the relationship between steps in a process, the sequence.
- 03:06 The next category of charts is showing distributions.
- 03:10 In this case, we're usually trying to find the unexpected or
- 03:13 different points within the distribution.
- 03:15 One of the most common uses is to determine if two or
- 03:19 more of the subsets of data are different from each other or
- 03:23 whether they are essentially the same thing.
- 03:25 This can help us to identify whether a factor is truly significant.
- 03:29 The most common charts for this purpose are the box plots and
- 03:33 the probability density function.
- 03:35 Both of these create a picture of the data distribution.
- 03:39 The other is to find a different point buried somewhere within a dataset.
- 03:44 This might be a sharp inflection point or an isolated point whose difference
- 03:49 becomes lost in the statistical analysis of the data.
- 03:52 But when visualized, it is now very obvious.
- 03:55 The heat map is the most common technique for this,
- 03:59 although a line graph can show us when the timing might occur.
- 04:03 The final category is composition.
- 04:05 In this case, we're looking to see,
- 04:08 what are the component parts that make up the end result of the product or process?
- 04:13 The most common visual analysis is the stacked component chart.
- 04:17 In this case, the columns of data are different discrete categories,
- 04:21 such as months or locations or problem instances.
- 04:24 And the components of the column are the various elements.
- 04:28 Use the same color in each column for a particular type of component.
- 04:33 It makes it easier to see how that component varies based upon
- 04:37 the overall size of the column.
- 04:39 Now, sometimes though, some of the components will have a negative value.
- 04:44 In that case,
- 04:45 you can't use the stacked bar chart because there's no way to show a negative number.
- 04:50 So instead, we use the accumulated bar chart,
- 04:53 which essentially has just one bar per chart.
- 04:57 But rather than stacking them on top of each other, the various components
- 05:02 of the bar are separated horizontally across the chart.
- 05:06 Each bar's base point starts where the previous bar's top ended.
- 05:10 In this way, if the bar is a negative bar,
- 05:13 the next one would actually be starting lower than the preceding bar.
- 05:18 Finally, if trying to show the components of effort, use a Gantt chart and
- 05:22 show each of the effort components as a bar on the Gantt chart,
- 05:26 with the horizontal axis showing when those efforts started and stopped.
- 05:31 The right visualization can reveal much about a data set.
- 05:35 Determine the question you want to have answered,
- 05:40 then pick the correct chart.
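The closing advice, decide on the question and then pick the chart, can be sketched as a simple lookup. The groupings below follow the four categories and chart names given in the quick reference; the names `CHARTS_BY_QUESTION` and `suggest_charts` are hypothetical and exist only for this illustration.

```python
# Chart families grouped by the question being asked, as listed in the quick reference.
CHARTS_BY_QUESTION = {
    "comparison": [
        "vertical bar chart",
        "horizontal bar chart",
        "pie chart",
        "line graph",
        "area chart",
        "Pareto chart",
    ],
    "relationship": ["scatter diagram", "bubble chart", "flow chart"],
    "distribution": [
        "probability density function",
        "box plot",
        "heat map",
        "shaded map",
    ],
    "composition": ["stacked bar chart", "waterfall chart", "Gantt chart"],
}


def suggest_charts(question_type):
    """Return candidate charts for a question category (case-insensitive)."""
    key = question_type.strip().lower()
    if key not in CHARTS_BY_QUESTION:
        known = ", ".join(sorted(CHARTS_BY_QUESTION))
        raise ValueError(f"unknown question type {question_type!r}; expected one of: {known}")
    return CHARTS_BY_QUESTION[key]


print(suggest_charts("Composition"))
```

The lookup only narrows the options. Choosing among the candidates still depends on the data and on what the audience is used to reading.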