About this lesson
When the process or problem data set has multiple characteristics, there is a set of graphing techniques that can show these effects. Although more complex than the basic techniques, they are easy to use and create a picture of the data set.
Quick reference
Graphing of Complex Data
When to use
Graphical analysis is an excellent way to visualize patterns and key insights from data. Graphical analysis is also an excellent way to communicate data to the Lean Six Sigma team and stakeholders. These techniques should be used whenever discussing data with team members or stakeholders.
Instructions
Graphical analysis creates a picture of the data, which helps to put the data into context. It can display a great deal of data on one graph, and the patterns in the data can reveal problems and correlations between process parameters and factors. Three graphs are often used with multi-variate data: the horizontal bar chart, the pie chart, and the box plot. A data table can present the same data in tabular form.
Horizontal Bar Chart
The horizontal bar chart is an excellent technique for attribute data. It is similar to the vertical bar chart, but with the bars running horizontally. The categories of the data are shown on the vertical axis and are represented by rows on the chart. The horizontal axis is the count of instances for each category, not a time scale. The rows are normally sorted so that the longest bar is at the top and the shortest is at the bottom. Unlike the vertical bar chart, there is no limit to the number of rows shown on the chart.
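The counting and sorting behind a horizontal bar chart can be sketched in a few lines of Python. This is only an illustration, not part of the lesson's exercise files: the function name `horizontal_bar_rows` and the defect log are made up for the example, and the chart is drawn as plain text rather than with a charting tool.

```python
from collections import Counter

def horizontal_bar_rows(observations, bar_char="#"):
    """Return text rows for a horizontal bar chart, longest bar first."""
    # Count how many times each category appears, largest count first.
    ordered = Counter(observations).most_common()
    if not ordered:
        return []
    # Pad the category labels so the bars start in the same column.
    label_width = max(len(label) for label, _ in ordered)
    return [
        f"{label:<{label_width}} | {bar_char * count} {count}"
        for label, count in ordered
    ]

# Hypothetical attribute data: one entry per defect observed.
defects = ["scratch", "dent", "scratch", "misprint", "scratch", "dent"]
rows = horizontal_bar_rows(defects)
for row in rows:
    print(row)
```

Each row is one category, and the length of the bar is the count of instances, so the horizontal direction never represents time.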
Pie Chart
The pie chart is a graphical illustration of the relative percentages of attributes or categories. The width of each slice of the pie shows the percentage associated with that attribute value. Pie charts are frequently used for comparisons, such as comparing pies for different locations or different products. I have also used them to compare before and after conditions in the product or process being improved.
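As a rough sketch of the arithmetic behind a pie chart, the Python below converts category counts into percentages and slice angles. The function name `pie_slices` and the before and after complaint counts are hypothetical and are used only for illustration.

```python
def pie_slices(counts):
    """Return (label, percent, angle in degrees) for each category."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("a pie chart needs at least one counted instance")
    slices = []
    for label, count in counts.items():
        share = count / total
        # The slice angle is the same share applied to a full circle.
        slices.append((label, 100 * share, 360 * share))
    return slices

# Hypothetical complaint counts before and after an improvement.
before = {"late": 30, "damaged": 15, "wrong item": 5}
after = {"late": 10, "damaged": 12, "wrong item": 3}

for name, counts in (("before", before), ("after", after)):
    print(name)
    for label, percent, angle in pie_slices(counts):
        print(f"  {label:<10} {percent:5.1f}%  {angle:5.1f} degrees")
```

Because each pie shows shares of its own total, comparing two pies compares the mix of categories, not the total number of instances.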
Box Plots
The box plot chart is normally used to present a family of box plots. Each box plot represents a data set, such as the data for one of several locations, products, or customers. The box plot requires variable data, and the data sets are normally separated based on attribute data. The data values in each plot are sorted from largest to smallest. Five points in the data set are used to create the box plot: the minimum value, the maximum value, the midpoint (median value), the value at the 25% point in the sorted data, and the value at the 75% point in the sorted data. A horizontal line is drawn that is the length of the spread of the data, with one end at the minimum and the other at the maximum. A box is placed over the line so that one side is at the 25% point and the other side is at the 75% point. Finally, the median value is shown with a line through the interior of the box. In some cases, a few points that lie beyond the endpoints of the box plot are shown as outliers and marked with an “x”. To identify the outliers, determine the magnitude of the spread from the 25% point to the 75% point and multiply that value by 1.5. Any points that are more than this value below the 25% point or above the 75% point are outliers. In that case, show those points with an “x” and shift the endpoint of the horizontal line to the first data point that is within the calculated range. While I have described box plots that are oriented horizontally, they can also be oriented vertically.
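The box plot calculation described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the lesson does not say how to interpolate the 25% and 75% points, so the sketch assumes the "inclusive" method of the standard library's `statistics.quantiles`; other quartile conventions give slightly different values. The function name `box_plot_summary` and the sample data are hypothetical.

```python
import statistics

def box_plot_summary(values):
    """Return the five box plot points plus line endpoints and outliers."""
    data = sorted(values)  # ascending order; the direction does not change the result
    # Assumption: "inclusive" quartiles for the 25% and 75% points.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    # 1.5 times the spread from the 25% point to the 75% point.
    reach = 1.5 * (q3 - q1)
    low_limit, high_limit = q1 - reach, q3 + reach
    inside = [v for v in data if low_limit <= v <= high_limit]
    outliers = [v for v in data if v < low_limit or v > high_limit]
    return {
        "minimum": data[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "maximum": data[-1],
        # The line ends at the first data points inside the limits.
        "line_low": inside[0],
        "line_high": inside[-1],
        "outliers": outliers,
    }

# Hypothetical cycle times for one location.
summary = box_plot_summary([12, 14, 15, 15, 16, 17, 18, 19, 21, 40])
for name, value in summary.items():
    print(f"{name}: {value}")
```

For this sample, the 25% point is 15 and the 75% point is 18.75, so the upper limit is 18.75 + 1.5 × 3.75 = 24.375. The value 40 lies beyond that limit, so it is marked as an outlier and the line ends at 21.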
Data Tables
Multi-variate data items can also be shown in a table. The table is structured so that each row is a data item. Each column represents one of the categories of data – either variable or attribute. Normally, the table will be sorted from highest to lowest value using data found in one of the columns. Large tables of numbers can be difficult to read, so the units for numeric data should be chosen so that most data values have two or three digits.
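A small Python sketch of such a table is shown below. It assumes each data item is a dictionary, sorts the rows from highest to lowest on one column, and rescales that column so that most values have two or three digits. The function name `build_table`, the column names, and the plant data are all hypothetical.

```python
def build_table(rows, sort_key, scale=1.0):
    """Sort rows highest to lowest on sort_key and rescale that column."""
    ordered = sorted(rows, key=lambda row: row[sort_key], reverse=True)
    # Dividing by scale changes the units, for example dollars to thousands.
    return [{**row, sort_key: row[sort_key] / scale} for row in ordered]

# Hypothetical data items: one attribute column and one variable column.
items = [
    {"site": "Plant A", "scrap_cost": 125000},
    {"site": "Plant B", "scrap_cost": 98000},
    {"site": "Plant C", "scrap_cost": 310000},
]

table = build_table(items, "scrap_cost", scale=1000)
print(f"{'Site':<8} {'Scrap cost ($K)':>15}")
for row in table:
    print(f"{row['site']:<8} {row['scrap_cost']:>15.0f}")
```

Reporting the cost in thousands of dollars keeps every value to two or three digits, which makes the column easier to scan.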
Hints & tips
- When creating a horizontal bar chart with a large number of rows (40 or 50) it is often helpful to use multiple colors for the bars. However, create a pattern of colors and maintain that pattern so that people won’t think that the color is also a data attribute.
- If the categories on a bar chart are “buckets” or a range on a sliding scale, the graph may look different depending on the width of the “bucket.” Try various widths to see which provides the most insight.
- Don’t use time-based data items with a horizontal bar chart. People try to make the horizontal axis a timeline instead of a count of instances.
- Typically, limit the pie chart to about six or seven slices; otherwise, some slices are so small that they are unreadable.
- Box plots provide a visual cue for whether data sets are similar. If you are not sure about similarity, you will need to do a statistical analysis.
- Typically, box plots are vertical when the discrete variable is on the X axis and the continuous variable is on the Y axis, and horizontal when the discrete variable is on the Y axis and the continuous variable is on the X axis.
- If your graph has data points that sit directly on top of each other so that the top one hides all below it, apply a jitter function to the output so as to create clusters of data points.
- Graphs are supposed to help us understand the data. Don’t let your graphs become so complex that they are confusing to read.
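Two of the hints above, the bucket width and the jitter function, can be illustrated with a short Python sketch. The function names `bucket_counts` and `jitter`, the sample cycle times, and the jitter amount are assumptions made for this example; charting tools usually provide their own equivalents.

```python
import random
from collections import Counter

def bucket_counts(values, width):
    """Count values per bucket; each key is the lower edge of a bucket."""
    counts = Counter((value // width) * width for value in values)
    return dict(sorted(counts.items()))

def jitter(points, amount, seed=0):
    """Shift each (x, y) point by a small random offset of up to amount."""
    rng = random.Random(seed)  # fixed seed so the plot is repeatable
    return [
        (x + rng.uniform(-amount, amount), y + rng.uniform(-amount, amount))
        for x, y in points
    ]

# Hypothetical cycle times grouped with two different bucket widths.
cycle_times = [3, 4, 7, 8, 9, 12, 13, 14, 18, 22]
narrow = bucket_counts(cycle_times, 5)
wide = bucket_counts(cycle_times, 10)
print("width 5: ", narrow)
print("width 10:", wide)

# Three identical points would hide each other; jitter separates them.
stacked = [(1, 2), (1, 2), (1, 2)]
spread = jitter(stacked, amount=0.1)
print(spread)
```

With a width of 5, the sample falls into five buckets; with a width of 10, it collapses into three, and the chart takes on a different shape. The jittered points stay within 0.1 of their true position, so they form a visible cluster without misrepresenting the data.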