Locked lesson.
About this lesson
This lesson exposes a very important pattern for wrangling data. Once you can get it into a single column, hit it with this recipe and it will land nicely into a table.
Exercise files
Download this lesson’s related exercise files. You can download source data files for the course from the resources section of your Lessons page.
Pivoting Stacked Data.xlsx19.5 KB Pivoting Stacked Data - Completed.xlsx
25.9 KB
Quick reference
Pivoting Stacked Data
An overview of pivoting data that is stacked in one column.
When to use
When you have a column that has data stacked in a consistent repeating pattern of x rows per transaction and need to transpose each block and stack them into a tabular format.
Instructions
Getting started
- Create a new query to bring your data to Power Query
- If a header row exists in the data, promote it to be the header row
Creating the transaction ID column
- Add Column --> Index Column --> From 0 (zero)
- Highlight Index column --> Add Column --> Standard --> Integer-Divide
- Enter the “magic” number (the number in the index column that corresponds with the first line of the second instance of the row pattern) --> OK
Creating the transaction line ID column
- Highlight Index column --> Transform --> Standard --> Modulo
Pivoting the data
- Select the Index column --> Transform --> Pivot Column
- Select the column to preserve in the body area of the pivoted date (the values you want to see)
- Select Advanced Options --> Don’t Aggregate --> OK
- The data should not be in an unpivoted format
Finishing the query
- Remove the Integer-Divide (Transaction ID) column
- Apply any other transformations as required
- Set data type for each column
- Rename the query
- Close & Load the query
Hints & tips
- Remember, the “magic” number is the first line of the second transaction
- The Integer-Divide function calculates the number of times the “magic” number can be extracted from the value in the Index column
- 00:04 In this video, we're gonna go through a very important Power Query pattern for
- 00:09 pivoting stacked data.
- 00:10 And the reason this is important is because it can actually be used elsewhere
- 00:13 once you see this pattern recur.
- 00:16 We're gonna start by going into create a new query to grab from file, from text or
- 00:21 CSV.
- 00:23 And we're gonna drill down into our GoSkills example files
- 00:26 into the Pivoting Stack Data folder, and we're gonna grab Visa Transactions.
- 00:32 Now like most of the data in this course, these are based on near world samples.
- 00:36 This is based on a real world example that was once sent to me.
- 00:39 And it's Visa transactions where they've been extracted and
- 00:42 it's all in a single column.
- 00:44 So let's click Edit because obviously that's not gonna work so well for us.
- 00:47 Once we get into power query ,the
- 00:51 first thing we need to do is pop up this header row.
- 00:54 So we could do that here under use first rows header,
- 00:57 we can also find a bunch of commands under this origin on the top left corner.
- 01:01 So that's the one I'm gonna use this time just to be different.
- 01:03 Does the same thing different place.
- 01:06 Now at this point what I want you to recognize is that we have a repeating
- 01:10 pattern that goes from the beginning all the way to the end of this file.
- 01:14 It goes date, vendor, amount, blank, date, vendor,
- 01:18 amount, blank, and there are no deviations in this file whatsoever.
- 01:24 It's a four line block for each transaction.
- 01:29 We're going to unpivot this so that we can have it working for
- 01:33 a Pivot Table as a data source.
- 01:34 And the secret to this is actually a recipe that you can use over and
- 01:38 over again.
- 01:40 And it works like this, we start by adding a column.
- 01:45 And that column is a very specific column, it's an index column.
- 01:49 You'll notice from the drop down, that there are several options,
- 01:52 in order to make this pattern work mathematically you must choose zero.
- 01:57 That will give you what is a line item for each individual line in the entire file.
- 02:03 It increases by one, that's an index column.
- 02:06 What we're now gonna do is grab the index column,
- 02:10 we're gonna go to Add Column> Standard and we're gonna choose a divide
- 02:14 integer and at this It's gonna ask us for a number.
- 02:19 The magic number is from the index column,
- 02:23 it is the first line of the second transaction.
- 02:27 So in this case, when we look down here we can see that it's this one.
- 02:31 It's not row five, it's an index number of four,
- 02:34 that's the first time we see the second date.
- 02:38 So well say, four, this is the magic number for this pattern and
- 02:41 we're gonna use it again.
- 02:43 So don't forget it now, always the first line of the second transaction.
- 02:50 So we'll say OK.
- 02:51 And what you'll see is we get a new column and
- 02:53 you'll see the first four rows are zero, the second four rows are one.
- 02:57 So this is actually your transaction ID number.
- 03:01 We're then gonna take the index column.
- 03:04 We can go to Transform not Add Columns.
- 03:08 This time, this is gonna be on the Transform tab.
- 03:09 You'll notice the same commands are in most of these places.
- 03:12 One of them transforms the existing column, the other one
- 03:15 keeps the existing column but adds a new column with the transformation.
- 03:19 We want in this case to transform the existing one.
- 03:22 And this time,
- 03:23 we're gonna use another command that you've probably never used called Modulo.
- 03:27 Modulo is kind of the opposite of the integer divide.
- 03:31 The integer divide looks through and says here, you told me the magic number is 4,
- 03:35 I'm gonna try and take number 4 into this as many times,
- 03:39 give me how many times will go in as a whole number.
- 03:42 Modulo we use the same value that we used for the previous line so
- 03:46 the magic number is 4.
- 03:47 And what Modulo does is it says, give me the remainder instead.
- 03:53 So, integer division gives us a transaction ID number.
- 03:57 The converted index column using a modulo gives us a transaction line ID number.
- 04:04 Okay, well that's kind of interesting, but what the heck do I do with that?
- 04:08 Well, here's the secret, we grab our new modulo column here on index.
- 04:13 And we're going to pivot it.
- 04:15 So we'll choose Pivot, and it says you wanna use this as your column name,
- 04:20 that's great but we need to go to Advanced options, change this to Don't Aggregate.
- 04:27 And now, be ready for a magic moment because we're gonna click OK and
- 04:32 everything is virtually unpivoted for us right away.
- 04:36 Now we no longer needs the integer division column, but it is required for
- 04:40 the pattern upfront.
- 04:41 Now that it's done we can right click and remove it.
- 04:45 At this point, it's a good time to scan the rest of the columns and
- 04:48 see which blank ones might be there that we can get rid of.
- 04:51 Like column three right here.
- 04:53 The reason I didn't filter this out up front is because if the file is long
- 04:56 enough, I wouldn't wanna accidentally filter out a row and
- 05:00 shift the entire pattern if maybe one of the vendor names is blank.
- 05:04 And now we're basically left with our data.
- 05:07 We know that this guy here is an American date format so
- 05:10 we're gonna change type on this guy here using locale, and set it to a date,
- 05:17 Again for English US, and we'll say OK.
- 05:23 We can then, of course, name the column dates.
- 05:26 We can name our next column vendor and we can name our final column amount
- 05:33 which right now appears to be formatted as text, so let's change it to a currency.
- 05:40 And at this point, I'm just gonna go and drop the visa of the front of this table
- 05:44 name and we'll call as transactions and we are good to go with a home close and load.
- 05:50 And we can land our data into the worksheet, and everything is good to go.
- 05:55 Whenever we get a new file, we just right click and refresh.
- 05:58 The key thing here I want you to remember is that this is a pattern and
- 06:02 there is a cheat sheet that goes with this module so
- 06:05 that you can actually use this again with all of those individual steps.
- 06:09 The key parts around this are adding the index column from zero,
- 06:12 building the integer divide using the magic number which is the first
- 06:16 row of the second transaction from the index column.
- 06:19 Transforming that index column to the modulo using the magic number again.
- 06:24 And then pivoting the modulocolumn while not aggregating
- 06:28 the original column of values.
- 06:31 At that point everything is done, it´s just cleanup time.
- 06:34 It's a beautiful pattern that you can use over and
- 06:36 over again whenever you see this kind of data arising inside your solutions.
Lesson notes are only available for subscribers.