About this lesson
Let's discuss multi-indexes: what they are, how they work, and how to select data from them.
Quick reference
Selecting Specific Cells From Multi-Index
Multi-Index DataFrames are DataFrames whose row index (row headers) has more than one level.
When to use
Use them to group your row headers under broader categories.
Instructions
To create a multi-index:
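import pandas as pd
from numpy.random import randn
outer = ['A1', 'A1', 'A1', 'A2', 'A2', 'A2'] # Create outer index
inner = [1, 2, 3, 1, 2, 3] # Create inner index
multi = list(zip(outer, inner)) # Pair each outer label with its inner label
multi = pd.MultiIndex.from_tuples(multi) # Build the multi-index from the pairs
new_df = pd.DataFrame(randn(6, 2), index=multi, columns=['Mon', 'Tues']) # 6 rows, 2 columns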
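The steps above can be put together as one runnable sketch. It assumes a seeded NumPy generator in place of randn so the numbers repeat between runs, and it assumes the column labels 'Mon' and 'Tues' to match the selection examples. Neither choice comes from the lesson itself.

```python
import numpy as np
import pandas as pd

outer = ['A1', 'A1', 'A1', 'A2', 'A2', 'A2']  # outer index labels
inner = [1, 2, 3, 1, 2, 3]                    # inner index labels

# Pair each outer label with its inner label, then build the MultiIndex.
multi = pd.MultiIndex.from_tuples(list(zip(outer, inner)))

# Seeded generator (an assumption, not in the lesson) so reruns give the same values.
rng = np.random.default_rng(0)
new_df = pd.DataFrame(rng.standard_normal((6, 2)), index=multi, columns=['Mon', 'Tues'])

print(new_df)
print(new_df.index.nlevels)  # number of index levels in the row index
```

Printing new_df shows the outer labels A1 and A2 each spanning inner rows 1, 2, and 3.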
To pull data from a multi-index dataframe:
new_df.loc['A1'] #pull data from one outer index
new_df.loc['A1'].loc[2] # pull data from a specific inner row of an outer index
new_df.loc['A1'].loc[2].loc['Tues'] # pull one specific point of data
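As a minimal sketch of the selections above, the example below uses fixed values 0 through 11 instead of random numbers (an assumption made only so the results are easy to check). It also shows a single .loc call with an (outer, inner) tuple as the row key. The lesson does not cover that form, but it is standard pandas and reaches the same cell as the chained version.

```python
import numpy as np
import pandas as pd

multi = pd.MultiIndex.from_tuples(
    [('A1', 1), ('A1', 2), ('A1', 3), ('A2', 1), ('A2', 2), ('A2', 3)]
)
# Fixed values instead of random numbers (an assumption) so each cell is predictable.
new_df = pd.DataFrame(np.arange(12.0).reshape(6, 2), index=multi, columns=['Mon', 'Tues'])

block = new_df.loc['A1']                       # all inner rows under outer label 'A1'
row = new_df.loc['A1'].loc[2]                  # inner row 2 of 'A1', as a Series
chained = new_df.loc['A1'].loc[2].loc['Tues']  # three chained lookups, as in the lesson
direct = new_df.loc[('A1', 2), 'Tues']         # one lookup: (outer, inner) row key, then column

print(block)
print(row)
print(chained, direct)  # both refer to row ('A1', 2), column 'Tues'
```

The single-call form avoids building the intermediate objects, which also makes it the safer choice if you later assign to a cell rather than only read it.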
Hints & tips
- Multi-Index Data Frames let us further categorize our rows
- Don't get hung up on them :-)
- 00:05 Okay, in the last video we looked at changing our indexes. In this video,
- 00:09 I want to look at multi indexes and how to select data from them.
- 00:13 So this is a new concept having multiple indexes, indices I guess you would call
- 00:18 them, there we have A, B, C, D as our index, we can also have outer indexes.
- 00:23 So we may categorize A and B together in one index, and C and D in another one.
- 00:28 And then how do we pull data out of a data frame with a multi-index?
- 00:34 So first off, let's build out a multi-index, and
- 00:36 it's going to take a little bit of work here.
- 00:38 So let's create an outer and
- 00:43 a list, let's go A1, A1, and
- 00:48 A1, A2, A2, and A2.
- 00:52 And then we also want an inner, and let's create another list,
- 00:58 and this could just be 1, 2, 3, and 1, 2, 3.
- 01:03 Now this is going to become apparent what I'm doing with all these things in just
- 01:06 a second, just kind of stick with me here for a minute.
- 01:09 So now we want to create a multi-index, so I'm just going to call this multi,
- 01:14 call it anything you want, really, and we want to list and then zip.
- 01:19 And then we want to list our outer and our inner.
- 01:25 So this list function and the zip function,
- 01:27 we really don't care what those are, it's more of an advanced topic.
- 01:30 And I just don't want to get into it at this point,
- 01:34 it's just going to be confusing.
- 01:36 Let's just say that this is how we create a multi-index and
- 01:39 kind of leave it at that for now.
- 01:41 So now we can create the actual multi-index by calling multi,
- 01:47 and then setting that equal to a pd.MultiIndex and
- 01:52 this is from_tuples, and then we just want to pass in multi.
- 02:00 So if we now run multi, we can see what this is,
- 02:03 we've got just this fancy multi-index.
- 02:06 And we've got this matrix here with A1, A1, A1, 1, 2, 3, A2, A2, A2, 1, 2, 3.
- 02:14 So okay, that's kind of neat,
- 02:15 but we're not quite done yet, now we need to create a data frame out of this.
- 02:21 So let's create a new, let's just call this new_df, and
- 02:25 let's set this equal to pd.DataFrame as we've done so many times already.
- 02:32 And let's put in some random numbers here. So
- 02:36 let's go randn, we want say 6 rows and 2 columns.
- 02:41 And for the index, we want our multi-index.
- 02:45 And for the column headers, let's just stick with,
- 02:48 let's see, we've been doing this Monday, Tuesday thing,
- 02:52 stick with that sort of convention, Monday, Tuesday, okay?
- 02:56 So now, if we run our new_df,
- 02:59 we see here the fruits of our labour, this multi-index data frame.
- 03:05 So it's just like the data frames we've been working with, with 1, 2, 3, 1, 2,
- 03:09 3, it's just now they're grouped in another index, it's a multi-index.
- 03:13 And if you hover over these you can see A1 covers 1, 2, 3, and A2 covers 1, 2, 3.
- 03:19 So again, don't get hung up on how we created this, what this stuff is,
- 03:25 I just want you to sort of be aware that these are multi-indexes,
- 03:30 multi-indices, and that's cool.
- 03:33 So now, the question becomes, how do we select data out of these?
- 03:37 We've been selecting data out of our other data frames fairly easily.
- 03:39 But now we've got multiple indexes, multiple indices, I guess you would say,
- 03:44 how do we grab data out of there?
- 03:46 So we can use our location function like we've been doing.
- 03:49 So let's go new_df.loc, and let's say we want to
- 03:53 grab everything in A1, we could just go like that.
- 03:59 So we see 1, 2, 3 for Monday and Tuesday, 1, 2,
- 04:04 3, and -0.21, -0.21, and here we have -0.47, -0.47.
- 04:11 So spot checking seems to suggest that we grabbed this stuff.
- 04:17 Now we can dive even deeper, say we want everything from row 2 out of here.
- 04:22 Well, we can slap on another location and just call row 2.
- 04:27 And so now we get Monday and Tuesday for row two, so
- 04:29 that should just be these two things.
- 04:32 And yep, here we go those two things, and we can dive in even deeper.
- 04:37 Python is object oriented, we could just keep slapping objects on here.
- 04:41 So location and then what we want,
- 04:44 let's say we want Tuesday, we run this, that should be this guy right here.
- 04:50 And sure enough, it's this guy right here, so
- 04:54 -0.4262, and that's Tuesday, row 2 in A1.
- 05:00 So if we come up here, we could say Tuesday,
- 05:04 row 2, and A1, there it is, -0.4262.
- 05:09 So those are multi-indexes, multi-indices, I guess, I just want to say multi-indexes,
- 05:14 I know that's wrong, but you know what I'm talking about.
- 05:18 And that's how to grab specific cells or anything out of them, and
- 05:22 that's how to select things from multi-indexes.
- 05:25 Again, don't get caught up on this,
- 05:27 I just wanted to sort of introduce a slightly more advanced thing just to
- 05:32 hopefully make using regular data frames a little bit easier to understand.
- 05:37 If you kind of try and wrap your brain around something a little
- 05:40 more complicated, oftentimes, the simpler thing then becomes easier to understand.
- 05:43 So hopefully that will help with that, and just introducing a new concept for us.
- 05:48 So that's all for this video, in the next video, we'll look at missing data,
- 05:52 what to do when some of our data is missing, or we have null objects, and
- 05:55 that'll be in the next video.
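The video deliberately skips over what list(zip(outer, inner)) does when building the multi-index. As a small optional sketch using only built-in Python, the pairing step looks like this:

```python
outer = ['A1', 'A1', 'A1', 'A2', 'A2', 'A2']
inner = [1, 2, 3, 1, 2, 3]

# zip walks both lists in step and yields one (outer, inner) pair per position;
# list() collects those pairs so they can be printed or passed on.
pairs = list(zip(outer, inner))

print(pairs)
# [('A1', 1), ('A1', 2), ('A1', 3), ('A2', 1), ('A2', 2), ('A2', 3)]
```

Each tuple names one row of the multi-index: the first element is the outer label and the second is the inner label, which is the input that pd.MultiIndex.from_tuples expects.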