About this lesson
DataFrames are like spreadsheets, and in this video, we start building and using them.
Exercise files
Download this lesson’s related exercise files.
Pandas DataFrames (57.2 KB)
Pandas DataFrames - Solution (55.7 KB)
Quick reference
Pandas DataFrames
DataFrames are the main workhorse of Pandas that we'll be using throughout the rest of the course.
When to use
You'll use DataFrames whenever you want to work with data in a spreadsheet-like table of labeled rows and columns.
Instructions
To create random data, import randn:
from numpy.random import randn
To create dummy random data:
my_data = randn(6,2)
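Here the call has the form randn(rows, columns): it returns a NumPy array with that many rows and columns, filled with random samples from the standard normal distribution.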
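As a quick check of the randn(rows, columns) form, the sketch below builds the same 6 by 2 array and inspects its shape. The values themselves differ on every run because they are random.

```python
from numpy.random import randn

# 6 rows, 2 columns of random samples
my_data = randn(6, 2)

print(my_data.shape)  # (6, 2)
print(my_data)        # values differ on every run
```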
To create a DataFrame, first define the row labels and the column labels, then pass them in together with the data:
import pandas as pd
my_data = randn(4,3)
my_rows = ["A", "B", "C", "D"]
my_cols = ["Mon", "Tues", "Wed"]
my_df = pd.DataFrame(my_data, my_rows, my_cols)
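The positional arguments above map to the data, index (row labels), and columns parameters of pd.DataFrame. A minimal sketch that spells them out as keyword arguments, which some readers may find easier to follow, and then confirms the labels landed where expected:

```python
import pandas as pd
from numpy.random import randn

my_data = randn(4, 3)
my_rows = ["A", "B", "C", "D"]
my_cols = ["Mon", "Tues", "Wed"]

# Same call as the positional form, with each argument named
my_df = pd.DataFrame(data=my_data, index=my_rows, columns=my_cols)

print(my_df.shape)          # (4, 3)
print(list(my_df.index))    # ['A', 'B', 'C', 'D']
print(list(my_df.columns))  # ['Mon', 'Tues', 'Wed']
```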
Hints & tips
- from numpy.random import randn
- Random Data: my_data = randn(6,2)
- Create DataFrame: my_df = pd.DataFrame(my_data, my_rows, my_cols)
- 00:05 So in this video, I want to start to talk about data frames.
- 00:08 And data frames are where this starts to get fun, right?
- 00:10 Up until now we had just been dealing with sort of strange numbers and
- 00:14 they've been strung together in not a very visually appealing way.
- 00:17 But now we're going to be working with data frames.
- 00:19 And data frames, like I mentioned earlier, are like spreadsheets, they look
- 00:23 sort of like Excel, they're much more visually pleasing to deal with, right?
- 00:27 So let's just jump in and start building one.
- 00:30 So before we get started, I want to add one more thing up here.
- 00:33 Let's go from numpy.random
- 00:38 import randn, rand n.
- 00:42 Shift+Enter to run this, and this will allow us to create some random numbers.
- 00:46 So we're going to create a data frame, we need a bunch of random numbers for
- 00:49 that, and so that's all that is.
- 00:51 So let's create some data.
- 00:54 So let's go, my_data, and let's set this equal to randn,
- 00:59 and then (4,3).
- 01:03 And if we run this, we call my_data here, well,
- 01:06 you can just see we've created an array.
- 01:09 And it has one, two, three, four, rows, and one, two, three, columns.
- 01:15 So 4 and 3, right?
- 01:18 So that's cool, we just created a dummy array.
- 01:20 Now your numbers will be different because it's randomly generating them, right?
- 01:24 So, okay, now we want, think of a spreadsheet,
- 01:27 it has columns and it has rows.
- 01:29 And usually those columns have headers and those rows have headers.
- 01:32 So let's create some of those.
- 01:34 Let's go, my_rows, and let's just create a Python list.
- 01:39 And let's set these equal
- 01:44 to A and B and C and D.
- 01:48 Remember, we need four of them because we designated four there.
- 01:51 And let's create my_cols, my columns.
- 01:55 And again, this is going to be a Python list.
- 01:57 So let's just call this one Monday, and we can go Tuesday, Wednesday.
- 02:04 And again, we're going to have three columns,
- 02:05 because we designated that right there, right?
- 02:08 So if we Shift+Enter, okay,
- 02:09 we've got these things, now we want to create an actual data frame.
- 02:13 And to do that, let's just name this my_df,
- 02:18 I'm using this my underscore thing, you can call these anything you want.
- 02:21 I just kind of like to call them, my whatever.
- 02:24 And set this equal to, now this is going to be a pd.DataFrame.
- 02:29 Remember, we're importing pandas as pd, so from now on,
- 02:32 we can do pandas stuff by referencing pd.
- 02:37 This is a function, and if we hit the Shift and the Tab key at the same
- 02:42 time on our keyboard, we can see what arguments get passed into here.
- 02:46 Now this is not a pandas thing, this is a Jupyter notebooks thing.
- 02:48 It'll always do this when you hit Shift+Tab on your keyboard.
- 02:52 You can see it wants us to pass the data, the index, which are the rows, and
- 02:57 the columns, and some other things we don't really care about.
- 02:59 So let's go ahead and do that.
- 03:01 So the data, we want this to be, my_data, just paste this in.
- 03:06 The rows, we want this to be my_rows, and the columns, we want this to be my_cols.
- 03:11 So if we Shift+Enter, nothing is actually displayed.
- 03:13 To see this thing, we have to call my_df, like this.
- 03:17 And when we do, boom, we get this nice spreadsheet looking thing.
- 03:21 And if we hover over things, right, it highlights.
- 03:26 Now, this is much,
- 03:26 much more sort of visually appealing than the NumPy arrays we looked at earlier.
- 03:31 And even the series we looked at in the last video,
- 03:34 we're starting to get somewhere here, right?
- 03:35 This is kind of cool.
- 03:36 Now, your display may look a little bit different.
- 03:38 I'm using the Firefox browser and this is how this looks in this browser.
- 03:42 Sometimes they have boxes around them,
- 03:45 just depending on your screen resolution or whatever, so very cool.
- 03:48 So, notice we passed in all of our stuff like this, right?
- 03:53 This just makes it easier, but we could actually pass in the data itself.
- 03:57 Let's actually change this from B to Z, just so
- 04:00 we can see the change when we run this.
- 04:03 So if we just copy this whole thing, instead of passing in the variable name,
- 04:06 we could just pass in the data itself.
- 04:08 Now, this is obviously not something you're usually going to want to do because
- 04:12 you'll often have a lot of data, right?
- 04:14 So it's hard to read if you try and pass in the data like this, but
- 04:17 we've just got a few, four things here, four rows, so we can do it like this.
- 04:22 And if we run this again,
- 04:23 you see now this has been changed to Z, because we made this change right there.
- 04:27 And very, very cool.
- 04:28 I'm just going to go ahead and change this back real quick.
- 04:32 And let's change this back to B, Shift+Enter to run this,
- 04:36 Shift+Enter to run this, Shift+Enter to run this, and we're back to B.
- 04:41 So those are data frames, right?
- 04:43 Not too complicated, pretty simple to make these.
- 04:46 Very cool to look at, you can do some things with them.
- 04:48 It's really fun.
- 04:49 So in the next video, we'll start to look at these columns.
- 04:52 We'll look at some things we can do with those, and that'll be in the next video.
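Two points from the transcript can be sketched in code: passing the labels inline instead of through variables (with "B" changed to "Z"), and the reminder that the number of row labels must match the number of rows in the data. The name mismatch_error is a hypothetical helper used only for this illustration.

```python
import pandas as pd
from numpy.random import randn

my_data = randn(4, 3)

# Labels passed inline rather than through my_rows / my_cols, with "B" changed to "Z"
my_df = pd.DataFrame(my_data, ["A", "Z", "C", "D"], ["Mon", "Tues", "Wed"])
print(my_df)

# Only three row labels for four rows of data: pandas rejects the mismatch
mismatch_error = None  # hypothetical name, kept so the error can be inspected
try:
    pd.DataFrame(my_data, ["A", "B", "C"], ["Mon", "Tues", "Wed"])
except ValueError as err:
    mismatch_error = err
    print("ValueError:", err)
```

For anything larger than a handful of labels, keeping them in named variables as in the lesson is easier to read.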