About this lesson
You may encounter missing data while doing analysis. What should you do? In this video, we'll discuss some helpful alternatives.
Quick reference
Dealing With Missing Data
Datasets often contain missing data. In this video, we'll learn how to drop missing data or replace it with another value.
When to use
Use dropna() to drop null data, or fillna() to change it to something else.
Instructions
Given a DataFrame named df, to drop every row that contains at least one Null Object (NaN):
df.dropna()
To drop every column that contains at least one Null Object:
df.dropna(axis=1)
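As a rough illustration of the two dropna() calls above, the sketch below builds a small made-up DataFrame (its column names and values are assumptions, not taken from the lesson's exercise files) and shows which rows and columns survive. Both calls return a new DataFrame and leave df itself unchanged.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: only row 0 and column C are free of NaN.
df = pd.DataFrame({
    "A": [1.0, 2.0, np.nan],
    "B": [5.0, np.nan, np.nan],
    "C": [1.0, 2.0, 3.0],
})

rows_kept = df.dropna()        # drops rows 1 and 2, which each contain a NaN
cols_kept = df.dropna(axis=1)  # drops columns A and B, which each contain a NaN

print(rows_kept)
print(cols_kept)
print(df.shape)  # the original DataFrame still has 3 rows and 3 columns
```

To keep the result, assign it to a name, for example df = df.dropna().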
To keep only the rows (or, with axis=1, the columns) that have at least a certain number of non-null values:
df.dropna(thresh=3) #keeps only rows with at least 3 non-null values
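Note that thresh counts non-null values, not nulls: a row (or column) is kept only if it has at least that many values present. A minimal sketch on made-up data (the DataFrame contents are an assumption for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data.
# Non-null values per row: row 0 has 3, row 1 has 2, row 2 has 1.
df = pd.DataFrame({
    "A": [1.0, 2.0, np.nan],
    "B": [5.0, np.nan, np.nan],
    "C": [1.0, 2.0, 3.0],
})

at_least_2 = df.dropna(thresh=2)               # keeps rows 0 and 1
at_least_3 = df.dropna(thresh=3)               # keeps row 0 only
cols_at_least_2 = df.dropna(axis=1, thresh=2)  # keeps columns A and C

print(at_least_2)
print(at_least_3)
print(cols_at_least_2)
```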
To replace Null Objects:
df.fillna(value="Bob") #replaces NaN with "Bob"
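A single value passed to fillna() is used for every NaN in the DataFrame, whatever the column. The sketch below shows this on made-up data (the column names are assumptions), along with one possible alternative: passing a dict so that only the named columns are filled.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing value in each column.
df = pd.DataFrame({
    "first": ["Ann", np.nan, "Cy"],
    "last": [np.nan, "Lee", "Diaz"],
})

# Every NaN, in every column, becomes "Bob".
filled = df.fillna(value="Bob")

# A dict fills only the listed columns; "last" keeps its NaN here.
partly_filled = df.fillna(value={"first": "Unknown"})

print(filled)
print(partly_filled)
```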
To replace the Null Objects in column A with that column's average:
df['A'] = df['A'].fillna(value=df['A'].mean())
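Because fillna() applies a single number to every NaN in the DataFrame, calling it on the whole df with column A's mean also fills the gaps in other columns with A's mean. The sketch below, on made-up data, contrasts that with two alternatives that may be closer to the intent: filling only column A, or filling each numeric column with its own mean.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: mean of A is 2.0, mean of B is 15.0 (NaN is skipped).
df = pd.DataFrame({
    "A": [1.0, np.nan, 3.0],
    "B": [10.0, 20.0, np.nan],
})

# A's mean is used everywhere, so the gap in B becomes 2.0.
everywhere = df.fillna(value=df["A"].mean())

# Only column A is filled; B keeps its NaN.
only_a = df.assign(A=df["A"].fillna(df["A"].mean()))

# Each column is filled with its own mean (the Series is matched by column name).
each_col = df.fillna(value=df.mean(numeric_only=True))

print(everywhere)
print(only_a)
print(each_col)
```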
Hints & tips
- To Drop Null Objects: df.dropna()
- To Replace Null Objects: df.fillna(value="Bob")