Locked lesson.
About this lesson
Exercise files
Download this lesson’s related exercise files.
05-01-Import Data from the Web-Start.xlsx16.1 KB 05-01-Import Data from the Web-Complete.xlsx
22.5 KB 5.01 import-data-from-the-web - Exercise.docx
65.3 KB Exercise - Import Data from the Web.xlsx
15.6 KB 5.01 import-data-from-the-web - Exercise solution.docx
152.8 KB Exercise Solution - Import Data from the Web.xlsx
21.6 KB
Quick reference
Import Data from the Web
Learn how to import data from the web the easy way.
When to use
We use this method whenever we want to import data from the web into Excel without copying/pasting and then reformatting.
Instructions
Often we need to import data from a website into Excel. A common mistake is to simply copy the data from the website and then paste it into Excel. The issue with this method is that often it brings with it rogue formatting, random symbols, and other things that don't belong.
To minimize the amount of data cleaning and reformatting we need to do, we can use the Get and Transform utility in Excel which will import data from a website and format it as a table.
Import Data from the Web
- Go to the website that contains the data to import.
- Click in the URL bar and press CTRL+C to copy the URL.
- Go back to Excel.
- From the Data tab, in the Get and Transform Data group, click From Web.
- Press CTRL+V to paste the URL.
- Click OK.
Excel will scan the webpage for any tables and list them on the left-hand side of the Navigator pane. We need to click through each table until we find the one that contains the data we want to import.
- Select the table.
At this stage, we can choose to import the data directly into Excel or we can choose to transform (clean) the data in Power Query.
- Click Transform Data to open Power Query.
In Power Query, we can tidy up the data and remove anything that shouldn't be there before loading it into Excel.
Cleaning Data in Power Query
Our dataset has a couple of issues that we need to fix. For example, the column headings are repeated in the first 2 rows.
- From the Home tab, in the Reduce Rows group, click Remove Rows.
- Click Remove Top Rows.
- In the Remove Top Rows dialog box, enter 2 in the Number of rows field.
- Click OK.
Next, we need to check that each column is using the correct data type.
- Click the icon to the left of the column name.
- Change the data type accordingly.
Load the Data into Excel
Once we have cleaned our data in Power Query, we can load it into Excel.
- From the Home tab, in the Close group, click Close & Load.
- Click Close and Load To.
We can choose to load our data into an Excel Table, a PivotTable Report, a PivotChart, or we can Create Connection.
- Choose Table from the List.
- Click OK.
The data will now load into Excel as a table ready for analysis.
Hints & tips
- When we use this method, Excel creates a connection between the website data and the spreadsheet. If anything changes on the website, we just need to refresh the table to update it.
- 00:04 In this lesson, we're going to take a look at how we can import data from the web,
- 00:08 because this is a task that you might find yourself doing on a fairly frequent basis.
- 00:13 We see something online that we want to add into our spreadsheet, and
- 00:17 most of the time, people don't know how to do this the correct way.
- 00:21 So, I'm going to start out in this lesson by showing you the method that I see most
- 00:26 people using, and then I'm going to show you a better way of doing this.
- 00:30 So, let's dive straight in.
- 00:32 So, I've just jumped onto Wikipedia and
- 00:35 I've got myself a list of the largest companies in the world by revenue and
- 00:40 you can see this list under me in a big long table.
- 00:44 Now maybe I want to import this data into an Excel spreadsheet and
- 00:48 then perform some kind of analysis.
- 00:51 Now the method that I see a lot of people using is simply copying and pasting.
- 00:56 So what I see people doing is kind of clicking here and
- 01:00 dragging all the way down to get all of that data.
- 01:03 I'm going to go all the way down to the bottom, Ctrl C to copy, and
- 01:08 then in general, we'll jump back to Excel, Ctrl V to paste.
- 01:12 Now that's going to pull the data across in a not great format.
- 01:16 So you can see here it's brought across some formatting from the original webpage.
- 01:20 We've got some background fills on cells, we've got some merged cells here where
- 01:25 we have USD millions and we've got lots of links in here as well.
- 01:29 So in order to be able to analyze this data,
- 01:32 I'm going to need to do quite a bit of tidying up.
- 01:34 Now of course, when you paste something from the web into an Excel spreadsheet,
- 01:39 you do get some additional paste options.
- 01:41 And if you take a look right in the bottom corner, I have this little tag pop up and
- 01:45 if I click it, I get a couple of choices here.
- 01:47 Now the only choices that I've got are to keep the source formatting,
- 01:51 which is basically what we have here, or to match the destination formatting.
- 01:55 So even if I choose Match Destination Formatting, it puts it into a nicer
- 02:00 format, but I'm still going to need to do some tidying up here.
- 02:04 I'm going to need to resize all of the columns.
- 02:06 I'm going to need to check that the format has been applied correctly to each column.
- 02:11 I need to sort out this USD millions,
- 02:13 because that's just sitting up there looking like a merged cell.
- 02:16 And then finally, I'll probably put it into a table.
- 02:19 Now, what would be much quicker is to get to the table stage immediately.
- 02:24 So let me show you another way of doing this that's a lot quicker and
- 02:27 a lot tidier.
- 02:27 So let's Ctrl Z, just to remove what we did and
- 02:30 we're going to jump back to our webpage.
- 02:32 Now, instead of copying and pasting the data,
- 02:36 all you need to do is basically copy the URL.
- 02:39 So if we go up to the URL bar and click, let's Ctrl C to copy,
- 02:43 we're going to jump back to Excel, go to the Data tab, and in the Get and
- 02:48 Transform Data group we're going to say from web.
- 02:52 Now all we need to do here is basically paste in the URL, let's click on OK.
- 02:58 Excel is going to go away and take a look at that web page and
- 03:01 it's basically going to show us all of the tables contained on that web page.
- 03:05 So if we click through some of these,
- 03:07 we should be able to find that table that we want.
- 03:10 So this is it.
- 03:12 It's the one that contains all of the company names and the revenue.
- 03:15 So once we found this, we can choose to transform the data in Power Query or
- 03:20 we can simply load it directly into Excel.
- 03:23 Now my recommendation is to click on Transform Data because this is going to
- 03:28 open up the dataset in Power Query and we can check our columns and
- 03:32 do any tidying up in here before we load it into Excel.
- 03:35 So let's take a look at what we have here.
- 03:37 I can see I've got the column headings, those look good, but
- 03:41 notice that they are repeated in the next two rows.
- 03:44 So I'm going to click on the first row, and then let's jump up to the Home tab,
- 03:50 the Remove Rows drop down and I'm going to say Remove Top Row.
- 03:54 And then I can choose the number of rows that I want to remove.
- 03:57 Now I have those column headings repeated twice in the actual table of data, so
- 04:01 I'm going to remove the top two rows, let's click on OK.
- 04:05 And that looks a lot better.
- 04:07 Now I can go through and just check to make sure the data types are correct.
- 04:10 So if I click here, I can see that this is a text data type, well,
- 04:15 it's actually a whole number, so let's change that.
- 04:19 This is text that's fine, this is text.
- 04:21 Revenue, well, this is, let's say currency, let's change that.
- 04:26 This one here is also currency.
- 04:29 Number of employees, well that needs to be a whole number.
- 04:33 We then have text, that's fine, and text again.
- 04:36 So, I've now very quickly been able to just make those small changes and
- 04:40 now I can load it back into Excel.
- 04:42 So let's click on Close and Load, I can choose to close and
- 04:46 load to a table, a PivotTable or a PivotChart.
- 04:50 So if I wanted to create a PivotTable, based off of this data,
- 04:54 I can do it directly from here.
- 04:56 Now we're just going to load it into a table, so let's select that option and
- 05:01 click on OK.
- 05:02 It's going to grab that data and check it out.
- 05:05 We now have a perfectly clean data set that's already been put into a table.
- 05:11 And the cool thing about this is that it's created a connection.
- 05:14 So if anything gets updated in the source data,
- 05:17 we can simply refresh our table and it's going to pull that new data through.
- 05:22 So this is a much better way of importing data from the web,
- 05:26 as opposed to copying and pasting it in.
Lesson notes are only available for subscribers.