Locked lesson.
About this lesson
We will begin our journey by pulling some sales data into Power BI from a text file, as well as making some minor manipulations to the data.
Exercise files
Download this lesson’s related exercise files. You can download the source data files for the course from the resources section of your Lessons page.
GoSkills Power BI Source Data.zip443.6 KB Getting Data From a Text file - Completed.pbix
346.7 KB
Quick reference
Getting data from a Text file
Using Power BI to import data from a text (or CSV) file.
When to use
Use when you can’t connect directly to a database, but can download your source data to a text (or CSV) file.
Instructions
The general process is:
- Go to Power BI Desktop
- Get Data --> Text/CSV
- Locate the file and choose Edit when prompted
- Use Remove Rows --> Remove Top Rows to remove any unnecessary rows from the top
- Click “Use First Row as Headers” to promote the first row to headers
- Right click a column --> Remove to remove unnecessary columns
- Click the icons at the top of a column to change data types
- Right click a column --> Remove Errors to remove all rows with errors
- Click the Filter arrow at the top of a column to filter out values and blank rows
- Rename columns by double clicking them (or right click and choose rename)
- Rename your table on the right side
- Click Close & Apply to complete the query
Hints & tips
- Select multiple columns by holding down the CTRL or SHFT key while clicking them
- To set data types on multiple columns, select them, right click one and choose Change Type
- Always remove errors from a column before filtering!
- All steps are recorded on the right side so you can step through them
- Click the gear icon on any step to reconfigure it
- 00:04 So let's get started.
- 00:05 The first thing I did was download and install Power BI Desktop and
- 00:09 then I double-clicked on the icon to open it up to here, a blank project.
- 00:13 The first thing that I'm gonna need to do now is go and get some data.
- 00:17 So I'm gonna click the Get Data button, and I'm gonna go to Text or CSV.
- 00:21 And the reason I'm going here is because I've got a big BI system,
- 00:25 which doesn't allow me to connect to it directly.
- 00:27 So I've exported the data to a text file.
- 00:30 And I'm gonna find that text file under my C drive
- 00:32 in GoSkills > SalesTransactions and we'll say Open.
- 00:37 Now what will happen at this point is you'll be taken into a preview window.
- 00:41 I don't spend a lot of time here.
- 00:43 I'm gonna go click Edit, because I rarely want to accept the data as it's coming in.
- 00:47 So we'll say edit, and unfortunately it tries to open this window,
- 00:51 the new window, behind the existing one.
- 00:54 So I'm gonna press Alt+Tab, just flip windows here, and you'll notice that this
- 00:58 is what I was looking for, is a Window that shows me a preview of all of my data.
- 01:02 You'll notice of your left-hand side, we have a queries pane.
- 01:05 This will list all of the queries we're gonna build.
- 01:08 On the right-hand side,
- 01:09 we have a properties area where we can name the table we want.
- 01:11 And you know what, I don't like the name of this.
- 01:13 SalesTransactions, that's huge, let's just go with Sales.
- 01:16 So I'm gonna rename it.
- 01:17 So this is the name of the table that's gonna show up in my model.
- 01:20 I also have this applied steps area.
- 01:23 And this is really important, because this will list all of the steps for
- 01:25 everything that we do here.
- 01:27 The very first one is called Source, and if I click on that,
- 01:30 it will take me back into the raw, unfiltered table.
- 01:33 And you'll notice that beside it, there's a little gear icon.
- 01:35 So if I click the gear icon, It will take me back into the configuration step.
- 01:39 And it says, hey, this was a comma-separated values file and
- 01:41 here's the location.
- 01:43 So if you're opening any of the example files later on,
- 01:46 you could always come back, edit the query, click on the gears next to Source,
- 01:50 and change the file path to get it to work on your system.
- 01:53 Meanwhile, I'm gonna Cancel, and I'll select the last step,
- 01:56 because I don't wanna insert new steps between Source and Change Type.
- 02:00 Now, some key stuff that you wanna be aware of around this file.
- 02:04 This is a preview into the file.
- 02:06 The file is approximately 38,000 rows of data,
- 02:09 but you'll notice that we only see about a thousand at a time, and
- 02:12 the reason being is because this is a preview into the file.
- 02:15 What the Power BI team said when they're working with this particular technology is
- 02:19 it'd be terrible to bring the entire database in and
- 02:21 then make you operate on it locally.
- 02:23 So we'll just give you a preview because we believe that as you look down
- 02:26 all of your rows, eventually there's gonna be patterns emerging, and
- 02:30 that's what we're trying to find here.
- 02:32 So let's see what we can do with them.
- 02:33 The other thing you'll notice as we work through this particular tool,
- 02:37 which is called Power Query, if you're ever looking for it in another manner.
- 02:40 You'll notice that we have a rich array of commands on the Home tab, and
- 02:45 the Transform tab and the Add column tabs.
- 02:48 These are all about getting and transforming data to make it into a nice,
- 02:52 clean format for what you need to use.
- 02:54 So let's get started here.
- 02:55 The first nine rows don't look like they're gonna add any value to me at all.
- 02:59 So I'm gonna say Remove Rows > Remove Top Rows, and
- 03:02 I'm gonna choose remove the top nine rows, and I said OK.
- 03:07 This does nothing to the underlying file.
- 03:08 This is just a preview of what's going to happen when I stream the data in.
- 03:12 It will never compromise the original source, it's just what it's doing for
- 03:15 manipulation for me to use it here.
- 03:17 The next thing that I notice is the first row looks like it may be headers, so
- 03:21 I'm going to go and click the Use First Row as Headers button.
- 03:24 But you know what, I don't really like them.
- 03:26 So let's double-click on this guy here and let's rename it.
- 03:29 I'm gonna call this one Location Name.
- 03:32 And maybe the next one I'll call Street Address, and City.
- 03:43 We can have Province.
- 03:48 We will go with latitude, we'll just call it Lat.
- 03:51 And Long can be under for longitude, the POSChitDate.
- 03:56 All this POS stuff is for point of sale, we don't really need to see that.
- 04:00 That's why I figured I'd rename these things.
- 04:01 So we'll just scroll across here and see what else we have.
- 04:03 We've got an Hour, Category, We have Item Name.
- 04:15 And these last ones, Units Sold, Unit Price and TotalSale.
- 04:19 Well, total sale looks a little funny, so I could go and
- 04:22 just put a space in there as well, it's gonna look a little bit lot better.
- 04:26 Next piece, I look at, I'm not a really big fan of the data types through here,
- 04:29 notice, this is an A, B, C.
- 04:31 So I'm gonna change it, I'm gonna click on the A, B,
- 04:33 C and we'll change this to a whole number.
- 04:36 Unit Price and Total Sale, if I select one, hold down my Ctrl key and
- 04:40 select the next, I can change both of these at the same time by a right-click,
- 04:45 we can go to Change Type, and I can choose which one I want to use.
- 04:48 I'm going to use a Fixed Decimal Number here.
- 04:51 The Fixed Decimal Number will put a dollar sign at the top.
- 04:53 It's not really currency,
- 04:54 what it means is that a fixed decimal has four decimal places.
- 04:58 A regular decimal number can hold a lot more, okay, so
- 05:00 we're not super worried about that.
- 05:02 I'll come back over, lat and long, let us change these out to a decimal number,
- 05:07 because these have more than four decimals.
- 05:10 And then we get to this Date, dates are special.
- 05:13 We're gonna click the ABC.
- 05:15 But we're gonna choose,
- 05:15 instead of just using date we're gonna choose to convert Using Locale.
- 05:19 And the reason we're using locale is we're telling the data set
- 05:23 where is this data coming from.
- 05:25 I am going to go and click on this, press F to get down to the Englishes, and
- 05:29 say that this data came from an English US data set.
- 05:34 When I do this, it's going to interpret these dates as English US dates to
- 05:38 begin with to say, these dates are coming from an English US system,
- 05:41 convert them into the format that I use on my side.
- 05:45 So you'll see for me, it's actually put it as month, day,
- 05:47 year because I actually do follow the US standard with my data.
- 05:51 If you're working in a different format though, this would come back with
- 05:54 potentially if you go with year, month, day, it might come back with 2015, 01, 02.
- 05:58 If you go with day, month, year, it will be 2/1/2015.
- 06:02 So if your results look different to mine, that's okay.
- 06:05 But the big key here is that I can now email this or
- 06:07 share this with you because it knows that the original data source is coming from
- 06:11 a US English system, which is important.
- 06:14 Doing this though has triggered me some problems.
- 06:16 If I go and scroll down my list here a little bit, you'll notice that when we get
- 06:20 down to around row 56 here, we've got some problems, I've got this error.
- 06:25 And you'll notice if I mouse over error, it kinda highlights it and
- 06:27 underlines it in gold.
- 06:28 If I clicked on that, it would drill in.
- 06:30 If I click in the white space beside,
- 06:32 it tells me I had an error because I can't covert POSChitDate into a date.
- 06:37 No problem I'm going to go and right-click on this column and
- 06:41 I'm gonna say, Remove Errors.
- 06:45 That will remove all the rows with errors and
- 06:47 when I go back down here again, you'll notice that they're gone.
- 06:50 I can now filter these guys out to remove the null rows.
- 06:54 And I've now removed every single row that was a compromise problem in this data set.
- 06:59 I've now got a very nice data set that runs contiguously.
- 07:02 I can go and hit Close and apply.
- 07:05 And this will now load 38,000 and some odd records into my model, which I can see by
- 07:11 clicking on the data view, 38,861 rows in a nice little data table.
Lesson notes are only available for subscribers.