Combine Data from Multiple Worksheets Using Power Query

Source: article by Sumit Bansal

When combining data from different sheets using Power Query, the data must be in Excel Tables (or at least in named ranges). If it is not, the method shown here will not work.

Suppose you have four different sheets – East, West, North, and South.

Each of these worksheets has the data in an Excel Table, and the structure of the tables is consistent (i.e., the headers are the same).

This kind of data is extremely easy to combine using Power Query (which works really well with data in Excel Tables).

For this technique to work best, it’s better to name your Excel Tables (it works without names too, but it’s easier to use when the tables are named).

I have given the tables the following names: East_Data, West_Data, North_Data, and South_Data.

Here are the steps to combine multiple worksheets with Excel Tables using Power Query:

  • Go to the Data tab.
  • In the Get & Transform Data group, click on the ‘Get Data’ option.
  • Go to the ‘From Other Sources’ option.
  • Click the ‘Blank Query’ option. This will open the Power Query editor.
  • In the Query editor, type the following formula in the formula bar: =Excel.CurrentWorkbook(). Note that the Power Query formulas are case sensitive, so you need to use the exact formula as mentioned (else you will get an error). 
  • Hit the Enter key. This will show you all the table names in the entire workbook (it will also show any named ranges and/or connections that exist in the workbook).
  • [Optional Step] In this example, I want to combine all the tables. If you want to combine specific Excel Tables only, then you can click the drop-down icon in the name header and select the ones you want to combine. Similarly, if you have named ranges or connections, and you only want to combine tables, you can remove those named ranges as well.
  • In the Content header cell, click on the double-pointed arrow.
  • Select the columns that you want to combine. If you want to combine all columns, make sure (Select All Columns) is checked.
  • Uncheck the ‘Use original column name as prefix’ option.
  • Click OK.

The above steps would combine the data from all the worksheets into one single table.
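
For reference, the steps above correspond roughly to the query below, which you could paste into the Advanced Editor. This is a minimal sketch and not the exact code the editor generates: it assumes every table has the same headers as the first table listed, and it reads the column names from that first table instead of listing them by hand. The step names (Source, ColumnNames, Expanded) are illustrative.

```
let
    // List every table, named range, and connection in the current workbook.
    // The result has two columns: Content (nested tables) and Name.
    Source = Excel.CurrentWorkbook(),
    // Assumption: all tables share the headers of the first one listed.
    ColumnNames = Table.ColumnNames(Source{0}[Content]),
    // Expand the nested tables in Content, keeping the original column
    // names (no prefix). The Name column stays as the last column.
    Expanded = Table.ExpandTableColumn(Source, "Content", ColumnNames)
in
    Expanded
```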

If you look closely, you’ll find the last column (rightmost) has the name of the Excel tables (East_Data, West_Data, North_Data, and South_Data). This is an identifier that tells us which record came from which Excel Table. This is also the reason I said it’s better to have descriptive names for the Excel tables.

Here are a few modifications you can do to the combined data in Power Query itself:

  1. Drag and place the Name column to the beginning.
  2. Remove the “_Data” from the name column (so you’re left with East, West, North, and South in the name column). To do this, right-click on the Name header and click on Replace Values. In the Replace Values dialog box, replace _Data with a blank.
  3. Change the Date column to show only dates (and not the time). To do this, click the Date column header, go to the ‘Transform’ tab and change the Data type to Date.
  4. Rename the Query to ConsolidatedData.
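
The first three modifications can also be written directly in M. The sketch below is illustrative only: it assumes the tables have a column named Date (as in this example) and the same headers throughout, and the step names are made up. Renaming the query (step 4) is done in the Query Settings pane, not in the code.

```
let
    Source = Excel.CurrentWorkbook(),
    // Assumption: all tables share the headers of the first one listed.
    Expanded = Table.ExpandTableColumn(
        Source, "Content", Table.ColumnNames(Source{0}[Content])),
    // 1. Move the Name column to the beginning.
    Reordered = Table.ReorderColumns(
        Expanded,
        {"Name"} & List.RemoveItems(Table.ColumnNames(Expanded), {"Name"})),
    // 2. Replace "_Data" with an empty string in the Name column.
    Cleaned = Table.ReplaceValue(
        Reordered, "_Data", "", Replacer.ReplaceText, {"Name"}),
    // 3. Set the Date column to the date type (drops the time part).
    Typed = Table.TransformColumnTypes(Cleaned, {{"Date", type date}})
in
    Typed
```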

Now that you have the combined data from all the worksheets in Power Query, you can load it in Excel – as a new table in a new worksheet.

To do this, follow the steps below:

  • Click the ‘File’ tab.
  • Click on Close and Load To.
  • In the Import Data dialog box, select Table and New worksheet options.
  • Click OK.

The above steps would combine data from all the worksheets and give you that combined data in a new worksheet.

One Issue You Must Resolve when Using This Method

In case you have used the above method to combine all the tables in the workbook, you’re likely to face an issue.

See the number of rows of the combined data – 1304 (which is right).

Now, if I refresh the query, the number of rows changes to 2607. Refresh again and it changes to 3910. Each refresh adds another 1303 rows.

Here is the problem.

Every time you refresh the query, it adds all the records in the original data to the combined data.

Note: You’ll face this issue only if you have used Power Query to combine ALL THE EXCEL TABLES in the workbook. In case you selected specific tables to be combined, you’ll not face this issue.

Let’s understand the cause of this problem and how to correct this.

When you refresh a query, it goes back and follows all the steps that we took to combine the data.

In the step where we used the formula =Excel.CurrentWorkbook(), it gave us a list of all the tables. This worked fine the first time as there were only four tables.

But when you refresh, there are five tables in the workbook – including the new table that Power Query inserted where we have the combined data.

So every time you refresh the query, apart from the four Excel Tables that we want to combine, it also adds the existing query table to the resulting data.

This is called recursion.

Here is how to solve this issue.

Once you insert =Excel.CurrentWorkbook() in the Power Query formula bar and hit Enter, you get a list of Excel Tables. To make sure you only combine the tables from the worksheets, you need to filter for the tables that you want to combine and remove everything else.

Here are the steps to make sure you only have the required tables:

  • Click the drop-down and hover the cursor on Text Filters.
  • Click on the Contains option.
  • In the Filter Rows dialog box, enter _Data in the field next to the ‘contains’ option.
  • Click OK.

You may not see any change in the data, but doing this will prevent the resulting table from being added over and over again when the query is refreshed.
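
In M, this filter is a single Table.SelectRows step placed right after the Source step and before the Content column is expanded. The sketch below shows only the filtering part; the step name Filtered is illustrative.

```
let
    Source = Excel.CurrentWorkbook(),
    // Keep only the rows whose Name contains "_Data". The table that the
    // query loads back into the workbook does not match, so it is excluded.
    Filtered = Table.SelectRows(Source, each Text.Contains([Name], "_Data"))
in
    Filtered
```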

Note that in the above steps we used “_Data” as the filter because we named our tables that way. But what if your tables are not named consistently? What if the table names are random and have nothing in common?

Here is the way to solve this: use the ‘does not equal’ filter and enter the name of the query (ConsolidatedData in our example). Everything else stays the same, and only the result table created by the query is filtered out.
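
As a rough sketch, the ‘does not equal’ filter looks like this in M. It assumes the table loaded into the workbook carries the same name as the query, ConsolidatedData; if your output table has a different name, use that name instead.

```
let
    Source = Excel.CurrentWorkbook(),
    // Drop the query's own output table; keep every other entry.
    // Assumption: the output table is named "ConsolidatedData".
    Filtered = Table.SelectRows(Source, each [Name] <> "ConsolidatedData")
in
    Filtered
```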

Apart from the fact that Power Query makes the entire process of combining data from different sheets (or even the same sheet) quite easy, another benefit is that the result is dynamic. If you add more records to any of the tables and refresh the query, it will automatically give you the updated combined data.

Important Note: In the above example, the headers were the same. In case the headers are different, Power Query will combine and create all the columns in the new table. If the data is available for that column, it will be shown, else it will show null.
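
To see the null-filling behaviour in isolation, here is a small self-contained sketch you can paste into a blank query. The two inline tables and their columns (Sales, Units) are hypothetical, and Table.Combine is used for brevity instead of the expand step from the walkthrough; the way mismatched columns are handled is analogous.

```
let
    // Two hypothetical tables whose headers only partly match.
    East = #table({"Date", "Sales"}, {{#date(2020, 1, 1), 100}}),
    West = #table({"Date", "Units"}, {{#date(2020, 1, 2), 7}}),
    // The result has the columns Date, Sales, and Units. Units is null
    // for the East row and Sales is null for the West row.
    Combined = Table.Combine({East, West})
in
    Combined
```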

Published by Amar Singh