Import multiple excel sheets and files to Alteryx with different schemas

19 Apr 2017

How to import multiple Excel sheets and files with different schemas in Alteryx

Bingqian Gao

*Blog image by Jakub Szepietowski

Suppose you want to import multiple sheets from an Excel file that have the same structure. A common approach would be to use a Dynamic Input tool in Alteryx. In another scenario, imagine you want to import multiple Excel files with the same structure. A go-to solution is to use a Union tool or wildcard. However, these solutions only work well on one condition – the schema of all the sheets or files have to be identical. The sheet or file will be skipped if it has a different schema from the first sheet or file read by the Input tool. This prerequisite can be a bit tricky to meet as sometimes even though the structure of the sheets/files may look the same, they may have a different schema due to various reasons (e.g. empty cells that contain space).

A quick, simple workaround using a Batch Macro

So how can we ensure to batch import every Excel sheet and file in Alteryx regardless of their different schemas?

This can be done with a very simple Batch Macro.

For those of you who don't have the time to create a Batch Macro from scratch, you can simply download the packaged workflow I built (link at the end of this blog). Once you download it, follow the three steps below:

  1. Re-point the Input tool in the "Combine data with Batch import" workflow to the Excel file you would like to import
  2. Re-insert the Batch Macro (as it will show as missing when the directory changes) from the downloaded package and choose the right input fields for File name and Sheet name from the dropdown
  3. Run the workflow and check the result 

For those who are keen to learn, read on. In the following step-by-step guide, I'm am using 'multiple sheets' scenario as an example, but importing multiple files would use the same principle.

Guide to successfully import multiple Excel sheets and files at once

Step 1. Create a Batch Macro to batch import multiple sheets

1.1 ADD TOOLS

To create a Batch Macro, start by creating a new workflow with an Input Data tool. Point it to the file you'd like to import and select one of the sheets. (You will see later that it actually doesn't matter what Excel file you point it to, as the file name and sheet name will be updated.) Select "Full Path" as the value for "Output File Name as Field". This is to ensure the sheet names are included in the output.

Creating a batch macro step 1

Then drag a Control Parameter tool onto the canvas and connect it to the Input tool with the lightning bolt (see Figure 2).

Notice here we want to update two fields in the Input tool – File name and Sheet name, so we need to add a second Control Parameter and connect it to the same Input tool. Name the two Control Parameter tools "File name" and "Sheet name", respectively, so that you can distinguish the input fields later. Now connect the output to a Macro Output.

Action tools automatically generated after connecting control parameter tools to input too

 (Now if you check the Workflow Type, you'll see it's automatically changed from Standard Workflow to Batch Macro.)

Workflow type changed from standard workflow to batch macro automatically

1.2 CONFIGURE INPUT

Click the Action tool connecting Control Parameter for File name ( in Figure 2). Select "Update Value (Default)" in the "Select an action type" dropdown. In the window shown of Figure 4 below, select File – value ="..." and then tick the box at the bottom that says "Replace a specific string:". Remove the sheet name in that string, which is everything from the pipe till the end of the string after the file name. It means only the file path will be updated by this Parameter Control.

Configuring the action tool for 'File name'

Now click the Action tool connecting Control Parameter for Sheet name ( in Figure 2 above). Select "Update Value (Default)" in the dropdown. In the window shown on Figure 5 below, select File – value ="..." and then tick the box at the bottom that says "Replace a specific string:". Only keep the sheet name in that string, which is the text in between ' and $' after the pipe at the end of the file name (in this example the sheet name is Austria). It means only the Sheet name will be updated by this Parameter Control as the Batch Macro loops through the multiple sheets.

Configuring the action tool for 'Sheet name'

1.3 CONFIGURE OUTPUT

Now go to View -> Interface Designer -> Properties (left hand side panel). As illustrated in Figure 6, in "Output Mode", choose "Auto Configure by Name..." or "Auto Configure by Position..." to union output from each sheet by field names or field position at the end of the iterations. Once you've done this, it will not complain about different schema in different sheets or files. This is the key for combining tables with different schema.

Figure 6. Configuring output to combine tables with different schema

Note: "Auto Configure by Name" requires the headers to be the same across all sheets/files. If the headers are different but the tables have the same structure across sheets/files (eg. Date is always in column A), use "Auto Configure by Position". In the batch macro, add a select tool to move FileName to the top to make sure the FileName column is always at the same position.

Add select tool to move 'File name' to the top if union by position

Save the Batch Macro and it's good to go!

Step 2. Use the Batch Macro in a standard workflow

The standard workflow I created has two parts – import & combine data, and extract the sheet names.

Two parts of the standard workflow – importing & combining data and extracting sheet names

As shown in Figure 9 below, create a new workflow with an Input file pointing at the file you'd like to import. Choose "<List of Sheet Names>" for "Table or Query", and "Full Path" for "Output File Name as Field".

Configuring input for the file to to be imported

The "Sheet Names" column generated will be used as the input field for Sheet name in the Batch Macro. To get the input for File name, use Text to Columns (| as delimiter, split into 2 columns) to extract the file full path from FileName column. I then renamed the FileName1 column "File Name", which we will use as input for File name in the Batch Macro.

Next, connect the Batch Macro. From the dropdown lists select the correct input field for File name and Sheet name (see Figure 10).

Connecting the Batch Macro to input fields

The workflow after that is basically a standard Alteryx workflow to extract and add the sheet names back to the dataset in an extra column (Country in this case). Now you have a reusable Batch Macro and a repeatable workflow to import multiple Excel tables even if they have different schema.

Additional notes:

In the example above, I've shown how to import multiple sheets with different schemas. To adjust the Batch Macro in order to import multiple files, simply remove one of the Parameter Control (see Figure 11). And in the Action configuration for the remaining Parameter Control, select "Update Input Data Tool" in the "Select an action type" dropdown. 

Adjusting 'Action Configuration' in the Batch Macro to import multiple files

In the standard workflow, use a Directory tool to obtain full paths for all files you want to import and use that as the Batch Macro input field (see Figure 12).

Use the batch macro to import multiple files in a standard workflow

Note: this only works when you have only one sheet in every Excel file. If the Excel files have multiple sheets, you'll have to create full path that includes file path and sheet name in the format of File Path|'Sheet Name$'.

There you have it, a practical solution to import multiple Excel sheets and files regardless of different schemas all at once!

You can download the packaged workflow here

For more tips on using Alteryx, check out our blog and online resources.

Bingqian Gao

About the author

Bingqian is a Consultant at Concentra. She works with data across various industries including Consumer Goods, Telecom & Technology, Healthcare and Services. She enjoys finding solutions for data challenges in areas like data consolidation, modelling, and visualisation with tools such as Alteryx, Tableau, OrgVue and Excel. She also believes in the power of analytics in uncovering insights and helping to better inform decision making.

Subscribe to our newsletter

Get in touch
Get the edge

Go to webform

Subscribe to our newsletter

Get in touch
Get the edge

Go to webform