Alteryx How to: load multiple tabs from Excel
To find out more about Concentra's bespoke Analytics Services and Solutions see Services.
Let's take a look at some of the features that may not be glaringly obvious when you first use Alteryx, but that will increase productivity, save time and allow you to show off to your colleagues when creating your repeatable workflows.
Connecting to multiple sheets in excel
If you are connecting to an Excel file with multiple sheets (as highlighted in red in the screenshot) with the same structure (highlighted in green), there's no need to add multiple input tools and union the data. This can be achieved with just one Input tool combined with the Dynamic Input tool; let's show you how.
Connect to your Excel file using the Input tool as normal. When selecting the sheet name, choose <List of Sheet Names> (figure below). This will return all of the sheets within your Excel file.
Next, you can connect a Dynamic Input tool; here you'll have to point it at a template to let Alteryx know what structure of the file to expect. I suggest you point the tool at the same file, this time pointing at one of the sheets to act as the template.
Finally, configure the Dynamic Input tool to use the SheetNames column from the information we captured using the input tool. In this example, I've used the action to "Change File / Table Name."
Add a browse to your workflow, to see your data unioned.
With this approach, you can go from having any number of Input tools down to one concise workflow. Plus, if your source spreadsheet changes with an additional years' worth of data, this approach will automatically pick up the change and load it.
Points to note
Make sure your schema is the same on each tab, as Alteryx will warn you if there is a mismatch between your inputs by column position, but will continue to load if they have the same number of columns. Therefore, Alteryx will union columns even if they are mismatching in position, but will fail the load if there is a mismatch in the number of columns and provide you with a warning in the output window.
Named ranges appear as sheet names, and Alteryx will attempt to load named ranges as well as the sheet. As above, if the schema doesn't match, expect a failure. Yet, if the schema does contain the same number of columns, you may get duplicated data in your output.
For a simple workaround to import multiple sheets with different schemas, read this blog.
Connecting to multiple files with one input
In a similar vein to connecting to multiple sheets, Alteryx can also process multiple files with one input tool. This is achieved through the use of wildcards when connecting to files. Let's take a look at an example with my three files below:
- Without knowing this top tip, you would expect a workflow that has three input tools, all flowing into a union tool, similar to the following screenshot
To achieve this, simply edit the connection string for your file to contain either:
- An asterisk (*) to match zero or more variable characters
In the above example, any file with the prefix of "San-Francisco-" in your directory will be picked up by the load process.
- A Question mark (?) to match a single variable character
In the above example, any file with the prefix of "san-francisco-201" with a compulsory one additional character in your directory will be picked up in the load process
Points to note
The structure of the file must remain the same for Alteryx to load all files with one input tool.
Connecting to Named Ranges
As we touched upon earlier, named ranges within an excel file will show up as a sheet when you first connect to the file. Simply connect to your excel file as normal and select the named range.
This scenario will be useful if you are trying to read in data from an Excel file, where the column headers and data do not start from the first row.
Connecting to Specific Cells in Excel
Another useful tip to have in your arsenal is connecting to a specific set of manually specified cells; this gives you the ability to cut down rows and columns before bringing them into memory, or, similarly to named ranges, when the data in your excel file doesn't start from the 1st row.
To achieve this in Alteryx, connect to the excel file as normal. After selecting a sheet name, click on the SQL Editor Tab, and add the cell range in the format of CellStart:CellEnd at the end of the statement, but within the single quotes as shown below. Be sure to Test your query before clicking OK and previewing your results.
I hope that this blog was useful.