How to automatically prepare and publish Tableau extracts with SSIS
There are number of ways to refresh the data connected to your Tableau dashboards; for instance refreshing manually or setting up a schedule. However, if you depend on an ETL to make the latest data available and you want to minimise data latency these may not be efficient approaches. In this blog I will outline your options, and how to use the tabcmd tool in Tableau.
Say your ETL runs overnight and finishes between 4 and 6 am, when you’re dealing with big dashboards your Tableau refresh will take 1 to 2 hours to obtain all data needed.
OPTION 1: is to run the refresh manually when you get to work early in the morning. But then users need to wait until Tableau finishes importing the data.
OPTION 2: is to program a Tableau schedule to run at 5 or 7 am. However, there is no guarantee that your ETL will finish before then. Fortunately, Tableau Server provides an option 3!
OPTION 3: tabcmd. Tabcmd is an administration tool included on Tableau Server. It allows some very useful command line functionality to apply some basic but important administration tasks remotely on any accessible Tableau Server. In a nutshell, tabcmd allows you to manage sites, groups, users, run Tableau schedules, publish dashboards and execute extracts refresh on workbooks or data sources.
Examples of how to use tabcmd can be found on Tableau Server website, but in this blog I’ll show the necessary steps needed to refresh a dashboard from an SSIS package that runs on a different Server with some different configurations.
1. Tabcmd installation program can be found on your Tableau Server folder.
Tabcmd can be installed on any folder but its path may need to be added to the PATH windows environment variable manually.
2. In order to refresh a workbook, on a command prompt window:
- Login – see the example below using Active Directory
- tabcmd login -s http://localhost:8000/ -u "domain\user" -p password
- Refresh – see the example below using your URL
- Tabcmd tabcmd refreshextracts – url "My Workbook" –synchronous
- Synchronous will guaranty that the SSIS task finishes after the refresh is complete.
- Include the commands on a batch file and then test the connectivity
3. Once you have tested the connectivity, you can then integrate it on SSIS on an “Execute Process Task” embedded on a batch file
Since all the steps can be embedded on a simple batch file you can include them on any ETL tool that allows interaction with Windows. The main advantage of this approach is an increased data availability, as you can now integrate Tableau into your ETL process.