While Alteryx provides a wide range of out-of-the-box predictive analytics, savvy users can stretch this boundary using the R Tool for custom analysis. R is the statistical computing language that powers Alteryx’s built-in predictive analytics. To build your own predictive tool, you can use the R Tool (found in the Developer Toolbox) to author R code.
We have found that even if you have a deep understanding of R, the interface between Alteryx and R comes with its own challenges. To start overcoming these, we have put together a series of blog posts. This first one looks at how to configure input for the R Tool.
This post expects familiarity with coding terms like assignment statements, conditional statements, loops, and objects, although not necessarily in the R context. When I reference an R-specific function or object, I will link it to helpful materials for the non-R-programmer.
The R Tool is found in the Developer tab of the Alteryx toolbox. If you add this tool to an existing workflow and double-click it to open the Configuration pane, you will be greeted by a very blank (and for the non-R developer, very daunting) text box. Where do I start?
Fortunately, Alteryx provides some snippets of code that you can add using the “Insert Code” drop-down menu (top left). These snippets will help you input and output data, charts, and error messages.
The R Tool can accept many input datasets. Let’s start with handling a single dataset first. The R Tool numbers the datasets by default, so the first dataset you connect to the tool will be called “#1.” Want to change this? Rename the connector by double-clicking on it. In its Configuration pane, type your choice into the top input text box called “Name.”
The simplest way to input data
1) Drop down “Insert Code”
2) Choose “Read Input: #1”. Note that if you renamed your input, this will say “Read Input: [Whatever I chose as my new label]”.
3) Choose “As Data Frame”.
Result? You will get your first line of R code:
This is excellent, except for the fact that it doesn’t actually save your data for later use! To actually make use of the data, you will need to supplement this line with an assignment statement:
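As a minimal sketch, assuming the inserted snippet is a call to read.Alteryx with mode = "data.frame", the difference between the bare call and the assigned version looks like this. The stand-in definition and its sample records are hypothetical and only exist so that the lines can be tried outside Alteryx.

```r
# Stand-in so the sketch also runs outside Alteryx. Inside the R Tool,
# read.Alteryx() already exists and this definition is skipped.
if (!exists("read.Alteryx")) {
  read.Alteryx <- function(name, mode = "data.frame") {
    # Hypothetical sample records, not real Alteryx data
    sample.data <- data.frame(id = 1:3, value = c(10.5, 20.1, 30.7))
    if (mode == "list") as.list(sample.data) else sample.data
  }
}

# The bare call reads input #1, but the result is not kept in any variable.
read.Alteryx("#1", mode = "data.frame")

# Assigning the result keeps the data available for later steps.
My.Dataset <- read.Alteryx("#1", mode = "data.frame")
```

The name My.Dataset is arbitrary; any valid R object name works on the left-hand side of the assignment.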
Now you have asked the tool to save your data as My.Dataset, which is a Data Frame object. In my experience, this is the easiest format of data to use in R, as it is exactly how you see your data in Alteryx – rows and columns. Now, you can apply R functions to My.Dataset.
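To illustrate what applying R functions to a data frame can look like, here is a short sketch using base R only. The region and sales columns are hypothetical, and My.Dataset is built by hand here in place of a real Alteryx input.

```r
# Hypothetical stand-in for the data frame read from input #1.
My.Dataset <- data.frame(region = c("North", "South", "East"),
                         sales = c(120, 95, 143))

row.count <- nrow(My.Dataset)          # number of records
column.names <- names(My.Dataset)      # field names, as they appear in Alteryx
mean.sales <- mean(My.Dataset$sales)   # summarise a single column

# Keep only the rows that meet a condition.
top.rows <- My.Dataset[My.Dataset$sales > 100, ]
```

Columns are addressed with the $ operator and rows with a logical condition inside square brackets, which mirrors the rows-and-columns view of the data in Alteryx.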
What about the other input options?
1) “Read Input: #1” + “As List”:
Instead of a Data Frame object, now you get a List object. This is just another way of storing and referencing the data. I find it more cumbersome, probably because I use it less frequently. Same issue applies here; you will need to add your assignment statement to save the data.
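The sketch below shows how a List differs from a Data Frame when you reference the data. It assumes that list mode yields one element per column, which is imitated here with as.list() on a hypothetical data frame; inside the R Tool the assignment would instead be along the lines of My.List <- read.Alteryx("#1", mode = "list").

```r
# Hypothetical stand-in for the records arriving on input #1.
My.Dataset <- data.frame(id = 1:3, value = c(10.5, 20.1, 30.7))

# Assumption: list mode gives one list element per column, as as.list() does.
My.List <- as.list(My.Dataset)

first.column <- My.List[[1]]        # reference an element by position
value.column <- My.List[["value"]]  # or by field name
field.count <- length(My.List)      # number of columns, not number of rows
```

Note that length() on the list counts fields, whereas nrow() on the data frame counts records, which is one reason the list form can feel more cumbersome.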
2) “Read Input: #1” + “As Data Frame: Chunked”:
If you are using large datasets, this provides a while-loop that will input the data in chunks. The loop body contains #write.Alteryx(a, 2), a commented-out output statement.
If you try running this, it will break! Why? Because the code doesn’t know how many chunks of data you will have. Alteryx has added !is.null(a) as a generic stopping rule. You will need to set up looping logic of your own. The following is an example that tests to see if enough records remain to create a full batch and then stops the process when records run out:
dim.data = dim(data) # Get the dimensions of the data (rows, then columns)
num.rows = dim.data[1] # Get the number of rows
# If not enough records remain, set a to NULL (this will trigger the end of the while-loop)
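The stopping rule described above can be sketched end to end as follows. This is an illustration under stated assumptions, not the snippet Alteryx generates: read.next.chunk(), input.data, and chunk.size are hypothetical stand-ins for the chunked reader and its input, so the loop can be run outside Alteryx.

```r
# Hypothetical stand-in for the Alteryx input stream: 10 records.
input.data <- data.frame(id = 1:10, value = (1:10) * 1.5)
chunk.size <- 4
position <- 0

# Hypothetical stand-in for the chunked reader. It hands back up to
# chunk.size rows per call, and zero rows once the data is used up.
read.next.chunk <- function() {
  remaining <- nrow(input.data) - position
  take <- max(min(chunk.size, remaining), 0)
  rows <- input.data[position + seq_len(take), , drop = FALSE]
  position <<- position + take
  rows
}

chunks.read <- 0
rows.read <- 0
a <- read.next.chunk()
while (!is.null(a)) {
  dim.data <- dim(a)        # dimensions of the chunk: rows, then columns
  num.rows <- dim.data[1]   # number of rows in this chunk
  if (num.rows > 0) {
    chunks.read <- chunks.read + 1
    rows.read <- rows.read + num.rows
    # Process the chunk here.
  }
  if (num.rows < chunk.size) {
    a <- NULL               # not a full batch, so no records remain: end the loop
  } else {
    a <- read.next.chunk()
  }
}
```

Setting a to NULL is what satisfies the !is.null(a) condition and ends the loop. A chunk shorter than chunk.size is still processed before the loop stops, so no trailing records are lost.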
The bottom line is that the best way to handle data in R is to use data frames. We can now read Alteryx data into the R Tool in this format.
*Header image of this blog is credited to Alteryx