We’ve talked before on this site about the usefulness of APIs (Application Programming Interfaces) in data analysis. Let’s apply those benefits to data stored in REDCap!

API Overview

There are two principal benefits to using APIs:

  • Data freshness
  • Scripted reproducibility

Data Freshness

With API calls, you get your data fresh every time, instead of relying on potentially stale downloads. Reaching into your file system to get a .csv or .json file to analyze can be tricky if you have multiple versions, and it’s easy to do analysis on an obsolete file if you’re not careful. For example, let’s say you have to run some analysis on data you’re collecting in REDCap, and you want to re-run this analysis every couple of weeks to see the latest figures. One way to do that is to manually export data to a .csv and save it to a file that your R or Python script will then open and work with. But REDCap likes to download files with a date stamp as part of the file name, so you have to remember to either change the name of the file to whatever standard file name you want your script to use, or change the file name each time in your script. You may also end up collecting multiple .csvs in various directories, each of which has a particular version of the data in REDCap. This can easily become overwhelming and cause confusion.

What’s a better approach? Reach into the REDCap database directly each time you run your script, so that you know you’re using the most up-to-date data. Read on to learn how to do this!

Scripted Reproducibility

Another problem with using downloaded files from REDCap is that this method requires unscripted, point-and-click manual work. If you were to document this carefully, you’d have to give several steps, like where to log in to REDCap, which project to use, the .csv download settings (which fields or forms to download), file naming conventions, and where to put the file. Most of us don’t go into this level of detail in our manual workflows, for good reason! It’s tiresome, and we know that sometimes things change in the look and feel of a website, so including screenshots and detailed instructions about where to look for a link or how to highlight multiple fields is a lot of work for something that might have slightly different steps next week or next month.

A better approach is to use a script that uses an API call. First of all, it’s scripted, which means no manual steps to write up in a Word document or add to a GitHub repo. Also, the typical API has a standard interface that will rarely, if ever, change. API access may improve over time, adding new features, but it’s very infrequent that an API will radically change and remove options, rendering your script unusable. The same half-dozen lines of code you use to access your data will almost always be stable for months or years, and if you do need to change it, you’re only changing that small chunk of code, instead of a step-by-step document with words and images that guide a manual effort.

API Keys

APIs are automated, which means they can’t rely on you logging in manually with a user name and password. API calls have to run without human intervention, so you need to provide your script with credentials that show you are allowed to see the data you’re accessing. But obviously you don’t want to put your user name and password in a script. Your user name and password open a lot of doors at CHOP, from your email to your payroll information to Epic. You want to isolate just your access to this particular data, and your all-powerful login grants far more access than a REDCap script needs. What if your credentials fell into the wrong hands, because they were in a script on a drive that many people have access to?

This is where API keys or tokens come in. API credentials give very specific access to very specific things, and can be regenerated easily if you suspect they may have been lost or misused. If API credentials do fall into the wrong hands, it’s not great, but it’s much better than accidentally sharing your CHOP username and password! Different APIs have different methods for giving you the key to your data or application, and different methods for supplying that credential when your program tries to make contact. Here, we’ll talk specifically about how REDCap handles this.

In REDCap, the project owner (or someone with the ability to edit user rights) has to explicitly give permission to use the REDCap API. In fact, when you create a new REDCap project, REDCap does not provide you with these permissions, even though you’re the project owner. Open one of your REDCap projects and check the menu of options on the left hand side of the screen. Chances are, you won’t see anything that says “API”. If that’s the case, don’t worry. We’re going to take a project you own (or, you could quickly create a project and put some fake data in it) and walk you through how to give yourself API access.

Giving Yourself Access

Pick a REDCap database that belongs to you or in which you have the ability to change user rights. When you go into User Rights, look up your name (or your role) and edit the user rights. Give yourself both kinds of API access: API Export and API Import/Update, as shown in the image below.

Screenshot of REDCap User Rights

Then hit “Save Changes”. Refresh your browser (reload the page) so that the project reflects your new permissions. Now, on the left hand side of the project, you should see something new under “Applications” – “API” and “API Playground”.

Screenshot of REDCap Applications

So now you have the right to use the API, but you can’t start using it yet. You have to generate an API token. Click on the “API” application and then click on the button to “Generate API Token.”

Screenshot of API Token Request

The API token is unique to the combination of user and project. It’s a code that allows access only to the data in a single project and only the data that the person who generated the API token is allowed to see. Importantly, if you feel like you may have accidentally given your API token away, it’s a good idea to regenerate it, which is a single-click operation. It makes your old token invalid and creates a new token.

Soon, your API token will show up in that same “API” Application. It will look something like this (note, I’ve obscured my token – sharing this in a post like this one would be a terrible idea!).

Screenshot of API Token

You’ll use this token when you request data in your R or Python script.
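One way to keep the token out of the script itself is to read it from an environment variable when the script runs. Here is a minimal Python sketch of that idea. The variable name REDCAP_API_TOKEN and the helper load_redcap_token are my own inventions for illustration, not anything REDCap defines.

```python
import os


def load_redcap_token(env_var="REDCAP_API_TOKEN"):
    """Return the API token stored in an environment variable.

    The variable name is an assumption made for this sketch; use whatever
    name fits your own setup.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        # Fail early with a clear message instead of sending an empty token.
        raise RuntimeError(
            f"Set the {env_var} environment variable to your REDCap API token."
        )
    return token
```

With this in place, the script you share or commit never contains the token; each person who runs it supplies their own.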

Rehearsing your API Calls in the API Playground

So, you have some data in a REDCap database, and you have an API key that gives you access to that data. How do you use this key, though? Learning how to use an API can be tricky, because it requires you to pass not only the key, but your actual request (like “give me only the field beck_depression_inventory” or “give me only the form called inclusion_criteria”). Luckily, REDCap provides an “API Playground” that not only shows you what the API can do, but also gives you a good start on the actual code. Let’s take a deeper look.

The API Playground has several parts:

  • The menu-driven selection box
  • Raw Request Parameters
  • Response
  • Code Examples

Let’s look at how each part of the API Playground helps you learn the API.

In your REDCap project, head over to the “API Playground” – click on that phrase in the Applications pane on the left. At the top, you’ll see the menu-driven selection box. The first selection you have to make is API Method. For now, choose “Export Project Info”, and in the next two menus, choose “CSV”.

Screenshot API

The “Raw Request Parameters” box below your selection will change to reflect whatever you chose. Here’s an example of what you should see (note, I’ve obscured my actual key):

Screenshot API

This “Raw Request Parameters” gives you a quick look at the information you’re passing to REDCap, to make sure you selected what you really wanted in the menu part above.
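If it helps to see those parameters as code, here is a rough Python sketch of the same “Export Project Info” request expressed as a dictionary, along with the form-encoded text that ends up being sent. The parameter names below are what I’d expect the Playground to show for this method, and the token is a placeholder; if your Playground shows something different, trust the Playground.

```python
from urllib.parse import urlencode

# Placeholder token; never paste a real token into a shared script.
payload = {
    "token": "YOUR_TOKEN_HERE",
    "content": "project",    # assumed value for "Export Project Info"
    "format": "csv",         # format of the data that comes back
    "returnFormat": "csv",   # format of any error messages
}

# Form-encode the parameters the way an HTTP POST body carries them.
encoded = urlencode(payload)
print(encoded)
# token=YOUR_TOKEN_HERE&content=project&format=csv&returnFormat=csv
```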

Look further down the page, in the “Response” area, and click on the button that says “Execute Request”. You might get a “waiting” spinner, and then a box will appear below the button with the data that REDCap returned from your request.

You requested your data to be in a .csv, so you should get some data that’s “comma separated” – a bunch of fields that are separated by commas, with each new line of data being separated by a line break. In the “Response” box, all of this is presented in plain text, not in a table, so it might look confusing or overwhelming. If you want to, you can copy that plain text and paste it into a text editor, saving it with the .csv extension. That will allow you to then open it in Excel to see if the .csv is what you intended. Below, here’s the “before” (comma separated text in plain text) and the “after” (text saved as a .csv and then opened in Excel) of one of my own projects:

Screenshot of unformatted CSV

Screenshot of CSV in Excel

Below that box, you’ll see an HTTP status. You want that to be 200, which means no errors occurred.
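When you move from the Playground to a script, you can check that same status number in code. The helper below is a small sketch of my own (describe_status is a made-up name) that uses Python’s standard library to turn a status code into a readable message. In a script that uses requests, the number to pass in would be the response’s status_code attribute.

```python
from http import HTTPStatus


def describe_status(status_code):
    """Return a short, readable summary of an HTTP status code."""
    try:
        phrase = HTTPStatus(status_code).phrase
    except ValueError:
        # The code is not one the standard library knows about.
        phrase = "Unknown status"
    # Anything in the 200 range counts as success.
    outcome = "success" if 200 <= status_code < 300 else "problem"
    return f"{status_code} {phrase} ({outcome})"


print(describe_status(200))  # 200 OK (success)
print(describe_status(403))  # 403 Forbidden (problem)
```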

This is an example of “trying out” the API without having to write code. You experiment with various methods, like “Export Records”, to get data, add the appropriate criteria, such as the form(s) you want to download, and set some parameters for how you want REDCap to give you that data back (for example, in .csv format, or in .json format). In some API methods (like “Export Records”), you’ll have several drop down menus you can choose from, including forms and fields. You can make multiple choices by using control or command + click, or, in some cases, if you want every item listed (say, you want every form and every field), you don’t click anything at all, and REDCap assumes you mean everything. Practice makes perfect! Go ahead and try out a few API calls using the Playground, including trying to access the data you want to pull into an analysis script. Once you feel comfortable with the various API methods and their output, go ahead to the next section.
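To make the idea of multiple forms and fields a bit more concrete, here is a hedged Python sketch of how an “Export Records” request might be assembled. The function name is hypothetical, and the parameter names (including the numbered fields[0] and forms[0] style keys) are my assumption about what the Playground generates. Compare against your own Raw Request Parameters before relying on it.

```python
def build_export_records_payload(token, fields=(), forms=()):
    """Build the parameters for an "Export Records" request.

    Leaving fields and forms empty mirrors clicking nothing in the
    Playground, which REDCap treats as "give me everything".
    """
    payload = {
        "token": token,
        "content": "record",   # assumed value for "Export Records"
        "format": "csv",
        "type": "flat",        # assumed: one row per record
        "returnFormat": "csv",
    }
    # Each selected field or form gets its own numbered key.
    for i, name in enumerate(fields):
        payload[f"fields[{i}]"] = name
    for i, name in enumerate(forms):
        payload[f"forms[{i}]"] = name
    return payload


example = build_export_records_payload(
    "YOUR_TOKEN_HERE",
    fields=["beck_depression_inventory"],
    forms=["inclusion_criteria"],
)
print(example["fields[0]"])  # beck_depression_inventory
```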

Writing Code to Access Data

So, you’ve played around in the API Playground and you think you know the kind of API call you want to make. Now, how do you automate this so that your R or Python code issues exactly the same request you just made using menus? REDCap’s API Playground comes to the rescue again, with code snippets that show how to do automatically what you just did manually.

Let’s take a look at some simple code snippets for a really easy REDCap API request – “Export Project Info.” That just gives me some metadata about a REDCap project, like the owner of the project and the project number. As before, I’ve chosen the API Method and set both the output and errors to be .csv files. When I do this, some code snippets are generated below. Let’s look at the Python and R versions (my token is obscured for obvious reasons!).

Here’s Python:

Screenshot of API code

And here’s what R looks like:

Screenshot of API code

With these code snippets, we’re most of the way there to data access. There’s a little bit of tweaking that we need to do, but not much!

Let’s try that code in an actual script, shall we?

Python

At the time of writing, the code that REDCap suggests for Python takes a somewhat kludgy approach based on Python 2. We should be using Python 3 at this point, so I suggest a different approach using the requests library instead. Take the following template and replace text in two places: data = … should be replaced with whatever REDCap suggests for data = …, and the “https://redcap.chop.edu/api/” address should be replaced with whatever REDCap suggests as the REDCap URL for APIs (take this from the setopt … URL line). It’ll probably be the same, but if it differs, trust what REDCap suggests.

Template:

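import requests

# Replace this dictionary with whatever REDCap suggests for data = ...
data = {
   # Some stuff here
}

# Replace the address with the API URL REDCap suggests, if it differs
r = requests.post("https://redcap.chop.edu/api/", data=data)
r.text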
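If you’d like something a little more defensive than the bare template, here is one possible sketch that wraps the request in a function, sets a timeout, and raises an error when REDCap returns a non-success status. The function name fetch_redcap_text and the post argument are my own additions for illustration; the post argument exists only so the function can be tried out with a stand-in instead of a live REDCap server.

```python
def fetch_redcap_text(payload, url="https://redcap.chop.edu/api/",
                      post=None, timeout=30):
    """POST the payload to the REDCap API and return the response text."""
    if post is None:
        # Imported here so a stand-in can be passed in without requests.
        import requests
        post = requests.post
    response = post(url, data=payload, timeout=timeout)
    # Raise an exception for 4xx/5xx statuses rather than returning
    # an error message that looks like data.
    response.raise_for_status()
    return response.text
```

In normal use you would call fetch_redcap_text(data) with the data dictionary from the template above and get back the same string that r.text holds.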

So here’s the “before” code that REDCap provides, and the “after” code I propose:

Screenshot of Python Code

What you see as a result of this is just a string that you get back (similar to the string you saw in the “Response” box when you were practicing in the API Playground). So, in a Python notebook, you might see something like this:

Screenshot of Python Results

How can we get that into a format we can work with? I suggest using pandas to make this string into a dataframe! We’ll use pandas and io to read that long string into a dataframe:

import pandas as pd
from io import StringIO
df = pd.read_csv(StringIO(r.text))
df

And this is what we find:

Screenshot of pandas df

Now, you’re ready to work with the data! Obviously, in a more realistic situation, where you’re importing research records, you’ll have more than just one row like I do here.
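If pandas isn’t available where your script runs, the same StringIO trick works with Python’s built-in csv module. This is just a sketch: sample_text below is made-up data standing in for r.text, not real output from any project.

```python
import csv
from io import StringIO

# Made-up stand-in for the string that r.text would hold.
sample_text = "project_id,project_title\n1234,Example Project\n"

# StringIO lets the csv reader treat the string as if it were a file.
rows = list(csv.DictReader(StringIO(sample_text)))

print(rows[0]["project_title"])  # Example Project
```

Each row comes back as a dictionary keyed by column name, with every value as a string, so you lose the type conversion that pandas does for you.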

R

The R code provided by REDCap is almost perfect … here’s how to get a data frame in R from the data you’re pulling in from REDCap. Run the code provided by REDCap (you can optionally remove the first line, #!/usr/bin/env Rscript, if you’re working directly in R or RStudio), and the print(result) command shows you the following. Again, it’s the string that makes up the .csv:

Screenshot of R result

Just like in Python, we have to bring that string into a data frame. To do that, we’ll replace the line that reads print(result) with these lines:

con <- textConnection(result)
df <- read.csv(con)
df

This gives you a data frame in R that you can analyze! See below, where I use RStudio’s View command to make the data frame appear in an easy to read format in a source pane:

Screenshot of code

Screenshot of data frame

Now you know how to reach in directly to REDCap to get data in or about your project and bring it into R or Python!