What is an API?
API stands for Application Programming Interface. It’s a way for people or machines to interact with software in a prescribed way. A common and fairly modern type of web API is based on the REST architecture; such APIs are often referred to as “RESTful APIs”. The details of REST go beyond the scope of this article, but in broad strokes a RESTful API is “resource-oriented”: URLs map to objects or resources that you can then interact with. It’s a request/response system that can be as simple or complex as needed. Some APIs also allow a client to send data, for instance, to add to a database.
Most well-crafted APIs will accept request parameters that include things like:
- What kind of data is being requested (e.g. what fields from a database)
- Which data items are being requested (e.g. by an identifier or resource name)
- The format requested (e.g. CSV or JSON)
- Any key and/or password required (e.g. IBM Watson or other for-pay or freemium APIs need to track how much of their services you’ve used)
- A data payload, if data is being uploaded
- Search terms, if applicable
- Ranges (like date ranges), if applicable
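As a rough illustration of how several of the parameters listed above might travel in a single request, here is a minimal Python sketch that assembles them into a URL query string. The endpoint `api.example.org` and every parameter name (`fields`, `id`, `format`, `token`, `start`, `end`, `q`) are hypothetical; a real API documents its own names.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration.
BASE_URL = "https://api.example.org/v1/admissions"


def build_request_url(base_url, **params):
    """Return base_url with params appended as a URL-encoded query string."""
    # Drop parameters left as None so they do not appear in the URL.
    query = {name: value for name, value in params.items() if value is not None}
    return f"{base_url}?{urlencode(query)}"


# All parameter names below are assumptions, not part of any real API.
url = build_request_url(
    BASE_URL,
    fields="date,count",   # what kind of data is requested
    id="influenza",        # which data items are requested
    format="json",         # the format requested
    token="YOUR_API_KEY",  # placeholder key
    start="2020-01-01",    # start of the date range
    end="2020-03-31",      # end of the date range
    q=None,                # no search term in this request
)
print(url)
```

Some APIs expect the key in a request header or in the body of a POST request rather than in the URL, so check the documentation of the API you are using.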
The API will either return the desired data in the desired format to the requesting (client) program, send a return code that means success (in the case, say, of adding data), or provide a descriptive and helpful error message. For an example of how an API works, a good resource is the “API playground” in REDCap. If you own a REDCap database, you can give yourself (or others) API access to request or alter data, and explore the API by using menus in the browser to see the various options.
You’ll get output that shows the payload of the data you’d get back from that request. There’s a helpful section below the menus that shows how various API calls would be rendered in different kinds of code (like R, Python, or PHP).
This is how many APIs (like IBM Watson, ProPublica, Twitter) train users in how to use them: they provide a user-friendly interface that requires little or no code up front while you learn them.
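The three outcomes described earlier (returned data, a bare success code, or an error message) can be sketched in client code roughly as follows. This assumes an API that uses HTTP status codes, where the 2xx range means success, and that sends data as JSON; the function name and the shape of the returned dictionary are illustrative only.

```python
import json


def interpret_response(status, body):
    """Classify an API response as data, a bare success, or an error.

    status is an HTTP status code; body is the response text (may be empty).
    Assumes that successful responses with a body contain JSON.
    """
    if 200 <= status < 300:
        if body.strip():
            # Success with a payload: parse and return the data.
            return {"kind": "data", "data": json.loads(body)}
        # Success with nothing to return, e.g. after adding a record.
        return {"kind": "success", "data": None}
    # Anything else is treated as an error; fall back to the status code
    # when the API sends no message.
    return {"kind": "error", "message": body.strip() or f"HTTP {status}"}
```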
Why use APIs?
They provide a structured, consistent way to carry out a process so that it can be automated and standardized. Imagine two different people doing the exact same data download task using a manual approach. They will almost certainly follow different steps and will likely get different results as well. An API defines and requires a specific structure for input and provides a specific structure for the output.
Specific API use cases might include:
- Getting the latest, most up-to-date data for an application that counts hospital admissions for influenza
- Requesting, on a news site, only news stories related to public health in Ghana
- Communicating between code, such as sending success messages or heartbeats
In all of these cases, you want to get predictable results using a method that’s easy to reproduce. Let’s concentrate on the “getting fresh data” use of an API. While many data-centric applications allow you to download data by using a form submission and then save it to your computer, that might not be the most useful approach when you work with data on an ongoing basis. If your data changes regularly, with more data being added, and if your data is small enough that it can be transmitted without much delay, it’s probably smart to add a few lines to your code that get the latest data, instead of depending on a potentially stale CSV in a folder on your computer. It’s also easier to make a reproducible script that runs and does everything from obtaining data to analyzing it and creating data visualizations, instead of relying on a punch list of instructions with manual steps like “download the data from …. Make sure you’ve checked the following boxes….”
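To make the “few lines that get the latest data” idea concrete, here is a minimal sketch using only the Python standard library. It assumes the API returns UTF-8 CSV with a header row; the URL in the usage comment and the `admissions` column name are hypothetical.

```python
import csv
import io
from urllib.request import urlopen


def fetch_rows(url, timeout=30):
    """Download CSV text from url and return it as a list of dicts."""
    with urlopen(url, timeout=timeout) as response:
        text = response.read().decode("utf-8")
    # The first row of the CSV is used as the dictionary keys.
    return list(csv.DictReader(io.StringIO(text)))


def total_admissions(rows, column="admissions"):
    """Sum a numeric column; the column name is an assumption."""
    return sum(int(row[column]) for row in rows)


# Example usage (hypothetical URL, so it is left commented out):
# rows = fetch_rows("https://api.example.org/v1/admissions?format=csv")
# print(total_admissions(rows))
```

Because the download happens every time the script runs, the analysis that follows always starts from the current data rather than from a saved copy.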
Arcus is developing a number of APIs that will make it simpler to do things like:
- Describe a clinical cohort and request clinical data about that cohort
- Download the latest version of a catalogued dataset
- Request information about available data