Avoid free text when you can (and you almost always can!)

Free text fields should be reserved for first and last names or for notes that are meant for study management rather than analysis (e.g., “This participant doesn’t answer phone calls, only emails,” or “Prefers Friday morning visits”). It’s easiest to analyze data that is structured and entered in standard formats. While you may sometimes need to collect free text as part of a survey (e.g., to have a respondent explain why they liked or disliked an experience), keep in mind that this unstructured data will not be analyzable without special training and software, or hours to devote to coding.

Don’t think that means you can’t use text box fields in REDCap, however. Text box fields are not just for free text. REDCap text boxes are very frequently used with special restrictions in order to get structured data. The options for these restrictions are Validation and Biomedical Ontology lookups. Use them whenever you can to get the type of data you need for analysis.


The most commonly used text field validation options are:

  • Dates/datetimes
  • Email address
  • Numbers or integers
  • Phone numbers
  • ZIP codes

By choosing one of these validation options, you’re guaranteeing that the data entry person or survey respondent will enter data in the correct format. This saves you from hours of data cleaning before any analysis can take place.
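To see why a fixed format saves cleaning time, here is a minimal Python sketch of the kind of check a validated field performs. It is only an illustration, not REDCap's own implementation; the function names and the sample entries are hypothetical, and the ZIP code pattern assumes US-style codes.

```python
import re
from datetime import datetime


def is_valid_date_ymd(value: str) -> bool:
    """Accept only real calendar dates written exactly as YYYY-MM-DD."""
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
        return False
    try:
        datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        # Right shape, but not a real date (e.g. February 30th).
        return False
    return True


def is_valid_integer(value: str) -> bool:
    """Accept whole numbers only, with an optional leading minus sign."""
    return re.fullmatch(r"-?\d+", value) is not None


def is_valid_zipcode(value: str) -> bool:
    """Accept 5-digit or ZIP+4 codes (assumes US-style ZIP codes)."""
    return re.fullmatch(r"\d{5}(-\d{4})?", value) is not None


# The same visit date as four people might type it into an unrestricted text box.
free_text_entries = ["2023-02-14", "Feb 14th", "14/2/23", "2023-02-30"]
accepted = [entry for entry in free_text_entries if is_valid_date_ymd(entry)]
print(accepted)
```

Only the first entry passes. With validation turned on, the other three would be rejected at entry time instead of surfacing later during data cleaning.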

REDCap text validation

Biomedical Ontologies:

Another structured data option for text fields is to use the Biomedical Ontology lookup. This allows you to connect to an outside ontology so that as you start typing a response, a standardized set of answers from the ontology is suggested for you to choose from.

REDCap Ontology

Some of the most popular ontologies are ICD-10-CM, RxNorm, and SNOMED CT. Check these out: there are over 400 different options, and some might surprise you (like the Zebrafish Anatomy & Development Ontology).
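The type-ahead behavior can be pictured with a small Python sketch. This is a toy stand-in, not how REDCap queries an ontology service: `MINI_ONTOLOGY` and `suggest` are hypothetical names, and the three ICD-10-CM entries are a hand-picked sample rather than a real lookup.

```python
# A tiny, hand-picked stand-in for an ontology (code -> standard label).
# A real lookup would query an external ontology service instead.
MINI_ONTOLOGY = {
    "E11.9": "Type 2 diabetes mellitus without complications",
    "I10": "Essential (primary) hypertension",
    "J45.909": "Unspecified asthma, uncomplicated",
}


def suggest(typed, ontology=MINI_ONTOLOGY, limit=5):
    """Return up to `limit` (code, label) pairs whose label contains the typed text."""
    needle = typed.strip().lower()
    if not needle:
        return []
    matches = [
        (code, label)
        for code, label in ontology.items()
        if needle in label.lower()
    ]
    return matches[:limit]


print(suggest("diab"))
```

Whatever the person types, the stored value is one of the standard codes, so “diabetes type 2,” “T2DM,” and “type II diabetes” never end up as three different answers in your dataset.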

Multiple choice, checkbox, Yes-No, True-False, and Slider Fields:

All of these field types are considered structured data and, if set up well, will collect clean data for analysis. It’s common to have circumstances in which you can’t cover 100% of potential answer choices in a multiple choice question; that doesn’t mean you have to allow free text for that field. You can still use multiple choice or checkbox fields as long as you offer an answer choice called “Other.” A safe bet is to cover about 80% of the possible answers within the multiple choice options, then collect the remaining 20% under the “Other” option. If someone chooses “Other,” you can then follow up with a text box field using branching logic to ask “If other [medication, surgery, preferred Caribbean Island, etc.], please specify.”
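One way to apply the 80/20 rule of thumb is to tally answers from a pilot round and keep the most common ones until they cover about 80% of responses. The Python sketch below does that, then builds a choice list and a branching-logic expression for the follow-up text box. The field name `pain_med`, the pilot answers, and the code 99 for “Other” are all hypothetical. The `code, label | code, label` layout and the `[field] = 'code'` expression follow the data dictionary conventions as I understand them, so confirm them against your own REDCap instance.

```python
from collections import Counter

OTHER_CODE = 99  # hypothetical code reserved for the "Other" choice


def pick_choices(pilot_answers, coverage=0.80):
    """Keep the most common answers until they cover `coverage` of all responses."""
    counts = Counter(answer.strip().lower() for answer in pilot_answers)
    total = sum(counts.values())
    chosen, covered = [], 0
    for answer, n in counts.most_common():
        if covered / total >= coverage:
            break
        chosen.append(answer)
        covered += n
    return chosen


# Hypothetical pilot responses to "Which pain medication do you take?"
pilot = ["aspirin"] * 5 + ["ibuprofen"] * 3 + ["naproxen"] + ["celecoxib"]
choices = pick_choices(pilot)

# Build the choice list, with "Other" as the catch-all for the remainder.
labels = [choice.capitalize() for choice in choices]
choice_string = " | ".join(f"{i}, {label}" for i, label in enumerate(labels, start=1))
choice_string += f" | {OTHER_CODE}, Other"

# Show the "please specify" text box only when "Other" is selected.
field_name = "pain_med"  # hypothetical variable name
branching_logic = f"[{field_name}] = '{OTHER_CODE}'"

print(choice_string)
print(branching_logic)
```

Here aspirin and ibuprofen account for 8 of the 10 pilot answers, so they become the listed options and the rarer answers fall under “Other.”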

Takeaway: Data meant for analysis should not normally be free text. There are ways to enforce structure and standardize your data entry, like using validation and ontologies (in order to make text fields more structured) and using non-text field types such as checkboxes. Use these methods so you can avoid expensive data cleaning.

