Race/Ethnicity data collection standards

If you are working on an NIH-funded study, you’re already familiar with the annual reporting NIH requests on the breakdown of your research subjects’ race, ethnicity, and sex/gender. Even if your project is not NIH-funded, it’s a good idea to use the NIH standards for collecting race and ethnicity data, because standardizing data collection across projects makes it easier to compare data across those projects.

In the US, race and ethnicity data are collected separately. That means asking a subject to self-report the racial categories to which they consider themselves to belong, and then collecting their ethnic category as Hispanic/Latinx or not. While NIH provides standardized options to offer your research subjects for self-report, these are not necessarily the only options. You can get more granular if necessary. Genetics research, for instance, might need greater detail about a subject’s ancestral origins, which could mean asking follow-up questions about Hispanic/Latinx ancestry or adding options for Ashkenazi Jewish or Anabaptist ancestry. At a minimum, race and ethnicity options suggested by the NIH should include:

Ethnicity (Select One)

  • Hispanic or Latino: A person of Cuban, Mexican, Puerto Rican, South or Central American, or other Spanish culture or origin, regardless of race. The term “Spanish origin” can also be used in addition to “Hispanic or Latino.”
  • Not Hispanic or Latino

Race (Select All that Apply)

  • American Indian or Alaska Native: A person having origins in any of the original peoples of North, Central, or South America, and who maintains tribal affiliations or community attachment.
  • Asian: A person having origins in any of the original peoples of the Far East, Southeast Asia, or the Indian subcontinent including, for example, Cambodia, China, India, Japan, Korea, Malaysia, Pakistan, the Philippine Islands, Thailand, and Vietnam. (Note: Individuals from the Philippine Islands have been recorded as Pacific Islanders in previous data collection strategies.)
  • Black or African American: A person having origins in any of the black racial groups of Africa.
  • Native Hawaiian or Other Pacific Islander: A person having origins in any of the original peoples of Hawaii, Guam, Samoa, or other Pacific Islands.
  • White: A person having origins in any of the original peoples of Europe, the Middle East, or North Africa.

Source: http://rethinkingclinicaltrials.org/resources/data-standards-for-recording-race-and-ethnicity/ Authors: Rachel Richesson, PhD and Michelle Smerek, BS on behalf of the NIH Collaboratory Phenotypes, Data Standards, and Data Quality Core. Version: 1.0, last updated May 1, 2014
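
If you clean or check your data outside of your data collection tool, the two lists above can be written down as a small lookup that enforces the "select one" and "select all that apply" rules. The following is a minimal Python sketch, not an official NIH or REDCap implementation: the category labels come from the lists above, while the names ETHNICITY_OPTIONS, RACE_OPTIONS, and validate_response are hypothetical and chosen only for illustration. How to treat an unanswered question is left to your project, so the sketch does not flag an empty race list.

```python
# Category labels are copied from the NIH-suggested lists above.
# All variable and function names here are hypothetical.
ETHNICITY_OPTIONS = ("Hispanic or Latino", "Not Hispanic or Latino")

RACE_OPTIONS = (
    "American Indian or Alaska Native",
    "Asian",
    "Black or African American",
    "Native Hawaiian or Other Pacific Islander",
    "White",
)


def validate_response(ethnicity, races):
    """Return a list of problems found in one subject's self-reported answers."""
    problems = []

    # Ethnicity is "select one": a single value from the ethnicity list.
    if ethnicity not in ETHNICITY_OPTIONS:
        problems.append(f"unknown ethnicity: {ethnicity!r}")

    # Race is "select all that apply": any number of values from the race list.
    for race in races:
        if race not in RACE_OPTIONS:
            problems.append(f"unknown race: {race!r}")

    # The same category should not be recorded twice for one subject.
    if len(set(races)) != len(races):
        problems.append("duplicate race selection")

    return problems


if __name__ == "__main__":
    # A subject who selects two racial categories and one ethnic category.
    print(validate_response("Not Hispanic or Latino", ["Asian", "White"]))
```

Keeping race as a list rather than a single value preserves every category a subject selects, which matches the "select all that apply" instruction. If your project adds more granular options, you would extend the two tuples rather than change the checking logic.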

Of course, these race and ethnicity categories may be different if you’re collecting data internationally. While the US government and many researchers collect ethnicity data as Hispanic/Latinx or not, if you were collecting data in Senegal your ethnicity data might instead include options like Wolof, Fula, Serer, Jola, etc. If you were collecting race breakdowns according to the Brazilian government’s categories, you might collect Multiracial, Black, White, Asian, Indigenous, or Undeclared. International variations in race and ethnicity reporting aside, for research based in the US, it is best to use the NIH data standards whenever possible.
