Joy Payton


Feasibility Analysis Using Arcus Cohort Discovery

How can you accelerate the work of determining whether a study on CHOP patients is feasible? Are there enough patients who fulfill your inclusion criteria to consider embarking on a study or applying for a grant? Let’s consider two scenarios: first, the current best practice of working with the Clinical Reporting Unit (CRU), a highly specialized data discovery group within DBHi, and second, beginning a feasibility analysis with Arcus Cohort Discovery, the self-service offering that’s coming soon from Arcus.

Using the CRU

Currently, researchers who need to request aggregate statistics on clinical data (for example, the creation of a cohort of potential or actual research subjects) must make a request via the Data Request Portal. Because the request is related to a research project, the request is routed to the Clinical Reporting Unit (CRU), a group within DBHi (the Department of Biomedical and Health Informatics). The CRU is made up of clinical data experts who know how to navigate the thousands of tables that make up the Epic reporting database (Clarity) and CHOP’s clinical data warehouse (CDW), which spans not just Epic data, but data from other systems as well, such as staffing data and legacy clinical data that was collected in a non-Epic system.

A Typical Feasibility Request

Mary is an orthopedic researcher who wants to do a retrospective analysis to examine the effect of sex and race on weight-bearing long bone fracture healing in a pediatric setting. Using the data request portal, she might make a request like, “Can you give me the numbers of males and females, grouped by race, who have had femur or tibia fractures and are currently between the ages of 12 and 16?” Mary needs these numbers to understand if her potential retrospective study would have sufficient statistical power. Or maybe she’s writing a grant to get this study funded, and she needs to describe the data available to her. Certainly she’ll need more data to conduct her research, like whether the breaks were compound, patient puberty status, whether surgery was required, etc. But Mary’s just starting out, and wants to find out if it’s even feasible to do this research before she commits to combing through lots of data.

In a few days, Mary gets the numbers back from the CRU. They are disappointing and Mary realizes she might need to relax some of her inclusion criteria – either expand the age range, or the bones she’s considering, or both. She sends an email to that effect to the CRU, and Mary and the CRU team member assigned to her work together to get a cohort that seems to work. Mary can then design her statistical analysis and request the data needed to do her research from the CRU.

Arcus Cohort Discovery

Compare that workflow with the use of the Arcus Cohort Discovery Tool, a new offering from Arcus that will become available to the research community at large in the next few months.

Arcus Cohort Discovery has access to de-identified, simplified clinical data representing the most commonly used fields in data requests – things like age, sex, race, ICD codes, and recency of care. This tool, currently in “alpha” – being tested internally for quality – will be launched in the coming year for the CHOP research community at large, and will allow you to do basic searches to identify cohorts (groups of patients) that share common features. Keep in mind that the data is de-identified, so this tool won’t be helpful for recruitment, but it will help you in feasibility analysis and research planning. Let’s see how this tool helps our hypothetical researcher.

Mary, our hypothetical researcher, has already done CITI training (a prerequisite for using this tool). She logs in to the Arcus website and agrees to the Arcus terms of use, the work of a minute or two. At this point, Mary can begin using this Cohort Discovery service on her own. Using this new tool, Mary puts in the ICD codes for the kinds of fractures she’s including in her cohort and adds an age range. She instantly gets aggregate counts back, information that allows her to judge what her next steps might be – expand the inclusion criteria, leave her inclusion criteria alone and request additional data on that cohort, or take up a different scientific question altogether. She is presented with a table that breaks down her cohort into its sex, ethnicity, and race components for easy inclusion into a research proposal or grant.
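To make the idea of "aggregate counts" concrete, here is a minimal Python sketch of the kind of query Mary is running: filter by diagnosis code and age range, then count by sex and race. It is an illustration only, not how Arcus Cohort Discovery is implemented. The records, field names, and the `cohort_counts` function are all hypothetical, and the ICD-10-CM prefixes (S72 for femur fractures, S82 for lower leg fractures, which include the tibia) are a simplification of what a real inclusion list would look like.

```python
from collections import Counter

# Made-up, de-identified records for illustration; not real CHOP data.
PATIENTS = [
    {"age": 13, "sex": "F", "race": "Black", "icd10": ["S72.301A"]},
    {"age": 15, "sex": "M", "race": "White", "icd10": ["S82.201A"]},
    {"age": 9,  "sex": "M", "race": "Asian", "icd10": ["S72.301A"]},
    {"age": 14, "sex": "F", "race": "White", "icd10": ["S52.501A"]},
    {"age": 12, "sex": "M", "race": "White", "icd10": ["S82.201A", "S72.301A"]},
]


def cohort_counts(patients, icd_prefixes, min_age, max_age):
    """Count patients by (sex, race) who match an age range and any ICD prefix."""
    prefixes = tuple(icd_prefixes)  # str.startswith accepts a tuple of prefixes
    counts = Counter()
    for patient in patients:
        in_age_range = min_age <= patient["age"] <= max_age
        has_diagnosis = any(code.startswith(prefixes) for code in patient["icd10"])
        if in_age_range and has_diagnosis:
            counts[(patient["sex"], patient["race"])] += 1
    return counts


# Femur or lower leg fractures in patients currently aged 12 to 16.
counts = cohort_counts(PATIENTS, ["S72", "S82"], min_age=12, max_age=16)
for (sex, race), n in sorted(counts.items()):
    print(sex, race, n)
```

If the counts come back too small, relaxing the criteria is a matter of changing the arguments (a wider age range, more code prefixes) and running the query again, which is the kind of quick iteration the self-service tool is meant to support.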

Another benefit of the Cohort Discovery Tool is that it lets Mary see whether any patients in her potential cohort already have research data collected by one of the labs that have agreed to collaborate with Arcus by contributing data. Mary notices that 20 of the patients in her long bone fracture cohort also participated in Jorge’s anorexia nervosa research study. That prompts her to reach out to Jorge to talk about how she might include BMI or eating disorder diagnoses in her research. Because there’s an interesting overlap in the populations they study, the two agree to consider collaborating in the future if Mary’s current research gets published.
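Conceptually, spotting this kind of overlap is a set intersection between the cohort and each contributing study's participants. The short Python sketch below shows the idea under stated assumptions: the de-identified tokens, the study names, and the `overlap_by_study` function are hypothetical, and nothing here reflects how Arcus actually links clinical and research data.

```python
def overlap_by_study(cohort_ids, study_enrollment):
    """Return how many cohort members appear in each contributing study.

    cohort_ids: set of de-identified patient tokens in the cohort.
    study_enrollment: dict mapping a study name to its set of tokens.
    Studies with no overlap are left out of the result.
    """
    overlaps = {}
    for study, participant_ids in study_enrollment.items():
        shared = cohort_ids & participant_ids  # set intersection
        if shared:
            overlaps[study] = len(shared)
    return overlaps


# Made-up tokens standing in for de-identified patients.
fracture_cohort = {"p01", "p02", "p03", "p04"}
studies = {
    "anorexia_nervosa_study": {"p02", "p04", "p09"},
    "asthma_study": {"p10", "p11"},
}

print(overlap_by_study(fracture_cohort, studies))
```

Because only counts are returned, a result like this points Mary toward a possible collaborator without revealing which individual patients are shared.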

Mary can also create a query based on the availability of biosamples. If Mary is interested, say, only in fracture patients for whom there are full exomes, she’s able to do that within Arcus Cohort Discovery. As a busy physician researcher with limited funding, Mary doesn’t have the resources to do sequencing herself, but she can use this tool to discover what’s already been done, so she can take advantage of previous research involving her cohort. In short, Mary gains autonomy over her own research planning, saves time, and is tipped off to potential collaborators, all through the use of a straightforward, self-service tool.

Arcus: More Data Within Your Reach

The goal of Arcus is to make more of CHOP’s data discoverable for our researchers by creating a computational research lab that puts more tools – and more data – within your reach. We aim to transform research at CHOP and create an environment where we’ve removed the barriers preventing reproducible, computational research on rich, cross-linked patient and research participant data.

While the Arcus Cohort Discovery Tool is no replacement for the skills and years of expertise of the CRU, it can give you instant answers to simpler questions that used to take days. By using Arcus Cohort Discovery, you can establish base inclusion criteria as the starting point for a more detailed, more specific data request with the CRU. Using the strengths of both the self-service Arcus Cohort Discovery tool and the expertise of the CRU, you can hone your cohort more quickly, which means you have more time to devote to scientific inquiry. We look forward to adding Arcus Cohort Discovery to your research toolset!