Collaborative Research and Library Science
CHOP researchers are generating huge amounts of valuable research data, and doing so at increasing rates. Currently, that data is largely siloed and difficult to discover and access. One of the primary goals of the Arcus program is to provide coordinated discovery and access to this data, increasing the opportunities for new research by spurring new ideas, efforts, and collaboration. With this new initiative, CHOP will provide these types of services at an institutional level for the first time. It will allow the CHOP community to leverage the benefits of reproducible, repurposable research in a way that has not so far been possible.
A key part of addressing this challenge is a mechanism that helps researchers learn what data is available and how to access it. To that end, Arcus is building a data discovery catalog informed by library science principles and practices. The catalog achieves many significant goals, all of which are guided by considerations for our end users. For example, the catalog indexes available datasets, software, and tools so that they can be searched. It also maps relationships, both explicit and implicit, within and across research projects and makes them actionable, allowing researchers to identify new opportunities.
To meet more of CHOP’s research needs, we are also developing a formal archives program. The Arcus Archives will be able to manage datasets and provide access to important contextual documentation, such as the code used to create them. Most importantly, an archives program helps us provide a framework to ensure that archived data is reliable and trustworthy, a major requirement for scientific rigor and the production of new knowledge. So, in addition to a catalog, the Arcus Archives has been established to ensure the long-term availability, integrity, and authenticity of CHOP’s valuable research data.
The archives and catalog jointly address research data management and responsible exposure and retrieval. This avenue was chosen over other options because of the vast wealth of knowledge, research, and experience the archival field brings to these particular issues.
Libraries and archives are two significant subfields of a larger field referred to as “information science,” which concerns itself primarily with the organization, preservation, and presentation of information. The main goal of libraries and archives is to provide as much access to as much material as possible for a given community of users while still responsibly maintaining the materials themselves. This applies to both physical and digital objects and can mean anything from allowing full public access to allowing access to no one at all. Because determinations around appropriate levels of access and maintenance of materials will always be subjective, librarians and archivists are extensively trained in how to make these judgments ethically and responsibly in collaboration with partner communities. Library and archival principles provide strong guidance for how items should be preserved while leaving the professionals working with the objects the flexibility to make the best decisions for the care of those objects.
A major challenge to sharing data in the biomedical and clinical research realms is navigating the complex privacy landscape. Patient privacy is paramount and must be at the forefront of biomedical research, but researchers often have additional privacy concerns, such as confidential sponsors or competitive developments. Librarians and archivists are well trained to account for the specific privacy requirements that may be relevant to the data in their care, and the Library Science team brings to Arcus years of experience working with data while accounting for a varied array of privacy and security needs. The Library Science team works closely with the Arcus Data Privacy Analyst to build solutions for addressing privacy concerns into Arcus systems and processes.
Within the Arcus project, librarians and archivists also play a significant role by providing a meta perspective on the research being conducted. They serve as expert external observers whose primary focus is to care for research products, without the pressure and proximity of being directly involved in the conduct of the research effort. Where researchers serve as the subject matter experts for their scientific work, librarians and archivists serve as subject matter experts in organization, information-seeking behavior, system design, data preservation, and more.
Archival practice in particular is concerned with preservation, a major facet of which is data integrity. Archivists ensure that the data in their archives is ingested, stored, and delivered unchanged. This is important for many reasons, chief among them the reproducibility of research, whether by individual researchers revisiting their own data or by other researchers using the data that was originally produced.
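One common way to confirm that data has been stored and delivered unchanged is a checksum comparison, often called a fixity check. The example below is a minimal sketch of that general technique, not a description of how the Arcus Archives implements it. The function names and the manifest structure are hypothetical, and SHA-256 is an assumed choice of hash.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large datasets do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_fixity(paths):
    """Build a manifest (hypothetical structure) mapping each path to its checksum at ingest."""
    return {str(p): sha256_of(Path(p)) for p in paths}


def verify_fixity(manifest):
    """Return the paths that are missing or whose current checksum differs from the manifest."""
    changed = []
    for name, expected in manifest.items():
        path = Path(name)
        if not path.is_file() or sha256_of(path) != expected:
            changed.append(name)
    return changed
```

In this sketch, the manifest would be recorded when a dataset is ingested and checked again before the dataset is delivered. An empty list from `verify_fixity` indicates that every file still matches its recorded checksum.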
In the context of Arcus, archivists and librarians bring their extensive knowledge of information theory, classification systems, preservation methodology, and user experience and design to bear on the question: “How best should CHOP research data be selected, stored, and described so the work it represents can be fully and accurately understood and utilized by those who access it days, months, or even years later?”