DART Modules for the Puddle of Platypodes Community of Practice

Welcome, Platypodes!

Forgive our change of platform while we work out some contract issues between CHOP and our vendor of choice. We hope this page is helpful in guiding you, even though it's not as fully featured as Thinkific. For now, we're hosting links to your modules here, on our CHOP website. Note that the DART (Data and Analytics for Research Training) modules we've created for you have a different (and better, we think!) format than other materials you might find on this site. Feel free to look around our other resources, but please make use of the DART modules below. We first remind you of the on-ramp materials we offered in the first phase of the DART pilot, and then provide a new list of modules we're calling your phase two modules.

On-ramp Materials

For the DART pilot project, all participating CHOP postdocs were invited to complete the following “on-ramp” of short modules to accelerate their ability to apply data science techniques to their research.

Name | Module Link | Current Version | Short Description
Learning to Learn | Training Course | 1.0.1 | Discover how learning data science is different from learning other subjects.
Reproducibility | Training Course | 1.2.0 | This module provides learners with an approachable introduction to the concepts and impact of research reproducibility, generalizability, and data reuse, and how technical approaches can help make these goals more attainable.
Directories and File Paths | Training Course | 1.1.0 | In this module, learners will explore what a directory is and how to describe the location of a file using its file path.
Tidy Data | Training Course | 1.0.1 | Tidy is a technical term in data analysis and describes an optimal way for organizing data that will be analyzed computationally.
Data Visualization in Open Source Software | Training Course | 1.0.0 | Introduction to principles of data visualization and typical data visualization workflows using two common open source libraries: ggplot2 and seaborn.
Intro to Version Control | Training Course | 1.0.1 | An introduction to what version control systems do and why you might want to use one.
R Basics: Introduction | Training Course | 1.0.0 | Introduction to R and hands-on first steps for brand new beginners.
R Basics: Transform Data | Training Course | 1.0.0 | Learn how to transform (or wrangle) data using R’s dplyr package.
R Basics: Visualize Data | Training Course | 1.0.0 | Learn how to visualize data using R’s ggplot2 package.
Demystifying SQL | Training Course | 1.0.0 | SQL is a relational database solution that has been around for decades. Learn more about this technology at a high level, without having to write code.
Data Visualization in ggplot2 | Training Course | 1.2.0 | This module includes code and explanations for several popular data visualizations, using R’s ggplot2 package. It also includes examples of how to modify ggplot2 plots to customize them for different uses (e.g., adhering to journal requirements for visualizations).
Statistical Tests | Training Course | 1.0.1 | This module provides an overview of the most commonly used kinds of statistical tests and links to code for running many of them in both R and Python.
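
As a small taste of what the Directories and File Paths module covers, here is a minimal sketch in Python of how a file path describes where a file lives. The module may use different tools and examples; the directory and file names below are hypothetical and invented purely for illustration.

```python
from pathlib import PurePosixPath

# Hypothetical absolute path, invented for illustration only.
data_file = PurePosixPath("/home/platypus/project/data/measurements.csv")

# The directory that contains the file, and the file's own name.
parent_dir = data_file.parent
file_name = data_file.name

# The same file described relative to a (hypothetical) project directory.
project_dir = PurePosixPath("/home/platypus/project")
relative = data_file.relative_to(project_dir)

print(parent_dir)  # /home/platypus/project/data
print(file_name)   # measurements.csv
print(relative)    # data/measurements.csv
```

An absolute path starts at the root of the file system, while a relative path only makes sense once you know which directory it starts from.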

Phase Two

In the second phase of the DART pilot, the Puddle of Platypodes Community of Practice is invited to complete the following series of short modules to continue their data science journey.

Name | LiaScript Link | Current Version | Short Description
Reshaping Data in R: Long and Wide Data | Training Course | 1.0.0 | A module that teaches how to reshape tabular data in R, concentrating on some typical shapes known as “long” and “wide” data.
Data Storage Models | Training Course | 1.0.0 | This course will focus on different data storage solutions available to an end user and the unique characteristics of each type. This course will also cover how each storage type impacts one’s access to data and computing capabilities.
Missing Values in R | Training Course | 1.0.0 | A practical demonstration of how missing values show up in R and how to deal with them. Note that this module does not cover statistical approaches for handling missing data, but instead focuses on the code you need to find, work with, and assign missing values in R.
How to Troubleshoot | Training Course | 1.0.0 | Learning to use technical methods like coding and version control in your research inevitably means running into problems. Learn practical methods for troubleshooting and moving past error codes and other difficulties.
Bash / Command Line 101 | Training Course | 1.1.0 | This course will focus on accessing a command line interface (CLI), navigating your file system, editing and searching files, and running shell scripts on your home computer.
Bash / Command Line 102 | Training Course | 1.0.0 | This module will teach you how to use the bash shell to search and organize your files.
Setting Up Git in Mac and Linux | Training Course | 1.0.1 | This module provides recommendations and examples to help new users configure Git for the first time on a Mac or Linux computer.
Setting Up Git in Windows | Training Course | 1.0.1 | This module provides recommendations and examples to help new users configure Git on their Windows computer for the first time.
Creating a Git Repository | Training Course | 1.0.0 | Create a new Git repository and get started with version control.
Exploring the History of a Git Repository | Training Course | 1.0.0 | This module will teach you how to look at past versions of your work on Git and compare your project with previous versions.
Omics Orientation | Training Course | 1.0.0 | This module provides a brief introduction to omics and its associated fields.
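
If “long” and “wide” data are new to you, the following minimal sketch may help you picture the two shapes before you start the reshaping module. The module itself teaches reshaping in R; this sketch uses plain Python only to show the idea. The table contents, the helper `wide_to_long`, and its parameter names are all hypothetical.

```python
# Hypothetical "wide" table: one row per participant, one column per visit.
wide_rows = [
    {"id": "p01", "visit_1": 5.1, "visit_2": 5.4},
    {"id": "p02", "visit_1": 4.8, "visit_2": 4.9},
]


def wide_to_long(rows, id_key, names_to, values_to):
    """Return one row per (identifier, measured column) pair."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_key:
                continue  # the identifier is repeated on each long row instead
            long_rows.append(
                {id_key: row[id_key], names_to: column, values_to: value}
            )
    return long_rows


# "Long" table: one row per participant per visit.
long_rows = wide_to_long(wide_rows, "id", "visit", "score")
for row in long_rows:
    print(row)
```

The two wide rows become four long rows, each holding a single measurement along with the participant and visit it belongs to. Both shapes contain the same information; which one is more convenient depends on the analysis or plot you want to produce.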

Citing DART on your CV

The ATOP team was generous enough to put together some suggested language for you to use on your CV, or adapt as appropriate:

Participated in the inaugural NIH R25-funded CHOP Data and Analytics for Research Training (DART) pilot program

  • Engaged in self-directed and cohort-based learning in data science techniques, including instruction in R and SQL [add Bash, Git, Python, etc. as appropriate]
  • Learned fundamentals of promoting rigor and reproducibility in research, including proper data management and version control
  • Actively provided feedback on pilot program rollout to support future module users

Depending on your pathway and career goals, you may also want to emphasize a particular skill and how many modules you completed:

  • Completed [XX] DART modules in [Data Visualization/Preferred Skill]

You should also be sure to include the topics you learned in the DART program under your Skills or Proficiencies section. Depending on the modules you completed, these may include Bash, Git, R, Python, SQL, etc. It is also appropriate to list “data science” in that section in addition to the more specific skills.