Welcome, Toads!

On-ramp Materials
For the DART pilot project, all participating CHOP postdocs were invited to complete the following “on-ramp” of short modules to accelerate their ability to apply data science techniques to their research.
Name | Module Link | Current Version | Short Description |
---|---|---|---|
Learning to Learn | Training Course | 1.0.1 | Discover how learning data science is different from learning other subjects. |
Reproducibility | Training Course | 1.2.0 | This module provides learners with an approachable introduction to the concepts and impact of research reproducibility, generalizability, and data reuse, and how technical approaches can help make these goals more attainable. |
Directories and File Paths | Training Course | 1.1.0 | In this module, learners will explore what a directory is and how to describe the location of a file using its file path. |
Tidy Data | Training Course | 1.0.1 | Tidy is a technical term in data analysis and describes an optimal way of organizing data that will be analyzed computationally. |
Data Visualization in Open Source Software | Training Course | 1.0.0 | Introduction to principles of data visualization and typical data visualization workflows using two common open source libraries: ggplot2 and seaborn. |
Intro to Version Control | Training Course | 1.0.1 | An introduction to what version control systems do and why you might want to use one. |
R Basics: Introduction | Training Course | 1.0.0 | Introduction to R and hands-on first steps for brand new beginners. |
R Basics: Transform Data | Training Course | 1.0.0 | Learn how to transform (or wrangle) data using R’s dplyr package. |
R Basics: Visualize Data | Training Course | 1.0.0 | Learn how to visualize data using R’s ggplot2 package. |
Demystifying SQL | Training Course | 1.0.0 | SQL is a relational database solution that has been around for decades. Learn more about this technology at a high level, without having to write code. |
Data Visualization in ggplot2 | Training Course | 1.2.0 | This module includes code and explanations for several popular data visualizations, using R’s ggplot2 package. It also includes examples of how to modify ggplot2 plots to customize them for different uses (e.g. adhering to journal requirements for visualizations). |
Statistical Tests | Training Course | 1.0.1 | This module provides an overview of the most commonly used kinds of statistical tests and links to code for running many of them in both R and Python. |
Phase Two
In the second phase of the DART pilot, the Knot of Toads Community of Practice is invited to complete the following series of short modules to continue their data science journey.
Name | Module Link | Current Version | Short Description |
---|---|---|---|
Bash / Command Line 101 | Training Course | 1.1.0 | This course focuses on accessing a command line program (or CLI, “command line interface”), running shell scripts on your home computer, navigating your file system, and editing and searching files. |
Bash / Command Line 102 | Training Course | 1.0.0 | This module will teach you how to use the bash shell to search and organize your files. |
Demystifying Python | Training Course | 1.0.0 | This module introduces the Python programming language, explores why Python is useful in research, and describes how to download Python and Jupyter. |
Python Basics: Writing Python Code | Training Course | 1.0.0 | Learn the foundations of writing Python code. |
Transform Data with pandas | Training Course | 1.0.0 | This is an introduction to transforming data using a Python library named pandas. |
Data Visualization in seaborn | Training Course | 1.0.0 | This module includes code and explanations for several popular data visualizations using Python’s seaborn library. It also includes examples of how to modify seaborn plots to customize them for different uses. |
Data Storage Models | Training Course | 1.0.0 | This course will focus on different data storage solutions available to an end user and the unique characteristics of each type. This course will also cover how each storage type impacts one’s access to data and computing capabilities. |
Setting Up Git in Mac and Linux | Training Course | 1.0.1 | This module provides recommendations and examples to help new users configure Git on their Mac or Linux computer for the first time. |
Setting Up Git in Windows | Training Course | 1.0.1 | This module provides recommendations and examples to help new users configure Git on their Windows computer for the first time. |
Creating a Git Repository | Training Course | 1.0.0 | Create a new Git repository and get started with version control. |
Exploring the History of a Git Repository | Training Course | 1.0.0 | This module will teach you how to look at past versions of your work on Git and compare your project with previous versions. |
Omics Orientation | Training Course | 1.0.0 | This module provides a brief introduction to omics and its associated fields. |
Citing DART on your CV
The ATOP team was generous enough to put together some suggested language for you to use on your CV, or adapt as appropriate:
Participated in the inaugural NIH R25-funded CHOP Data and Analytics for Research Training (DART) pilot program
- Engaged in self-directed and cohort-based learning in data science techniques, including instruction in R and SQL [add Bash, Git, Python, etc. as appropriate]
- Learned fundamentals of promoting rigor and reproducibility in research, including proper data management and version control
- Actively provided feedback on pilot program rollout to support future module users
Depending on your pathway and career goals, you may also want to emphasize a particular skill and how many modules you completed:
- Completed [XX] DART modules in [Data Visualization/Preferred Skill]
You should also be sure to include the topics you learned in the DART program under your Skills or Proficiencies section. Depending on the modules you completed, these may include Bash, Git, R, Python, SQL, etc. It is also appropriate to list “data science” in that section in addition to the more specific skills.