Welcome, Bats!

On-ramp Materials
For the DART pilot project, all participating CHOP postdocs were invited to complete the following “on-ramp” of short modules to accelerate their ability to apply data science techniques to their research.
Name | Module Link | Current Version | Short Description |
---|---|---|---|
Learning to Learn | Training Course | 1.0.1 | Discover how learning data science is different from learning other subjects. |
Reproducibility | Training Course | 1.2.0 | This module provides learners with an approachable introduction to the concepts and impact of research reproducibility, generalizability, and data reuse, and how technical approaches can help make these goals more attainable. |
Directories and File Paths | Training Course | 1.1.0 | In this module, learners will explore what a directory is and how to describe the location of a file using its file path. |
Tidy Data | Training Course | 1.0.1 | Tidy is a technical term in data analysis and describes an optimal way for organizing data that will be analyzed computationally. |
Data Visualization in Open Source Software | Training Course | 1.0.0 | Introduction to principles of data visualization and typical data visualization workflows using two common open source libraries: ggplot2 and seaborn. |
Intro to Version Control | Training Course | 1.0.1 | An introduction to what version control systems do and why you might want to use one. |
R Basics: Introduction | Training Course | 1.0.0 | Introduction to R and hands-on first steps for brand new beginners. |
R Basics: Transform Data | Training Course | 1.0.0 | Learn how to transform (or wrangle) data using R’s dplyr package. |
R Basics: Visualize Data | Training Course | 1.0.0 | Learn how to visualize data using R’s ggplot2 package. |
Demystifying SQL | Training Course | 1.0.0 | SQL is a language for working with relational databases that has been around for decades. Learn more about this technology at a high level, without having to write code. |
Data Visualization in ggplot2 | Training Course | 1.2.0 | This module includes code and explanations for several popular data visualizations, using R’s ggplot2 package. It also includes examples of how to modify ggplot2 plots to customize them for different uses (e.g. adhering to journal requirements for visualizations). |
Statistical Tests | Training Course | 1.0.1 | This module provides an overview of the most commonly used kinds of statistical tests and links to code for running many of them in both R and Python. |
Phase Two
In the second phase of the DART pilot, the Cauldron of Bats Community of Practice is invited to complete the following series of short modules to continue their data science journey.
Name | Module Link | Current Version | Short Description |
---|---|---|---|
Bash / Command Line 101 | Training Course | 1.1.0 | This course covers how to access a command line interface (CLI) and run shell scripts on your home computer, navigate your file system, and edit and search files. |
Bash / Command Line 102 | Training Course | 1.0.0 | This module will teach you how to use the bash shell to search and organize your files. |
Data Storage Models | Training Course | 1.0.0 | This course will focus on different data storage solutions available to an end user and the unique characteristics of each type. This course will also cover how each storage type impacts one’s access to data and computing capabilities. |
Using the REDCap API | Training Course | 1.0.0 | REDCap is a research data capture tool used by many researchers in basic, translational, and clinical research efforts. Learn how to use the REDCap API in this module. |
SQL Basics | Training Course | 1.0.0 | Structured Query Language, or SQL, is a language for working with relational databases that has been around for decades. Learn how to write basic SQL queries on single tables through hands-on coding. |
SQL Intermediate | Training Course | 1.0.0 | Learn how to do intermediate SQL queries on single tables, by using code, hands-on. |
How to Troubleshoot | Training Course | 1.0.0 | Learning to use technical methods like coding and version control in your research inevitably means running into problems. Learn practical methods for troubleshooting and moving past error codes and other difficulties. |
Setting Up Git in Mac and Linux | Training Course | 1.0.1 | This module provides recommendations and examples to help new users configure Git on their Mac or Linux computer for the first time. |
Setting Up Git in Windows | Training Course | 1.0.1 | This module provides recommendations and examples to help new users configure Git on their Windows computer for the first time. |
Creating a Git Repository | Training Course | 1.0.0 | Create a new Git repository and get started with version control. |
Exploring the History of a Git Repository | Training Course | 1.0.0 | This module will teach you how to look at past versions of your work on Git and compare your project with previous versions. |
Omics Orientation | Training Course | 1.0.0 | This module provides a brief introduction to omics and its associated fields. |
Citing DART on your CV
The ATOP team was generous enough to put together some suggested language for you to use on your CV, or adapt as appropriate:
Participated in the inaugural NIH R25-funded CHOP Data and Analytics for Research Training (DART) pilot program
- Engaged in self-directed and cohort-based learning in data science techniques, including instruction in R and SQL [add Bash, Git, Python, etc. as appropriate]
- Learned fundamentals of promoting rigor and reproducibility in research, including proper data management and version control
- Actively provided feedback on pilot program rollout to support future module users
Depending on your pathway and career goals, you may also want to emphasize a particular skill and how many modules you completed:
- Completed [XX] DART modules in [Data Visualization/Preferred Skill]
You should also be sure to include the topics you learned in the DART program under your Skills or Proficiencies section. Depending on the modules you completed, these may include Bash, Git, R, Python, SQL, etc. It is also appropriate to list “data science” in that section in addition to the more specific skills.