CDSB Workshop 2020: Building workflows with RStudio and Bioconductor for single cell RNA-seq analysis
Community of Bioinformatics Software Developers
Level: intermediate - advanced
When: August 3 - August 7, 2020
Registration (Will open on April XX, 2020)
Join us for our 2020 workshop! This year we’ll teach you how to improve your skills for working with the R programming language, using diverse strategies for organizing your code and projects. This will help you document your analyses so that they are easy to reproduce and to share with your collaborators (from academia to industry). As a use case, we will learn the statistical tools needed for analyzing single cell transcriptomics (scRNA-seq) data using Bioconductor. Completing this workshop will help you in all your R projects and your analyses of biological data: every analysis benefits from the organization skills, and the ideas behind scRNA-seq are used in many bioinformatics projects. Besides the CDSB instructors, thanks to Bioconductor we will be joined this year by Robert Amezquita, co-author of the book Orchestrating Single Cell Analysis with Bioconductor, which was published in Nature Methods (DOI) and is among the most publicized papers of 2020.
Requirements and prior knowledge
- Participants should have basic to intermediate knowledge of the R programming language: variable assignment; reading files (read.csv); data structures (list); data types (logical, etc.); installation and use of packages.
- Know how to use RStudio.
- Be interested in learning good practices for organizing your work and sharing your work with others.
- Be interested in learning how to analyze biological data using R/Bioconductor packages.
- Personal computer. Minimum 8GB RAM, a mouse and sufficient disk space for text files and image files. Administrator privileges to install and run utilities such as RStudio.
In recent years, R has become one of the most used programming languages for data science. The explosion of data available in many fields has increased the demand for data analysts, and bioinformatics is no exception. R users start by learning how to use the tools others have openly shared with the international community. These R users acquire skills as they continue to analyze data and might even start to interact with R software developers through community websites such as Bioconductor Support or via Twitter using the #rstats hashtag. Eventually, some R users will want to write their own functions and organize their code across several projects. At this point it’s useful to learn how to organize your code to make your life as an R programmer easier, so that you spend more time on your projects instead of trying to remember where your code is or what you did a few weeks ago. To practice these concepts, we will review the most recent methods for analyzing single cell RNA sequencing (scRNA-seq) data using R packages specialized for this goal that are freely available through Bioconductor.
The instructors of this workshop have participated in CDSB since its founding and have attended conferences such as BioC2019 and rstudio::conf(2020), among others. In recent years we taught how to make R and Bioconductor packages, which are of great use for sharing code with others. Recently, CDSB alumni submitted their first R package to Bioconductor, which represented a large percentage increase in Latin American representation in the Bioconductor developer community, demonstrating that participating in a CDSB workshop has an impact beyond the one-week workshop. For 2020 we will have an applied focus while maintaining our goals at CDSB, which are:
- Turn (bioinformatics) software users into (bioinformatics) software developers.
- Foster the exchange of expertise and establish multidisciplinary collaborations.
- Create a community of Latin American scientists committed to the development of software and computational pipelines for (biological) data analysis.
- Help train users that can become local instructors and continue to grow their local communities.
Besides the CDSB instructors, thanks to Bioconductor we will be joined this year by Robert Amezquita, co-author of the book Orchestrating Single Cell Analysis with Bioconductor, which was published in Nature Methods (DOI) and is among the most publicized papers of 2020. That is why the CDSB 2020 workshop will be taught in English.
This workshop is part of a long-term project to create a community of developers from Latin America. We hope to hold regular meetings in the future (similar to BioC, EuroBioc and BioCAsia) where attendees present their own software contributions. To provide a welcoming environment, please follow our code of conduct.
Five days with 8 hours of workshop each, plus breaks and meals. For the detailed schedule, check the workshop repository cdsb2020.
- Workflow around RStudio projects:
  - Working with projects versus scripts.
  - Creating a project.
  - Using safe paths.
  - How should I name my file?
- Using Git and GitHub.
- CDSB community building activity.
- Writing and documenting functions.
- Debugging R code.
- Good practices for configuring and maintaining workspaces.
- Installing R packages from source.
- General overview of single cell RNA-seq (scRNA-seq) data processing
- RNA-seq versus scRNA-seq: how different is the data?
- SingleCellExperiment R objects.
- Exploratory data analysis of scRNA-seq data.
- Dimension reduction methods.
- Identifying cell marker genes.
- Classifying into cell types.
- Batch effects in scRNA-seq experiments.
- Differential analyses with scRNA-seq data (cell type proportions, differential expression, differential biological variation).
Joselyn Chávez (IBT-UNAM, Cuernavaca, Morelos, Mexico). Joselyn presented her work at BioC2019 thanks to a travel award, attended rstudio::conf(2020) thanks to a diversity scholarship, where she took the What They Forgot to Teach You about R workshop, and recently was part of the first R package submission to Bioconductor by CDSB alumni. She was initially a CDSB2018 student.
Leonardo Collado-Torres (Lieber Institute for Brain Development, Baltimore, MD, USA). Leonardo recently published a pre-print on spatial transcriptomics using 10x Genomics Visium data.
Robert Amezquita (Fred Hutchinson Cancer Research Center, Seattle, WA, USA). Lead author of the book Orchestrating Single Cell Analysis with Bioconductor, which was published in Nature Methods (DOI) and is among the most publicized papers of 2020. Robert taught a workshop on this topic at BioC2019.
Alejandra Medina-Rivera (International Laboratory for Human Genome Research, Juriquilla, Querétaro, Mexico)
Alejandro Reyes (Dana-Farber Cancer Institute, Boston, MA, USA)
Joselyn Chávez (IBT-UNAM, Cuernavaca, Morelos, Mexico)
Leonardo Collado-Torres (Lieber Institute for Brain Development, Baltimore, MD, USA)