CDSB Workshop 2020: Building workflows with RStudio and Bioconductor for single cell RNA-seq analysis

Community of Bioinformatics Software Developers

Summary

Join us for our 2020 workshop! This year we'll teach you how to work more effectively with the R programming language, using diverse strategies for organizing your code and projects. This will help you document your analyses so that they are easy to reproduce and to share with your collaborators (from academia to industry). As a use case, we will learn the statistical tools needed for analyzing single cell transcriptomics (scRNA-seq) data using Bioconductor. Completing this workshop will help you in all your R projects and your analyses of biological data: every analysis benefits from the organization skills, and the ideas behind scRNA-seq are used in many bioinformatics projects. Besides the CDSB instructors, thanks to Bioconductor we will be joined this year by Robert Amezquita, co-author of the book Orchestrating Single Cell Analysis with Bioconductor that was published by Nature Methods (DOI), which is among the most publicized papers of 2020.

Requirements

Prior knowledge requirements

  • Participants should have basic to intermediate knowledge of the R programming language: variable assignment; reading files (read.csv); data structures (matrix, data.frame, list); data types (character, numeric, factor, logical, etc.); installation and use of packages.
  • Know how to use RStudio.
  • Be interested in learning good practices for organizing your work and sharing your work with others.
  • Be interested in learning how to analyze biological data using R/Bioconductor packages.
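
As a rough self-check for the R knowledge listed above, the following base R sketch touches each item once. The file contents and the object names (`genes`, `count_matrix`, `info`, `status`) are made up for illustration and are not workshop materials.

```r
# Hypothetical example data written to a temporary file (not workshop material)
csv_file <- tempfile(fileext = ".csv")
writeLines(c("gene,count,expressed", "geneA,10,TRUE", "geneB,0,FALSE"), csv_file)

# Variable assignment and reading files
genes <- read.csv(csv_file, stringsAsFactors = FALSE)

# Data structures: data.frame, matrix, list
count_matrix <- matrix(
    genes$count,
    nrow = 1,
    dimnames = list("sample1", genes$gene)
)
info <- list(n_genes = nrow(genes), source = basename(csv_file))

# Data types: character, numeric, logical, factor
is.character(genes$gene)
is.numeric(genes$count)
is.logical(genes$expressed)
status <- factor(ifelse(genes$expressed, "on", "off"))
levels(status)

# Installing and loading packages (commented out so the sketch runs offline)
# install.packages("BiocManager")
# BiocManager::install("SingleCellExperiment")
# library("SingleCellExperiment")
```

If every line here reads as familiar, you most likely meet the R prerequisites.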

Technical requirements

  • Personal computer with a minimum of 8 GB of RAM, a mouse, and sufficient disk space for text and image files. Administrator privileges are needed to install and run utilities such as RStudio.

Overview

In recent years, R has become one of the most used programming languages for data science. The explosion of data available in many fields, bioinformatics among them, has increased the demand for data analysts. R users start by learning how to use the tools others have openly shared with the international community. These R users acquire skills as they continue to analyze data and might even start to interact with R software developers through community websites such as RStudio Community and Bioconductor Support, or via Twitter using the #rstats hashtag. Eventually some R users will want to write their own functions and organize their code across several projects. At this point it becomes useful to learn how to organize your code in order to make your life as an R programmer easier, so that you spend more time on your projects instead of trying to remember where your code is or what you did a few weeks ago. To practice these concepts, we will review the most recent methods for analyzing single cell RNA sequencing (scRNA-seq) data using specialized R packages that are freely available through Bioconductor.

The instructors of this workshop have participated in CDSB since its founding and have attended conferences such as BioC2019 and rstudio::conf(2020), among others. In recent years we taught how to make R and Bioconductor packages, which are of great use for sharing code with others. Recently, CDSB alumni submitted their first R package to Bioconductor, which represented a large percentage increase in Latin American representation in the Bioconductor developer community, demonstrating that participating in a CDSB workshop has an impact beyond the one-week workshop. For 2020 we will have an applied focus while maintaining our goals at CDSB, which are:

  1. Turn (bioinformatics) software users into (bioinformatics) software developers.

  2. Foster the exchange of expertise and establish multidisciplinary collaborations.

  3. Create a community of Latin American scientists committed to the development of software and computational pipelines for (biological) data analysis.

  4. Help train users that can become local instructors and continue to grow their local communities.

Besides the CDSB instructors, thanks to Bioconductor we will be joined this year by Robert Amezquita, co-author of the book Orchestrating Single Cell Analysis with Bioconductor that was published by Nature Methods (DOI), which is among the most publicized papers of 2020. That is why the CDSB 2020 workshop will be taught in English.

This workshop is part of a long-term project to create a community of developers from Latin America. We hope to hold regular meetings in the future (similar to BioC, EuroBioc and BioCAsia) where attendees present their own software contributions. To provide a welcoming environment, please follow our code of conduct.

Program

The workshop runs for 5 days, with 8 hours of instruction each day plus breaks and meals. For the detailed schedule, check the workshop repository cdsb2020.

Day 1

  • Workflow around RStudio projects:
    • Working with projects versus scripts.
    • Creating a project.
    • Using safe paths.
    • How should I name my file?
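
To give a flavor of what "safe paths" can mean, here is a minimal base R sketch, assuming a hypothetical project folder called `my_project` with a `data` subfolder. The workshop may well teach a different tool for this (the `here` package is a common choice), so treat the approach below as one illustration rather than the workshop's method.

```r
# Hypothetical project layout, created under a temporary directory
project_dir <- file.path(tempdir(), "my_project")
dir.create(file.path(project_dir, "data"), recursive = TRUE, showWarnings = FALSE)

# Build paths from their components instead of pasting strings with "/" or "\\",
# so the same code works on Windows, macOS and Linux
counts_file <- file.path(project_dir, "data", "counts.csv")

# Write a small made-up table to that location
write.csv(
    data.frame(gene = c("geneA", "geneB"), count = c(10, 0)),
    counts_file,
    row.names = FALSE
)

# Read it back using the path built from the project root, rather than
# calling setwd() with an absolute path that only exists on one computer
counts <- read.csv(counts_file)
file.exists(counts_file)
```

Because every path is derived from `project_dir`, moving the project or sharing it with a collaborator only requires changing that one starting point.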

Day 2

  • Using Git and GitHub.
  • CDSB community building activity.
  • Writing and documenting functions.
  • Debugging R code.

Day 3

  • Good practices for configuring and maintaining workspaces.
  • Installing R packages from source.
  • General overview of single cell RNA-seq (scRNA-seq) data processing.
  • RNA-seq versus scRNA-seq: how different is the data?

Day 4

  • SingleCellExperiment R objects.
  • Exploratory data analysis of scRNA-seq data.
  • Dimension reduction methods.
  • Identifying cell marker genes.

Day 5

  • Classifying cells into cell types.
  • Batch effects in scRNA-seq experiments.
  • Differential analyses with scRNA-seq data (cell type proportions, differential expression, differential biological variation).

Instructors

Organizing committee

Code of Conduct

Sponsors

Become our sponsor

Platinum level

Gold level

Silver level

Organizers

CDSB is a node of the Mexican Bioinformatics Network (RMB in Spanish) and jointly organizes the yearly workshop with the National Node of Bioinformatics (NNB).
