CDSB Workshop 2020: Building workflows with RStudio and Bioconductor for single cell RNA-seq analysis

This workshop is part of the Mexican Bioinformatics Encounter (EBM in Spanish) 2020 organized by CDSB with:

TIB2020 RMB NNB CCG-UNAM

Community of Bioinformatics Software Developers (CDSB)

Summary

Join us for our 2020 workshop! This year we’ll teach you how to improve your R programming skills using diverse strategies for organizing your code and projects. This will help you document your analyses so that they are easy to reproduce and to share with your collaborators (from academia to industry). As a use case, we will learn the statistical tools needed for analyzing single cell transcriptomics (scRNA-seq) data using Bioconductor. Completing this workshop will help you in all your R projects and your analyses of biological data: all your analyses will benefit from the organization skills, and the ideas behind scRNA-seq are used in many bioinformatics projects.


Requirements

Prior knowledge requirements

  • Participants should have basic to intermediate knowledge of the R programming language: variable assignment, reading files: read.csv; data structures: matrix, data.frame, list; data types: character, numeric, factor, logical, etc; installation and use of packages.
  • Know how to use RStudio.
  • Be interested in learning good practices for organizing your work and sharing your work with others.
  • Be interested in learning how to analyze biological data using R/Bioconductor packages.
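
As a rough self-check of the prior knowledge listed above, participants should be comfortable reading a snippet like the one below. It is only an illustrative sketch in base R: the sample names, values, and the temporary CSV file are made up for this example and are not part of the workshop material.

```r
# Hypothetical self-check; all names and values are invented for illustration.

# Variable assignment and basic data types
sample_ids <- c("s1", "s2", "s3")                   # character
read_counts <- c(120, 85, 240)                      # numeric
condition <- factor(c("ctrl", "treated", "ctrl"))   # factor
passed_qc <- read_counts > 100                      # logical

# Data structures: data.frame, matrix, list
samples <- data.frame(
    id = sample_ids,
    counts = read_counts,
    condition = condition,
    passed_qc = passed_qc
)
count_matrix <- matrix(
    1:6,
    nrow = 2,
    dimnames = list(c("geneA", "geneB"), sample_ids)
)
results <- list(samples = samples, counts = count_matrix)

# Writing a file and reading it back with read.csv()
csv_file <- tempfile(fileext = ".csv")
write.csv(samples, csv_file, row.names = FALSE)
samples_from_disk <- read.csv(csv_file)

# Inspect the structure of what was read back
str(samples_from_disk)
```

If every line here looks familiar (including why a factor and a character vector are different things), the level of the workshop should be a good fit. No Bioconductor packages are needed to run it.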

Technical requirements

  • Personal computer. Minimum 8GB RAM, a mouse and sufficient disk space for text files and image files. Administrator privileges to install and run utilities such as RStudio.

Overview

In recent years, R has become one of the most used programming languages for data science. The explosion in data available in many fields has increased the demand for data analysts, which is the case in Bioinformatics. R users start by learning how to use the tools others have openly shared with the international community. These R users acquire skills as they continue to analyze data and might even start to interact with R software developers through community websites such as RStudio Community, Bioconductor Support or via Twitter using the #rstats hashtag. Eventually some R users will want to write their own functions and organize their code across several projects. It is at this point that it’s useful to learn how to organize your code in order to make your life as an R programmer easier, such that you spend more time on your projects instead of remembering where your code is or what you did a few weeks ago. In order to practice these concepts, we will review the most recent methods for analyzing single cell RNA sequencing (scRNA-seq) data using R packages specialized for this goal that are freely available through Bioconductor.

The instructors of this workshop have participated in CDSB since its foundation and have attended conferences such as BioC2019 and rstudio::conf(2020), among others. In recent years we taught how to make R and Bioconductor packages, which are of great use for sharing code with others. Recently, CDSB alumni submitted their first R package to Bioconductor, which represented a large percentage increase in Latin American representation in the Bioconductor developer community, thus demonstrating that participating in a CDSB workshop has an impact beyond the one-week workshop. For 2020 we will have an applied focus while maintaining our goals at CDSB, which are:

  1. Turn (bioinformatics) software users into (bioinformatics) software developers.

  2. Foster the exchange of expertise and establish multidisciplinary collaborations.

  3. Create a community of Latin American scientists committed to the development of software and computational pipelines for (biological) data analysis.

  4. Help train users that can become local instructors and continue to grow their local communities.

The scRNA-seq portion of the workshop will be based on the book Orchestrating Single Cell Analysis with Bioconductor, whose companion paper was published in Nature Methods (DOI) and is among the most publicized papers in 2020.

This workshop is part of a long-term project to create a community of developers from Latin America. We hope to hold regular meetings in the future (similar to BioC, EuroBioc and BioCAsia) where attendees present their own software contributions. To provide a welcoming environment, please follow our code of conduct.

Program

9 am to 5:30 pm in Mexico’s central time zone (on Friday we end at 2:30 pm), with breaks and time to eat. The detailed schedule and the Zoom links will be provided through the private CDSB Google Calendar to participants who register for the event. For more schedule details, check the CDSB2020 workshop GitHub repository.

Every day we will have a help session from 8 to 9 am for those who need help installing the required software for the workshop.

Day 1

  • EBM2020 inauguration
  • Welcome to CDSB
  • Participant self-introductions
  • Workflow around RStudio projects:
    • Introduction to the project-oriented workflow.
    • Working with projects versus scripts.
    • Creating a project.
    • Using safe paths.
    • How should I name my file?

Day 2

  • Using Git and GitHub.
  • Modifying the R startup files.
  • Writing and documenting functions.
  • Debugging R code.

Day 3

  • Good practices for configuring and maintaining workspaces.
  • Remote picture / video.
  • Installing R packages from source.
  • General overview of single cell RNA-seq (scRNA-seq) data processing
  • Community-building activities
  • Overview of the scRNA-seq material

Day 4

  • Introduction to scRNA-seq
  • Introduction to scRNA-seq with Bioconductor
  • Data infrastructure and data import
  • Quality control
  • Data normalization

Day 5

  • Feature selection
  • Dimension reduction
  • Clustering and differential gene expression analysis
  • spatialLIBD: analyzing data from the Visium assay by 10x Genomics
  • Workshop evaluation
  • Closing ceremony and CDSB reminders

Instructors

Organizing committee

Code of Conduct

Sponsors

Become our sponsor

Platinum level

Gold level

Silver level

Organizers

CDSB is a node of the Mexican Bioinformatics Network (RMB in Spanish) and jointly organizes the yearly workshop with the National Node of Bioinformatics (NNB).

With support from:

CDSB
Community of Bioinformatics Software Developers

We want to help you acquire the skills to contribute open source Bioinformatics software using R
