class: center, middle, inverse, title-slide

# Introduction
## Bioconductor for single-cell transcriptomic data (scRNA-seq) – CDSB2020
### Leonardo Collado-Torres
### 2020-08-05

---

class: inverse

.center[

<a href="https://osca.bioconductor.org/"><img src="https://raw.githubusercontent.com/Bioconductor/OrchestratingSingleCellAnalysis-release/master/images/cover.png" style="width: 30%"/></a>

<a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-nc-sa/4.0/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/">Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License</a>.

<a href='https://clustrmaps.com/site/1b5pl' title='Visit tracker'><img src='//clustrmaps.com/map_v2.png?cl=ffffff&w=150&t=n&d=rP3KLyAMuzVNcJFL-_C-B0XnLNVy8Sp6a8HDaKEnSzc'/></a>

]

.footnote[ Download the materials for this course with `usethis::use_course('comunidadbioinfo/cdsb2020')` or view online at [**comunidadbioinfo.github.io/cdsb2020**](http://comunidadbioinfo.github.io/cdsb2020).]

<style type="text/css">
/* From https://github.com/yihui/xaringan/issues/147 */
.scroll-output {
  height: 80%;
  overflow-y: scroll;
}

/* https://stackoverflow.com/questions/50919104/horizontally-scrollable-output-on-xaringan-slides */
pre {
  max-width: 100%;
  overflow-x: scroll;
}

/* From https://github.com/yihui/xaringan/wiki/Font-Size */
.tiny{
  font-size: 40%
}

/* From https://github.com/yihui/xaringan/wiki/Title-slide */
.title-slide {
  background-image: url(https://raw.githubusercontent.com/Bioconductor/OrchestratingSingleCellAnalysis/master/images/Workflow.png);
  background-size: 33%;
  background-position: 0% 100%
}
</style>

---

# Course origins

--

* The book [**Orchestrating Single Cell Analysis With Bioconductor**](https://osca.bioconductor.org/) written by [Aaron Lun](https://www.linkedin.com/in/aaron-lun-869b5894/), [Robert Amezquita](https://robertamezquita.github.io/), [Stephanie Hicks](https://www.stephaniehicks.com/) and [Raphael Gottardo](http://rglab.org)

--

Amezquita, R.A., Lun, A.T.L., Becht, E. et al. Orchestrating single-cell analysis with Bioconductor. _Nat Methods_ 17, 137–145 (2020).
DOI: [10.1038/s41592-019-0654-x](https://doi.org/10.1038/s41592-019-0654-x)

--

* [**WEHI scRNA-seq course**](https://drive.google.com/drive/folders/1cn5d-Ey7-kkMiex8-74qxvxtCQT6o72h) created by [Peter Hickey](https://www.peterhickey.org/)

---

class: center, middle

# Instructor

**Leonardo Collado-Torres**

<img src="http://lcolladotor.github.io/authors/admin/avatar_hub730ffb954e879fe0ab174cacb839b41_1326712_270x270_fill_lanczos_center_2.png" />

* Website: [lcolladotor.github.io](http://lcolladotor.github.io)
* Twitter: [fellgernon](https://twitter.com/fellgernon)

---

background-image: url(img/01-intro/Slide1.png)
background-size: 100%

---

background-image: url(img/01-intro/Slide2.png)
background-size: 100%

---

background-image: url(img/01-intro/Slide3.png)
background-size: 100%

---

background-image: url(img/01-intro/Slide4.png)
background-size: 100%

---

# Prerequisites

.scroll-output[

Install R 4.0.x from [CRAN](https://cran.r-project.org/) and then install the following R packages:

```r
## To install Bioconductor packages
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

## Install the R packages we need
BiocManager::install(
    c(
        'SingleCellExperiment',
        'usethis',
        'here',
        'scran',
        'scater',
        'scRNAseq',
        'org.Mm.eg.db',
        'AnnotationHub',
        'ExperimentHub',
        'BiocFileCache',
        'DropletUtils',
        'EnsDb.Hsapiens.v86',
        'TENxPBMCData',
        'BiocSingular',
        'batchelor',
        'uwot',
        'Rtsne',
        'pheatmap',
        'fossil',
        'ggplot2',
        'cowplot',
        'RColorBrewer',
        'plotly',
        'iSEE',
        'pryr',
        'spatialLIBD',
        'sessioninfo',
        'scPipe'
    )
)
```

You should also install [RStudio](https://rstudio.com/products/rstudio/download/#download) version 1.3.1 or, preferably, the newest release.

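After installing, it can help to confirm that nothing was skipped. The following is a minimal sketch using base R only; `find_missing_packages()` is a hypothetical helper that is not part of the course materials, and the short package vector is just an illustrative subset of the list above.

```r
## Hypothetical helper (not from the course materials): return the names in
## `required` that do not appear among the installed packages.
find_missing_packages <- function(required,
                                  installed = rownames(installed.packages())) {
    setdiff(required, installed)
}

## Illustrative subset of the course packages
course_packages <- c('SingleCellExperiment', 'scran', 'scater', 'scRNAseq')

not_installed <- find_missing_packages(course_packages)
if (length(not_installed) > 0) {
    message('Still missing: ', paste(not_installed, collapse = ', '))
} else {
    message('All listed packages are installed.')
}

## Check that the running R version is at least 4.0.0
getRversion() >= '4.0.0'
```

Any package reported as missing can be passed again to `BiocManager::install()`.
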
]

---

# Course teaching assistants

* [Ana Betty Villaseñor Altamirano](https://twitter.com/AnaBetty2304)
* [Aurora Labastida](https://twitter.com/alabasti1)
* [Carlos Gabriel Aguilar Pérez](https://twitter.com/Carlos_Aplysia)

---

# Course materials

--

* Download them with `usethis::use_course('comunidadbioinfo/cdsb2020')`

--

* Browse them online at [**comunidadbioinfo.github.io/cdsb2020**](http://comunidadbioinfo.github.io/cdsb2020)

--

* **Clone** the GitHub repository, which makes it easier to update your files to the latest version with *git pull*

```bash
## If you have SSH keys set up
git clone git@github.com:comunidadbioinfo/cdsb2020.git

## or via https
git clone https://github.com/comunidadbioinfo/cdsb2020.git
```

From R:

```r
git2r::clone('https://github.com/comunidadbioinfo/cdsb2020', 'cdsb2020')
```

---

# Create your own project

I recommend that you create your own project and turn on version control

```r
usethis::create_project('~/Desktop/cdsb2020_leo')
```

```r
## Create a setup file
usethis::use_r('00-setup.R')
```

---

In this setup script, save the following commands

```r
## Create the Git repository
usethis::use_git()

## Configure your GitHub connection if needed
usethis::browse_github_token()
usethis::edit_r_environ() ## and then restart R

## Use GitHub
usethis::use_github() ## create a commit, then run this command

## Start your notes on the introduction to scRNA-seq
usethis::use_r('01-introduction.R')
```

See the example at [**cdsb2020_leo**](https://github.com/lcolladotor/cdsb2020_leo)

---

# Asking for help

--

* Use Zoom's blue hand: "raise your **hand**"

--

* Open an **issue** at [cdsb2020](https://github.com/comunidadbioinfo/cdsb2020/issues). Remember to include a reproducible example!

--

* More generally, through the [**Bioconductor Support Website**](https://support.bioconductor.org/), tagging the relevant R package.

--

* Some related blog posts: [**How to ask for help for Bioconductor packages**](http://lcolladotor.github.io/2017/03/06/how-to-ask-for-help-for-bioconductor-packages/#.XnjLRNNKh0s), [**Asking for help is challenging but is typically worth it**](http://lcolladotor.github.io/2018/11/12/asking-for-help-is-challenging-but-is-typically-worth-it/#.XnjLf9NKh0s), and [**Learning from our search history**](http://lcolladotor.github.io/2020/02/12/learning-from-our-search-history/)

--

* [Jenny Bryan](https://twitter.com/JennyBryan)'s talk at `rstudio::conf(2020)`: [**Object of type ‘closure’ is not subsettable**](https://resources.rstudio.com/rstudio-conf-2020/object-of-type-closure-is-not-subsettable-jenny-bryan)

---

background-image: url(https://raw.githubusercontent.com/Bioconductor/OrchestratingSingleCellAnalysis-release/master/images/cover.png)
background-size: contain

---

background-image: url(https://raw.githubusercontent.com/Bioconductor/OrchestratingSingleCellAnalysis-release/master/images/SingleCellExperiment.png)
background-size: contain

---

background-image: url(https://raw.githubusercontent.com/Bioconductor/OrchestratingSingleCellAnalysis-release/master/images/Workflow.png)
background-size: contain

---

background-image: url(http://research.libd.org/spatialLIBD/reference/figures/README-access_data-1.png)
background-size: contain

---

# Quick introduction: [OSCA](https://osca.bioconductor.org/overview.html#quick-start)

```r
library('scRNAseq')
library('scater')
library('scran')
library('plotly')
```

---

.scroll-output[

```r
sce <- scRNAseq::MacoskoRetinaData()
```

```
## snapshotDate(): 2020-04-27
```

```
## see ?scRNAseq and browseVignettes('scRNAseq') for documentation
```

```
## loading from cache
```

```
## see ?scRNAseq and browseVignettes('scRNAseq') for documentation
```

```
## loading from cache
```

```r
## How large is the data?
pryr::object_size(sce)
```

```
## Registered S3 method overwritten by 'pryr':
##   method      from
##   print.bytes Rcpp
```

```
## 461 MB
```

```r
## What does the object look like?
sce
```

```
## class: SingleCellExperiment
## dim: 24658 49300
## metadata(0):
## assays(1): counts
## rownames(24658): KITL TMTC3 ... 1110059M19RIK GM20861
## rowData names(0):
## colnames(49300): r1_GGCCGCAGTCCG r1_CTTGTGCGGGAA ... p1_TAACGCGCTCCT
##   p1_ATTCTTGTTCTT
## colData names(2): cell.id cluster
## reducedDimNames(0):
## altExpNames(0):
```

]

---

.scroll-output[

```r
# Quality control.
es.mito <- grepl("^MT-", rownames(sce))
qcstats <- scater::perCellQCMetrics(sce, subsets = list(Mito = es.mito))
filtered <- scater::quickPerCellQC(qcstats, percent_subsets = "subsets_Mito_percent")
sce <- sce[, !filtered$discard]

# Normalization.
sce <- scater::logNormCounts(sce)

# Gene selection.
dec <- scran::modelGeneVar(sce)
hvg <- scran::getTopHVGs(dec, prop = 0.1)

# Dimensionality reduction.
set.seed(1234)
sce <- scater::runPCA(sce, ncomponents = 25, subset_row = hvg)
sce <- scater::runUMAP(sce, dimred = 'PCA', external_neighbors = TRUE)

# Clustering.
g <- scran::buildSNNGraph(sce, use.dimred = 'PCA')
sce$clusters <- factor(igraph::cluster_louvain(g)$membership)
```

]

---

```r
# Visualization.
scater::plotUMAP(sce, colour_by = "clusters")
```

![](01-introduction_files/figure-html/quick_intro_04-1.png)<!-- -->

---

```r
# Interactive visualization.
p <- scater::plotUMAP(sce, colour_by = "clusters")
plotly::ggplotly(p)
```

---

class: middle

.center[

# Thank you!

The slides were made with the R package [**xaringan**](https://github.com/yihui/xaringan) and configured with [**xaringanthemer**](https://github.com/gadenbuie/xaringanthemer).

This course is based on the book [**Orchestrating Single Cell Analysis with Bioconductor**](https://osca.bioconductor.org/) by [Aaron Lun](https://www.linkedin.com/in/aaron-lun-869b5894/), [Robert Amezquita](https://robertamezquita.github.io/), [Stephanie Hicks](https://www.stephaniehicks.com/) and [Raphael Gottardo](http://rglab.org), as well as the [**scRNA-seq course for WEHI**](https://drive.google.com/drive/folders/1cn5d-Ey7-kkMiex8-74qxvxtCQT6o72h) created by [Peter Hickey](https://www.peterhickey.org/).

You can find the files for this workshop at [comunidadbioinfo/cdsb2020](https://github.com/comunidadbioinfo/cdsb2020).

Instructor: [**Leonardo Collado-Torres**](http://lcolladotor.github.io/).

<a href="https://www.libd.org"><img src="img/LIBD_logo.jpg" style="width: 20%" /></a>

]

.footnote[Download the materials with `usethis::use_course('comunidadbioinfo/cdsb2020')` or browse them online at [**comunidadbioinfo.github.io/cdsb2020**](http://comunidadbioinfo.github.io/cdsb2020).]

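---

# Appendix: how the quality-control filter flags cells

.scroll-output[

The quick-start pipeline discards the cells flagged by `scater::quickPerCellQC()`. With its default settings, that function treats a cell as an outlier when a QC metric lies more than 3 median absolute deviations (MADs) from the median; for the mitochondrial percentage, only unusually high values are flagged. The sketch below illustrates that idea in base R. `flag_high_outliers()` and the toy percentages are hypothetical and only meant for illustration; this is not the scater implementation, which also handles log-transformed metrics and batches.

```r
## Hypothetical helper: flag values more than `nmads` MADs above the median.
## stats::mad() scales the raw MAD by 1.4826 by default.
flag_high_outliers <- function(x, nmads = 3) {
    threshold <- median(x) + nmads * mad(x)
    x > threshold
}

## Toy mitochondrial percentages for six cells (made-up numbers)
mito_percent <- c(2.1, 3.4, 2.8, 3.0, 2.5, 45.0)

## Only the last cell exceeds the threshold
flag_high_outliers(mito_percent)
```

Using the median and MAD instead of the mean and standard deviation keeps the threshold from being pulled upward by the very cells it is meant to detect.

]
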
---

# R session details

.scroll-output[
.tiny[

```r
options(width = 120)
sessioninfo::session_info()
```

```
## ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
##  setting  value
##  version  R version 4.0.2 (2020-06-22)
##  os       macOS Catalina 10.15.5
##  system   x86_64, darwin17.0
##  ui       X11
##  language (EN)
##  collate  en_US.UTF-8
##  ctype    en_US.UTF-8
##  tz       America/New_York
##  date     2020-08-02
##
## ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
##  package                * version  date       lib source
##  AnnotationDbi            1.50.3   2020-07-25 [1] Bioconductor
##  AnnotationHub            2.20.0   2020-04-27 [1] Bioconductor
##  assertthat               0.2.1    2019-03-21 [1] CRAN (R 4.0.0)
##  beeswarm                 0.2.3    2016-04-25 [1] CRAN (R 4.0.0)
##  Biobase                * 2.48.0   2020-04-27 [1] Bioconductor
##  BiocFileCache            1.12.0   2020-04-27 [1] Bioconductor
##  BiocGenerics           * 0.34.0   2020-04-27 [1] Bioconductor
##  BiocManager              1.30.10  2019-11-16 [1] CRAN (R 4.0.0)
##  BiocNeighbors            1.6.0    2020-04-27 [1] Bioconductor
##  BiocParallel             1.22.0   2020-04-27 [1] Bioconductor
##  BiocSingular             1.4.0    2020-04-27 [1] Bioconductor
##  BiocVersion              3.11.1   2020-04-07 [1] Bioconductor
##  bit                      1.1-15.2 2020-02-10 [1] CRAN (R 4.0.0)
##  bit64                    0.9-7.1  2020-07-15 [1] CRAN (R 4.0.2)
##  bitops                   1.0-6    2013-08-17 [1] CRAN (R 4.0.0)
##  blob                     1.2.1    2020-01-20 [1] CRAN (R 4.0.0)
##  cli                      2.0.2    2020-02-28 [1] CRAN (R 4.0.0)
##  codetools                0.2-16   2018-12-24 [1] CRAN (R 4.0.2)
##  colorout               * 1.2-2    2020-03-16 [1] Github (jalvesaq/colorout@726d681)
##  colorspace               1.4-1    2019-03-18 [1] CRAN (R 4.0.0)
##  cowplot                  1.0.0    2019-07-11 [1] CRAN (R 4.0.0)
##  crayon                   1.3.4    2017-09-16 [1] CRAN (R 4.0.0)
##  curl                     4.3      2019-12-02 [1] CRAN (R 4.0.0)
##  data.table               1.13.0   2020-07-24 [1] CRAN (R 4.0.2)
##  DBI                      1.1.0    2019-12-15 [1] CRAN (R 4.0.0)
##  dbplyr                   1.4.4    2020-05-27 [1] CRAN (R 4.0.2)
##  DelayedArray           * 0.14.1   2020-07-14 [1] Bioconductor
##  DelayedMatrixStats       1.10.1   2020-07-03 [1] Bioconductor
##  digest                   0.6.25   2020-02-23 [1] CRAN (R 4.0.0)
##  dplyr                    1.0.0    2020-05-29 [1] CRAN (R 4.0.2)
##  dqrng                    0.2.1    2019-05-17 [1] CRAN (R 4.0.0)
##  edgeR                    3.30.3   2020-06-02 [1] Bioconductor
##  ellipsis                 0.3.1    2020-05-15 [1] CRAN (R 4.0.0)
##  evaluate                 0.14     2019-05-28 [1] CRAN (R 4.0.0)
##  ExperimentHub            1.14.0   2020-04-27 [1] Bioconductor
##  fansi                    0.4.1    2020-01-08 [1] CRAN (R 4.0.0)
##  farver                   2.0.3    2020-01-16 [1] CRAN (R 4.0.0)
##  fastmap                  1.0.1    2019-10-08 [1] CRAN (R 4.0.0)
##  generics                 0.0.2    2018-11-29 [1] CRAN (R 4.0.0)
##  GenomeInfoDb           * 1.24.2   2020-06-15 [1] Bioconductor
##  GenomeInfoDbData         1.2.3    2020-04-16 [1] Bioconductor
##  GenomicRanges          * 1.40.0   2020-04-27 [1] Bioconductor
##  ggbeeswarm               0.6.0    2017-08-07 [1] CRAN (R 4.0.0)
##  ggplot2                * 3.3.2    2020-06-19 [1] CRAN (R 4.0.2)
##  glue                     1.4.1    2020-05-13 [1] CRAN (R 4.0.0)
##  gridExtra                2.3      2017-09-09 [1] CRAN (R 4.0.0)
##  gtable                   0.3.0    2019-03-25 [1] CRAN (R 4.0.0)
##  htmltools                0.5.0    2020-06-16 [1] CRAN (R 4.0.2)
##  htmlwidgets              1.5.1    2019-10-08 [1] CRAN (R 4.0.0)
##  httpuv                   1.5.4    2020-06-06 [1] CRAN (R 4.0.2)
##  httr                     1.4.2    2020-07-20 [1] CRAN (R 4.0.2)
##  igraph                   1.2.5    2020-03-19 [1] CRAN (R 4.0.0)
##  interactiveDisplayBase   1.26.3   2020-06-02 [1] Bioconductor
##  IRanges                * 2.22.2   2020-05-21 [1] Bioconductor
##  irlba                    2.3.3    2019-02-05 [1] CRAN (R 4.0.0)
##  jsonlite                 1.7.0    2020-06-25 [1] CRAN (R 4.0.0)
##  knitr                    1.29     2020-06-23 [1] CRAN (R 4.0.0)
##  labeling                 0.3      2014-08-23 [1] CRAN (R 4.0.0)
##  later                    1.1.0.1  2020-06-05 [1] CRAN (R 4.0.2)
##  lattice                  0.20-41  2020-04-02 [1] CRAN (R 4.0.2)
##  lazyeval                 0.2.2    2019-03-15 [1] CRAN (R 4.0.0)
##  lifecycle                0.2.0    2020-03-06 [1] CRAN (R 4.0.0)
##  limma                    3.44.3   2020-06-12 [1] Bioconductor
##  locfit                   1.5-9.4  2020-03-25 [1] CRAN (R 4.0.0)
##  magrittr                 1.5      2014-11-22 [1] CRAN (R 4.0.0)
##  Matrix                   1.2-18   2019-11-27 [1] CRAN (R 4.0.2)
##  matrixStats            * 0.56.0   2020-03-13 [1] CRAN (R 4.0.0)
##  memoise                  1.1.0    2017-04-21 [1] CRAN (R 4.0.0)
##  mime                     0.9      2020-02-04 [1] CRAN (R 4.0.0)
##  munsell                  0.5.0    2018-06-12 [1] CRAN (R 4.0.0)
##  pillar                   1.4.6    2020-07-10 [1] CRAN (R 4.0.2)
##  pkgconfig                2.0.3    2019-09-22 [1] CRAN (R 4.0.0)
##  plotly                 * 4.9.2.1  2020-04-04 [1] CRAN (R 4.0.0)
##  promises                 1.1.1    2020-06-09 [1] CRAN (R 4.0.2)
##  pryr                     0.1.4    2018-02-18 [1] CRAN (R 4.0.0)
##  purrr                    0.3.4    2020-04-17 [1] CRAN (R 4.0.0)
##  R6                       2.4.1    2019-11-12 [1] CRAN (R 4.0.0)
##  rappdirs                 0.3.1    2016-03-28 [1] CRAN (R 4.0.0)
##  Rcpp                     1.0.5    2020-07-06 [1] CRAN (R 4.0.2)
##  RCurl                    1.98-1.2 2020-04-18 [1] CRAN (R 4.0.0)
##  rlang                    0.4.7    2020-07-09 [1] CRAN (R 4.0.2)
##  rmarkdown                2.3      2020-06-18 [1] CRAN (R 4.0.0)
##  RSpectra                 0.16-0   2019-12-01 [1] CRAN (R 4.0.0)
##  RSQLite                  2.2.0    2020-01-07 [1] CRAN (R 4.0.0)
##  rsvd                     1.0.3    2020-02-17 [1] CRAN (R 4.0.0)
##  S4Vectors              * 0.26.1   2020-05-16 [1] Bioconductor
##  scales                   1.1.1    2020-05-11 [1] CRAN (R 4.0.0)
##  scater                 * 1.16.2   2020-06-26 [1] Bioconductor
##  scran                  * 1.16.0   2020-04-27 [1] Bioconductor
##  scRNAseq               * 2.2.0    2020-05-07 [1] Bioconductor
##  sessioninfo              1.1.1    2018-11-05 [1] CRAN (R 4.0.0)
##  shiny                    1.5.0    2020-06-23 [1] CRAN (R 4.0.2)
##  showtext                 0.8-1    2020-05-25 [1] CRAN (R 4.0.2)
##  showtextdb               3.0      2020-06-04 [1] CRAN (R 4.0.2)
##  SingleCellExperiment   * 1.10.1   2020-04-28 [1] Bioconductor
##  statmod                  1.4.34   2020-02-17 [1] CRAN (R 4.0.0)
##  stringi                  1.4.6    2020-02-17 [1] CRAN (R 4.0.0)
##  stringr                  1.4.0    2019-02-10 [1] CRAN (R 4.0.0)
##  SummarizedExperiment   * 1.18.2   2020-07-14 [1] Bioconductor
##  sysfonts                 0.8.1    2020-05-08 [1] CRAN (R 4.0.0)
##  tibble                   3.0.3    2020-07-10 [1] CRAN (R 4.0.2)
##  tidyr                    1.1.0    2020-05-20 [1] CRAN (R 4.0.2)
##  tidyselect               1.1.0    2020-05-11 [1] CRAN (R 4.0.2)
##  uwot                     0.1.8    2020-03-16 [1] CRAN (R 4.0.0)
##  vctrs                    0.3.2    2020-07-15 [1] CRAN (R 4.0.2)
##  vipor                    0.4.5    2017-03-22 [1] CRAN (R 4.0.0)
##  viridis                  0.5.1    2018-03-29 [1] CRAN (R 4.0.0)
##  viridisLite              0.3.0    2018-02-01 [1] CRAN (R 4.0.0)
##  whisker                  0.4      2019-08-28 [1] CRAN (R 4.0.0)
##  withr                    2.2.0    2020-04-20 [1] CRAN (R 4.0.0)
##  xaringan                 0.16     2020-03-31 [1] CRAN (R 4.0.0)
##  xaringanthemer         * 0.3.0    2020-05-04 [1] CRAN (R 4.0.0)
##  xfun                     0.16     2020-07-24 [1] CRAN (R 4.0.2)
##  xtable                   1.8-4    2019-04-21 [1] CRAN (R 4.0.0)
##  XVector                  0.28.0   2020-04-27 [1] Bioconductor
##  yaml                     2.2.1    2020-02-01 [1] CRAN (R 4.0.0)
##  zlibbioc                 1.34.0   2020-04-27 [1] Bioconductor
##
## [1] /Library/Frameworks/R.framework/Versions/4.0/Resources/library
```

]]