From Bioconductor users to developers: our first community submission
A couple of years ago we started the Community of Bioinformatics Software Developers (CDSB in Spanish) because we were concerned about the very low representation of Latin Americans in the Bioconductor community, and in the R community in general. For the full story, check this blog post.
Since we started CDSB, part of our goal has been to help Bioconductor users transition into developers. To achieve this, we organized low-cost, one-week-long courses in Mexico during the summers of 2018 and 2019 in partnership with the TIBs leadership (NNB-UNAM) and RMB. We plan to continue organizing these courses: expect a 2020 announcement next week!
While these workshops allow us to reach up to 40 students in person for a week, we have been borrowing methods from others to try to interact more frequently and help throughout this process [1]. But we have now gone a bit further.
In 2018, CDSB students worked on R/Bioconductor packages for two days during our workshop. One of them, Joselyn Chávez, picked up the regutools project and continued working on it with advice from Heladia Salgado. We eventually set up meetings where Alejandro Reyes and I would advise Joselyn, Carmina Barberena-Jonas and Jesus Emiliano Sotelo-Fonseca on how best to proceed.
I just sent @Bioconductor's 2020 conf deadline reminders to @CDSBMexico @LieberInstitute @GenomicsAtJHU @LatinR_Conf @rOpenSci @jhubiostat
It's the conf that started my #rstats career & I highly recommend it!
BioC usage intro https://t.co/8NaV68vUsA https://t.co/ATzkKs7c04 pic.twitter.com/km8vRfyHcY
🇲🇽 Leonardo Collado-Torres (@lcolladotor), February 21, 2020
We've known for months that the deadline for BioC2020 talk/poster proposals was March 3rd. So we designed a plan that would allow them to submit regutools before that deadline and then submit a proposal to present it (as well as proposals to present their own research projects), in order to increase their likelihood of getting a BioC2020 travel scholarship.
And we were able to accomplish this plan! Well, at least the part under their and our control. That is, check out the regutools Bioconductor submission. Thus we are incredibly excited to announce that the CDSB website now has a "Bioconductor Developers Alumni" section!
We are also thrilled to announce that the RegulonDB team has given us the go-ahead to write a manuscript about regutools. So you'll soon see a pre-print about it.
I personally really hope that Joselyn, Carmina, Emiliano and many other CDSB alumni will be able to go to BioC2020 and other R conferences.
We couldn't have gotten this far without all the support we've received over the years. So I would like to thank all our previous sponsors [2], colleagues who've encouraged us to keep going, and CDSB alumni like Joselyn, Carmina, and Emiliano who believed in our ideas and spent their own time making them a reality.
I leave you here with a short introduction to regutools.
regutools
The goal of regutools is to provide an R interface for extracting and processing data from RegulonDB. This package was created as a collaboration by members of the Community of Bioinformatics Software Developers (CDSB in Spanish).
For more details, please check the documentation website or the Bioconductor package landing page here.
Installation
You can install the released version of regutools from Bioconductor with:
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("regutools")
## Check that you have a valid Bioconductor installation
BiocManager::valid()
And the development version from GitHub with:
## Since the package is currently going through the Bioconductor
## revision process, you need to use this code instead:
BiocManager::install("ComunidadBioinfo/regutools")
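After running either installation command, it can help to confirm what actually got installed. The following is a minimal sketch using only base R; `installed_version()` is a hypothetical helper written for this post, not a regutools or BiocManager function.

```r
## Hypothetical helper (not part of regutools or BiocManager):
## return the installed version of a package as a string,
## or NA if the package is not installed
installed_version <- function(pkg) {
    if (requireNamespace(pkg, quietly = TRUE)) {
        as.character(utils::packageVersion(pkg))
    } else {
        NA_character_
    }
}

regutools_version <- installed_version("regutools")
if (is.na(regutools_version)) {
    message("regutools is not installed yet")
} else {
    message("regutools version ", regutools_version, " is available")
}
```

While the package is under review, a version such as 0.99.0 (the one shown in the reproducibility information of this post) indicates the pre-release GitHub build.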
Example
This is a basic example which shows you how to use regutools. For more details, please check the vignette.
library('regutools')
## basic example code
## Connect to the RegulonDB database if necessary
if(!exists('regulondb_conn')) regulondb_conn <- connect_database()
## snapshotDate(): 2019-10-29
## Build a regulondb object
e_coli_regulondb <-
    regulondb(
        database_conn = regulondb_conn,
        organism = "E.coli",
        database_version = "1",
        genome_version = "1"
    )
## Get the araC regulators
araC_regulation <-
    get_gene_regulators(
        e_coli_regulondb,
        genes = c("araC"),
        format = "multirow",
        output.type = "TF"
    )
## Summarize the araC regulation
get_regulatory_summary(e_coli_regulondb, araC_regulation)
## regulondb_result with 3 rows and 7 columns
##         TF Regulated_genes_per_TF          Percent Activator Repressor     Dual Regulated_genes
##   <factor>               <factor>         <factor>  <factor>  <factor> <factor>        <factor>
## 1     AraC                      1 33.3333333333333         0         0        1            araC
## 2      CRP                      1 33.3333333333333         1         0        0            araC
## 3     XylR                      1 33.3333333333333         0         1        0            araC
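Note that every column in the summary above is reported as a factor, so the counts and percentages need converting before you can do arithmetic on them. Here is a minimal sketch of that conversion. To keep it runnable without a database connection, it assumes a plain data.frame hand-copied from the printed output above rather than the actual regulondb_result object, and `factor_to_numeric()` is a hypothetical helper, not a regutools function.

```r
## Hand-copied from the printed summary above; this is NOT a live query
## and is a plain data.frame, not a regulondb_result object
araC_summary <- data.frame(
    TF = c("AraC", "CRP", "XylR"),
    Regulated_genes_per_TF = c("1", "1", "1"),
    Percent = rep("33.3333333333333", 3),
    Activator = c("0", "1", "0"),
    Repressor = c("0", "0", "1"),
    Dual = c("1", "0", "0"),
    Regulated_genes = c("araC", "araC", "araC"),
    stringsAsFactors = TRUE
)

## Factors store integer level codes, so calling as.numeric() directly
## would return the codes; go through as.character() first
factor_to_numeric <- function(x) as.numeric(as.character(x))

numeric_cols <- c(
    "Regulated_genes_per_TF", "Percent", "Activator", "Repressor", "Dual"
)
araC_summary[numeric_cols] <-
    lapply(araC_summary[numeric_cols], factor_to_numeric)

## Which transcription factors act purely as activators of araC?
activators <- as.character(araC_summary$TF[araC_summary$Activator > 0])
print(activators)
```

Whether the same subsetting works unchanged on the object returned by get_regulatory_summary() depends on that class's methods, so check the vignette before relying on it.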
Citation
Below is the citation output from using citation('regutools') in R. Please run this yourself to check for any updates on how to cite regutools.
citation('regutools')
##
## Chávez J, Barberena-Jonas C, Sotelo-Fonseca JE, Alquicira-Hernandez J,
## Salgado H, Collado-Torres L, Reyes A (2020). _regutools: an R package
## for data extraction from RegulonDB_. doi: 10.18129/B9.bioc.regutools
## (URL: https://doi.org/10.18129/B9.bioc.regutools),
## https://github.com/comunidadbioinfo/regutools - R package version
## 0.99.0, <URL: http://www.bioconductor.org/packages/regutools>.
##
## Chávez J, Barberena-Jonas C, Sotelo-Fonseca JE, Alquicira-Hernandez J,
## Salgado H, Collado-Torres L, Reyes A (2020). "Programmatic access to
## bacterial regulatory networks with regutools." _bioRxiv_. doi:
## 10.1101/xxxyyy (URL: https://doi.org/10.1101/xxxyyy), <URL:
## https://doi.org/10.1101/xxxyyy>.
##
## To see these entries in BibTeX format, use 'print(<citation>,
## bibtex=TRUE)', 'toBibtex(.)', or set
## 'options(citation.bibtex.max=999)'.
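As the citation output itself suggests, the entries can be exported in BibTeX format. The sketch below uses base R's `utils::citation()` and `utils::toBibtex()`. So that it runs even where regutools is not installed, it falls back to citing R itself via the "base" package; that fallback is an assumption made for illustration only.

```r
## Use regutools when it is installed; otherwise fall back to "base"
## (which cites R itself) so that the sketch still runs
pkg <- if (requireNamespace("regutools", quietly = TRUE)) {
    "regutools"
} else {
    "base"
}

## toBibtex() turns the citation object into BibTeX-formatted lines
bib <- utils::toBibtex(utils::citation(pkg))
writeLines(bib)
```

To save the entries for a reference manager, pass a file path as the second argument to writeLines(), for example a file ending in .bib.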
Reproducibility information
options(width = 120)
sessioninfo::session_info()
## ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
## setting value
## version R version 3.6.2 (2019-12-12)
## os macOS Catalina 10.15.2
## system x86_64, darwin15.6.0
## ui X11
## language (EN)
## collate en_US.UTF-8
## ctype en_US.UTF-8
## tz America/New_York
## date 2020-02-29
##
## ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
## package * version date lib source
## acepack 1.4.1 2016-10-29 [1] CRAN (R 3.6.0)
## AnnotationDbi 1.48.0 2019-10-29 [1] Bioconductor
## AnnotationFilter 1.10.0 2019-10-29 [1] Bioconductor
## AnnotationHub 2.18.0 2019-10-29 [1] Bioconductor
## askpass 1.1 2019-01-13 [1] CRAN (R 3.6.0)
## assertthat 0.2.1 2019-03-21 [1] CRAN (R 3.6.0)
## backports 1.1.5 2019-10-02 [1] CRAN (R 3.6.0)
## base64enc 0.1-3 2015-07-28 [1] CRAN (R 3.6.0)
## Biobase 2.46.0 2019-10-29 [1] Bioconductor
## BiocFileCache 1.10.2 2019-11-08 [1] Bioconductor
## BiocGenerics 0.32.0 2019-10-29 [1] Bioconductor
## BiocManager 1.30.10 2019-11-16 [1] CRAN (R 3.6.1)
## BiocParallel 1.20.1 2019-12-21 [1] Bioconductor
## BiocVersion 3.10.1 2019-06-06 [1] Bioconductor
## biomaRt 2.42.0 2019-10-29 [1] Bioconductor
## Biostrings 2.54.0 2019-10-29 [1] Bioconductor
## biovizBase 1.34.1 2019-12-04 [1] Bioconductor
## bit 1.1-15.2 2020-02-10 [1] CRAN (R 3.6.0)
## bit64 0.9-7 2017-05-08 [1] CRAN (R 3.6.0)
## bitops 1.0-6 2013-08-17 [1] CRAN (R 3.6.0)
## blob 1.2.1 2020-01-20 [1] CRAN (R 3.6.0)
## blogdown 0.17 2019-11-13 [1] CRAN (R 3.6.1)
## bookdown 0.17 2020-01-11 [1] CRAN (R 3.6.0)
## BSgenome 1.54.0 2019-10-29 [1] Bioconductor
## checkmate 1.9.4 2019-07-04 [1] CRAN (R 3.6.0)
## cli 2.0.1 2020-01-08 [1] CRAN (R 3.6.0)
## cluster 2.1.0 2019-06-19 [1] CRAN (R 3.6.2)
## colorout * 1.2-1 2019-05-07 [1] Github (jalvesaq/colorout@7ea9440)
## colorspace 1.4-1 2019-03-18 [1] CRAN (R 3.6.0)
## crayon 1.3.4 2017-09-16 [1] CRAN (R 3.6.0)
## curl 4.3 2019-12-02 [1] CRAN (R 3.6.0)
## data.table 1.12.8 2019-12-09 [1] CRAN (R 3.6.1)
## DBI 1.1.0 2019-12-15 [1] CRAN (R 3.6.0)
## dbplyr 1.4.2 2019-06-17 [1] CRAN (R 3.6.0)
## DelayedArray 0.12.2 2020-01-06 [1] Bioconductor
## dichromat 2.0-0 2013-01-24 [1] CRAN (R 3.6.0)
## digest 0.6.25 2020-02-23 [1] CRAN (R 3.6.0)
## dplyr 0.8.4 2020-01-31 [1] CRAN (R 3.6.0)
## ensembldb 2.10.2 2019-11-20 [1] Bioconductor
## evaluate 0.14 2019-05-28 [1] CRAN (R 3.6.0)
## fansi 0.4.1 2020-01-08 [1] CRAN (R 3.6.0)
## fastmap 1.0.1 2019-10-08 [1] CRAN (R 3.6.0)
## foreign 0.8-75 2020-01-20 [1] CRAN (R 3.6.0)
## Formula 1.2-3 2018-05-03 [1] CRAN (R 3.6.0)
## GenomeInfoDb 1.22.0 2019-10-29 [1] Bioconductor
## GenomeInfoDbData 1.2.2 2019-10-31 [1] Bioconductor
## GenomicAlignments 1.22.1 2019-11-12 [1] Bioconductor
## GenomicFeatures 1.38.1 2020-01-22 [1] Bioconductor
## GenomicRanges 1.38.0 2019-10-29 [1] Bioconductor
## ggplot2 3.2.1 2019-08-10 [1] CRAN (R 3.6.0)
## glue 1.3.1 2019-03-12 [1] CRAN (R 3.6.0)
## graph 1.64.0 2019-10-29 [1] Bioconductor
## gridExtra 2.3 2017-09-09 [1] CRAN (R 3.6.0)
## gtable 0.3.0 2019-03-25 [1] CRAN (R 3.6.0)
## Gviz 1.30.1 2020-01-23 [1] Bioconductor
## Hmisc 4.3-0 2019-11-07 [1] CRAN (R 3.6.0)
## hms 0.5.3 2020-01-08 [1] CRAN (R 3.6.0)
## htmlTable 1.13.3 2019-12-04 [1] CRAN (R 3.6.1)
## htmltools 0.4.0 2019-10-04 [1] CRAN (R 3.6.0)
## htmlwidgets 1.5.1 2019-10-08 [1] CRAN (R 3.6.0)
## httpuv 1.5.2 2019-09-11 [1] CRAN (R 3.6.0)
## httr 1.4.1 2019-08-05 [1] CRAN (R 3.6.0)
## igraph 1.2.4.2 2019-11-27 [1] CRAN (R 3.6.0)
## interactiveDisplayBase 1.24.0 2019-10-29 [1] Bioconductor
## IRanges 2.20.2 2020-01-13 [1] Bioconductor
## jpeg 0.1-8.1 2019-10-24 [1] CRAN (R 3.6.1)
## knitr 1.27 2020-01-16 [1] CRAN (R 3.6.0)
## later 1.0.0 2019-10-04 [1] CRAN (R 3.6.0)
## lattice 0.20-38 2018-11-04 [1] CRAN (R 3.6.2)
## latticeExtra 0.6-29 2019-12-19 [1] CRAN (R 3.6.0)
## lazyeval 0.2.2 2019-03-15 [1] CRAN (R 3.6.0)
## lifecycle 0.1.0 2019-08-01 [1] CRAN (R 3.6.0)
## magrittr 1.5 2014-11-22 [1] CRAN (R 3.6.0)
## Matrix 1.2-18 2019-11-27 [1] CRAN (R 3.6.2)
## matrixStats 0.55.0 2019-09-07 [1] CRAN (R 3.6.0)
## memoise 1.1.0 2017-04-21 [1] CRAN (R 3.6.0)
## mime 0.9 2020-02-04 [1] CRAN (R 3.6.0)
## munsell 0.5.0 2018-06-12 [1] CRAN (R 3.6.0)
## nnet 7.3-12 2016-02-02 [1] CRAN (R 3.6.2)
## openssl 1.4.1 2019-07-18 [1] CRAN (R 3.6.0)
## pillar 1.4.3 2019-12-20 [1] CRAN (R 3.6.0)
## pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 3.6.1)
## png 0.1-7 2013-12-03 [1] CRAN (R 3.6.0)
## prettyunits 1.1.1 2020-01-24 [1] CRAN (R 3.6.2)
## progress 1.2.2 2019-05-16 [1] CRAN (R 3.6.0)
## promises 1.1.0 2019-10-04 [1] CRAN (R 3.6.0)
## ProtGenerics 1.18.0 2019-10-29 [1] Bioconductor
## purrr 0.3.3 2019-10-18 [1] CRAN (R 3.6.0)
## R.methodsS3 1.7.1 2016-02-16 [1] CRAN (R 3.6.0)
## R.oo 1.23.0 2019-11-03 [1] CRAN (R 3.6.0)
## R.utils 2.9.2 2019-12-08 [1] CRAN (R 3.6.1)
## R6 2.4.1 2019-11-12 [1] CRAN (R 3.6.1)
## rappdirs 0.3.1 2016-03-28 [1] CRAN (R 3.6.0)
## RColorBrewer 1.1-2 2014-12-07 [1] CRAN (R 3.6.0)
## Rcpp 1.0.3 2019-11-08 [1] CRAN (R 3.6.0)
## RCurl 1.98-1.1 2020-01-19 [1] CRAN (R 3.6.0)
## RCy3 2.6.3 2020-01-12 [1] Bioconductor
## regutools * 0.99.0 2020-02-29 [1] Github (comunidadbioinfo/regutools@0cb5b18)
## RJSONIO 1.3-1.4 2020-01-15 [1] CRAN (R 3.6.0)
## rlang 0.4.4 2020-01-28 [1] CRAN (R 3.6.0)
## rmarkdown 2.1 2020-01-20 [1] CRAN (R 3.6.0)
## rpart 4.1-15 2019-04-12 [1] CRAN (R 3.6.2)
## Rsamtools 2.2.1 2019-11-06 [1] Bioconductor
## RSQLite 2.2.0 2020-01-07 [1] CRAN (R 3.6.0)
## rstudioapi 0.11 2020-02-07 [1] CRAN (R 3.6.0)
## rtracklayer 1.46.0 2019-10-29 [1] Bioconductor
## S4Vectors 0.24.3 2020-01-18 [1] Bioconductor
## scales 1.1.0 2019-11-18 [1] CRAN (R 3.6.1)
## sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 3.6.0)
## shiny 1.4.0 2019-10-10 [1] CRAN (R 3.6.0)
## stringi 1.4.6 2020-02-17 [1] CRAN (R 3.6.0)
## stringr 1.4.0 2019-02-10 [1] CRAN (R 3.6.0)
## SummarizedExperiment 1.16.1 2019-12-19 [1] Bioconductor
## survival 3.1-8 2019-12-03 [1] CRAN (R 3.6.2)
## tibble 2.1.3 2019-06-06 [1] CRAN (R 3.6.0)
## tidyselect 1.0.0 2020-01-27 [1] CRAN (R 3.6.0)
## VariantAnnotation 1.32.0 2019-10-29 [1] Bioconductor
## vctrs 0.2.3 2020-02-20 [1] CRAN (R 3.6.0)
## withr 2.1.2 2018-03-15 [1] CRAN (R 3.6.0)
## xfun 0.12 2020-01-13 [1] CRAN (R 3.6.0)
## XML 3.99-0.3 2020-01-20 [1] CRAN (R 3.6.0)
## xtable 1.8-4 2019-04-21 [1] CRAN (R 3.6.0)
## XVector 0.26.0 2019-10-29 [1] Bioconductor
## yaml 2.2.1 2020-02-01 [1] CRAN (R 3.6.0)
## zlibbioc 1.32.0 2019-10-29 [1] Bioconductor
##
## [1] /Library/Frameworks/R.framework/Versions/3.6/Resources/library
[1] As always, thank you rOpenSci unconf 2018!
[2] Sponsors are listed on each year's workshop announcement. If you would like to sponsor our 2020 workshop, please get in touch with us. Thanks!!