From Bioconductor users to developers: our first community submission

A couple of years ago we started the Community of Bioinformatics Software Developers (CDSB in Spanish) because we were concerned about the very low representation of Latin Americans in the Bioconductor community, and in the R community in general. For the full story, check this blog post.

Since we started CDSB, part of our goal has been to help Bioconductor users transition into developers. To achieve this, we organized week-long courses in Mexico at low cost during the summers of 2018 and 2019 in partnership with the TIBs leadership (NNB-UNAM) and RMB. We plan to continue organizing these courses: expect a 2020 announcement next week!

While these workshops allow us to reach up to 40 students in person for a week, we have been borrowing methods from others to try to interact more frequently and help throughout this process [1]. But now we have gone a bit further.

In 2018, CDSB students worked on R/Bioconductor packages for two days during our workshop. One of them, Joselyn Chávez, picked up the regutools project and continued working on it with advice from Heladia Salgado. We eventually set up meetings where Alejandro Reyes and I would advise Joselyn, Carmina Barberena-Jonas and Jesus Emiliano Sotelo-Fonseca on how best to proceed.

We’ve known for months that the deadline for BioC2020 talk/poster proposals was March 3rd. So we designed a plan that would allow them to submit regutools prior to that deadline, then submit a proposal to present it (as well as submit proposals to present their own research projects) in order to increase their likelihood of getting a BioC2020 travel scholarship.

And we were able to accomplish this plan! Well, at least the part under their and our control. That is, check out the regutools Bioconductor submission. Thus we are incredibly excited to announce that the CDSB website now has a “Bioconductor Developers Alumni” section!

We are also thrilled to announce that the RegulonDB team has given us the go-ahead to write a manuscript about regutools. So you’ll soon see a pre-print about it.

Though I personally really hope that Joselyn, Carmina, Emiliano and many other CDSB alumni will be able to go to BioC2020 and other R conferences.

We couldn’t have gotten this far without all the support we’ve received over the years. So I would like to thank all our previous sponsors [2], colleagues who’ve encouraged us to keep going, and CDSB alumni like Joselyn, Carmina, and Emiliano who believed in our ideas and spent their own time making them a reality.

I leave you here with a short introduction to regutools.

regutools

The goal of regutools is to provide an R interface for extracting and processing data from RegulonDB. This package was created as a collaboration by members of the Community of Bioinformatics Software Developers (CDSB in Spanish).

For more details, please check the documentation website or the Bioconductor package landing page here.

Installation

You can install the released version of regutools from Bioconductor with:

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("regutools")

## Check that you have a valid Bioconductor installation
BiocManager::valid()

And the development version from GitHub with:

## Since the package is currently going through the Bioconductor
## revision process, you need to use this code instead:
BiocManager::install("ComunidadBioinfo/regutools")

Example

This is a basic example which shows you how to use regutools. For more details, please check the vignette.

library('regutools')
## basic example code

## Connect to the RegulonDB database if necessary
if(!exists('regulondb_conn')) regulondb_conn <- connect_database()
## snapshotDate(): 2019-10-29
## Build a regulondb object
e_coli_regulondb <-
    regulondb(
        database_conn = regulondb_conn,
        organism = "E.coli",
        database_version = "1",
        genome_version = "1"
    )

## Get the araC regulators
araC_regulation <-
    get_gene_regulators(
        e_coli_regulondb,
        genes = c("araC"),
        format = "multirow",
        output.type = "TF"
    )

## Summarize the araC regulation
get_regulatory_summary(e_coli_regulondb, araC_regulation)
## regulondb_result with 3 rows and 7 columns
##         TF Regulated_genes_per_TF          Percent Activator Repressor     Dual
##   <factor>               <factor>         <factor>  <factor>  <factor> <factor>
## 1     AraC                      1 33.3333333333333         0         0        1
## 2      CRP                      1 33.3333333333333         1         0        0
## 3     XylR                      1 33.3333333333333         0         1        0
##   Regulated_genes
##          <factor>
## 1            araC
## 2            araC
## 3            araC
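
To make the columns of that summary easier to read, here is a minimal base R sketch that rebuilds a similar table from a toy data frame. It is not how regutools implements get_regulatory_summary(): the helper name summarize_regulation(), the column names of the toy input, and the "+", "-", "+/-" effect encoding are all assumptions made for illustration. Percent is computed as each TF's share of the rows in the regulator table, which is what the output above suggests.

```r
## Toy regulator table mimicking the araC result above.
## Hypothetical data typed by hand, not fetched from RegulonDB.
regulation <- data.frame(
    genes = c("araC", "araC", "araC"),
    regulators = c("AraC", "CRP", "XylR"),
    effect = c("+/-", "+", "-"), # assumed encoding: activator, repressor, dual
    stringsAsFactors = FALSE
)

## Hypothetical helper (not part of regutools): one row per TF with the
## number of regulated genes, its share of all rows, and counts by effect.
summarize_regulation <- function(x) {
    tfs <- sort(unique(x$regulators))
    rows <- lapply(tfs, function(tf) {
        tf_rows <- x[x$regulators == tf, ]
        data.frame(
            TF = tf,
            Regulated_genes_per_TF = nrow(tf_rows),
            Percent = 100 * nrow(tf_rows) / nrow(x),
            Activator = sum(tf_rows$effect == "+"),
            Repressor = sum(tf_rows$effect == "-"),
            Dual = sum(tf_rows$effect == "+/-"),
            Regulated_genes = paste(unique(tf_rows$genes), collapse = ", "),
            stringsAsFactors = FALSE
        )
    })
    do.call(rbind, rows)
}

toy_summary <- summarize_regulation(regulation)
print(toy_summary)
```

Each of the three regulators accounts for one of the three rows, hence the 33.3% in the Percent column.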

Citation

Below is the citation output from using citation('regutools') in R. Please run this yourself to check for any updates on how to cite regutools.

citation('regutools')
## 
## Chávez J, Barberena-Jonas C, Sotelo-Fonseca JE, Alquicira-Hernandez J,
## Salgado H, Collado-Torres L, Reyes A (2020). _regutools: an R package
## for data extraction from RegulonDB_. doi: 10.18129/B9.bioc.regutools
## (URL: https://doi.org/10.18129/B9.bioc.regutools),
## https://github.com/comunidadbioinfo/regutools - R package version
## 0.99.0, <URL: http://www.bioconductor.org/packages/regutools>.
## 
## Chávez J, Barberena-Jonas C, Sotelo-Fonseca JE, Alquicira-Hernandez J,
## Salgado H, Collado-Torres L, Reyes A (2020). "Programmatic access to
## bacterial regulatory networks with regutools." _bioRxiv_. doi:
## 10.1101/xxxyyy (URL: https://doi.org/10.1101/xxxyyy), <URL:
## https://doi.org/10.1101/xxxyyy>.
## 
## To see these entries in BibTeX format, use 'print(<citation>,
## bibtex=TRUE)', 'toBibtex(.)', or set
## 'options(citation.bibtex.max=999)'.
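
As the note at the end of that output says, the entries can also be exported as BibTeX. The sketch below uses only citation() and toBibtex() from the utils package that ships with R. Falling back to the "base" citation when regutools is not installed, and writing to a temporary file, are assumptions made so the snippet runs anywhere.

```r
## Prefer regutools when it is installed; otherwise fall back to base R
## so the snippet still runs (assumption made for illustration).
pkg <- if (requireNamespace("regutools", quietly = TRUE)) "regutools" else "base"
cit <- citation(pkg)

## Convert the citation entries to BibTeX
bib <- toBibtex(cit)

## Save them to a .bib file; the temporary path is a placeholder,
## replace it with the path to your own bibliography file.
bib_file <- tempfile(fileext = ".bib")
writeLines(as.character(bib), bib_file)

print(bib)
```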

Reproducibility information

options(width = 120)
sessioninfo::session_info()
## ─ Session info ─────────────────────────────────────────────────────────────────────────────────────────────────────────
##  setting  value                       
##  version  R version 3.6.2 (2019-12-12)
##  os       macOS Catalina 10.15.2      
##  system   x86_64, darwin15.6.0        
##  ui       X11                         
##  language (EN)                        
##  collate  en_US.UTF-8                 
##  ctype    en_US.UTF-8                 
##  tz       America/New_York            
##  date     2020-02-29                  
## 
## ─ Packages ─────────────────────────────────────────────────────────────────────────────────────────────────────────────
##  package                * version  date       lib source                                     
##  acepack                  1.4.1    2016-10-29 [1] CRAN (R 3.6.0)                             
##  AnnotationDbi            1.48.0   2019-10-29 [1] Bioconductor                               
##  AnnotationFilter         1.10.0   2019-10-29 [1] Bioconductor                               
##  AnnotationHub            2.18.0   2019-10-29 [1] Bioconductor                               
##  askpass                  1.1      2019-01-13 [1] CRAN (R 3.6.0)                             
##  assertthat               0.2.1    2019-03-21 [1] CRAN (R 3.6.0)                             
##  backports                1.1.5    2019-10-02 [1] CRAN (R 3.6.0)                             
##  base64enc                0.1-3    2015-07-28 [1] CRAN (R 3.6.0)                             
##  Biobase                  2.46.0   2019-10-29 [1] Bioconductor                               
##  BiocFileCache            1.10.2   2019-11-08 [1] Bioconductor                               
##  BiocGenerics             0.32.0   2019-10-29 [1] Bioconductor                               
##  BiocManager              1.30.10  2019-11-16 [1] CRAN (R 3.6.1)                             
##  BiocParallel             1.20.1   2019-12-21 [1] Bioconductor                               
##  BiocVersion              3.10.1   2019-06-06 [1] Bioconductor                               
##  biomaRt                  2.42.0   2019-10-29 [1] Bioconductor                               
##  Biostrings               2.54.0   2019-10-29 [1] Bioconductor                               
##  biovizBase               1.34.1   2019-12-04 [1] Bioconductor                               
##  bit                      1.1-15.2 2020-02-10 [1] CRAN (R 3.6.0)                             
##  bit64                    0.9-7    2017-05-08 [1] CRAN (R 3.6.0)                             
##  bitops                   1.0-6    2013-08-17 [1] CRAN (R 3.6.0)                             
##  blob                     1.2.1    2020-01-20 [1] CRAN (R 3.6.0)                             
##  blogdown                 0.17     2019-11-13 [1] CRAN (R 3.6.1)                             
##  bookdown                 0.17     2020-01-11 [1] CRAN (R 3.6.0)                             
##  BSgenome                 1.54.0   2019-10-29 [1] Bioconductor                               
##  checkmate                1.9.4    2019-07-04 [1] CRAN (R 3.6.0)                             
##  cli                      2.0.1    2020-01-08 [1] CRAN (R 3.6.0)                             
##  cluster                  2.1.0    2019-06-19 [1] CRAN (R 3.6.2)                             
##  colorout               * 1.2-1    2019-05-07 [1] Github (jalvesaq/colorout@7ea9440)         
##  colorspace               1.4-1    2019-03-18 [1] CRAN (R 3.6.0)                             
##  crayon                   1.3.4    2017-09-16 [1] CRAN (R 3.6.0)                             
##  curl                     4.3      2019-12-02 [1] CRAN (R 3.6.0)                             
##  data.table               1.12.8   2019-12-09 [1] CRAN (R 3.6.1)                             
##  DBI                      1.1.0    2019-12-15 [1] CRAN (R 3.6.0)                             
##  dbplyr                   1.4.2    2019-06-17 [1] CRAN (R 3.6.0)                             
##  DelayedArray             0.12.2   2020-01-06 [1] Bioconductor                               
##  dichromat                2.0-0    2013-01-24 [1] CRAN (R 3.6.0)                             
##  digest                   0.6.25   2020-02-23 [1] CRAN (R 3.6.0)                             
##  dplyr                    0.8.4    2020-01-31 [1] CRAN (R 3.6.0)                             
##  ensembldb                2.10.2   2019-11-20 [1] Bioconductor                               
##  evaluate                 0.14     2019-05-28 [1] CRAN (R 3.6.0)                             
##  fansi                    0.4.1    2020-01-08 [1] CRAN (R 3.6.0)                             
##  fastmap                  1.0.1    2019-10-08 [1] CRAN (R 3.6.0)                             
##  foreign                  0.8-75   2020-01-20 [1] CRAN (R 3.6.0)                             
##  Formula                  1.2-3    2018-05-03 [1] CRAN (R 3.6.0)                             
##  GenomeInfoDb             1.22.0   2019-10-29 [1] Bioconductor                               
##  GenomeInfoDbData         1.2.2    2019-10-31 [1] Bioconductor                               
##  GenomicAlignments        1.22.1   2019-11-12 [1] Bioconductor                               
##  GenomicFeatures          1.38.1   2020-01-22 [1] Bioconductor                               
##  GenomicRanges            1.38.0   2019-10-29 [1] Bioconductor                               
##  ggplot2                  3.2.1    2019-08-10 [1] CRAN (R 3.6.0)                             
##  glue                     1.3.1    2019-03-12 [1] CRAN (R 3.6.0)                             
##  graph                    1.64.0   2019-10-29 [1] Bioconductor                               
##  gridExtra                2.3      2017-09-09 [1] CRAN (R 3.6.0)                             
##  gtable                   0.3.0    2019-03-25 [1] CRAN (R 3.6.0)                             
##  Gviz                     1.30.1   2020-01-23 [1] Bioconductor                               
##  Hmisc                    4.3-0    2019-11-07 [1] CRAN (R 3.6.0)                             
##  hms                      0.5.3    2020-01-08 [1] CRAN (R 3.6.0)                             
##  htmlTable                1.13.3   2019-12-04 [1] CRAN (R 3.6.1)                             
##  htmltools                0.4.0    2019-10-04 [1] CRAN (R 3.6.0)                             
##  htmlwidgets              1.5.1    2019-10-08 [1] CRAN (R 3.6.0)                             
##  httpuv                   1.5.2    2019-09-11 [1] CRAN (R 3.6.0)                             
##  httr                     1.4.1    2019-08-05 [1] CRAN (R 3.6.0)                             
##  igraph                   1.2.4.2  2019-11-27 [1] CRAN (R 3.6.0)                             
##  interactiveDisplayBase   1.24.0   2019-10-29 [1] Bioconductor                               
##  IRanges                  2.20.2   2020-01-13 [1] Bioconductor                               
##  jpeg                     0.1-8.1  2019-10-24 [1] CRAN (R 3.6.1)                             
##  knitr                    1.27     2020-01-16 [1] CRAN (R 3.6.0)                             
##  later                    1.0.0    2019-10-04 [1] CRAN (R 3.6.0)                             
##  lattice                  0.20-38  2018-11-04 [1] CRAN (R 3.6.2)                             
##  latticeExtra             0.6-29   2019-12-19 [1] CRAN (R 3.6.0)                             
##  lazyeval                 0.2.2    2019-03-15 [1] CRAN (R 3.6.0)                             
##  lifecycle                0.1.0    2019-08-01 [1] CRAN (R 3.6.0)                             
##  magrittr                 1.5      2014-11-22 [1] CRAN (R 3.6.0)                             
##  Matrix                   1.2-18   2019-11-27 [1] CRAN (R 3.6.2)                             
##  matrixStats              0.55.0   2019-09-07 [1] CRAN (R 3.6.0)                             
##  memoise                  1.1.0    2017-04-21 [1] CRAN (R 3.6.0)                             
##  mime                     0.9      2020-02-04 [1] CRAN (R 3.6.0)                             
##  munsell                  0.5.0    2018-06-12 [1] CRAN (R 3.6.0)                             
##  nnet                     7.3-12   2016-02-02 [1] CRAN (R 3.6.2)                             
##  openssl                  1.4.1    2019-07-18 [1] CRAN (R 3.6.0)                             
##  pillar                   1.4.3    2019-12-20 [1] CRAN (R 3.6.0)                             
##  pkgconfig                2.0.3    2019-09-22 [1] CRAN (R 3.6.1)                             
##  png                      0.1-7    2013-12-03 [1] CRAN (R 3.6.0)                             
##  prettyunits              1.1.1    2020-01-24 [1] CRAN (R 3.6.2)                             
##  progress                 1.2.2    2019-05-16 [1] CRAN (R 3.6.0)                             
##  promises                 1.1.0    2019-10-04 [1] CRAN (R 3.6.0)                             
##  ProtGenerics             1.18.0   2019-10-29 [1] Bioconductor                               
##  purrr                    0.3.3    2019-10-18 [1] CRAN (R 3.6.0)                             
##  R.methodsS3              1.7.1    2016-02-16 [1] CRAN (R 3.6.0)                             
##  R.oo                     1.23.0   2019-11-03 [1] CRAN (R 3.6.0)                             
##  R.utils                  2.9.2    2019-12-08 [1] CRAN (R 3.6.1)                             
##  R6                       2.4.1    2019-11-12 [1] CRAN (R 3.6.1)                             
##  rappdirs                 0.3.1    2016-03-28 [1] CRAN (R 3.6.0)                             
##  RColorBrewer             1.1-2    2014-12-07 [1] CRAN (R 3.6.0)                             
##  Rcpp                     1.0.3    2019-11-08 [1] CRAN (R 3.6.0)                             
##  RCurl                    1.98-1.1 2020-01-19 [1] CRAN (R 3.6.0)                             
##  RCy3                     2.6.3    2020-01-12 [1] Bioconductor                               
##  regutools              * 0.99.0   2020-02-29 [1] Github (comunidadbioinfo/regutools@0cb5b18)
##  RJSONIO                  1.3-1.4  2020-01-15 [1] CRAN (R 3.6.0)                             
##  rlang                    0.4.4    2020-01-28 [1] CRAN (R 3.6.0)                             
##  rmarkdown                2.1      2020-01-20 [1] CRAN (R 3.6.0)                             
##  rpart                    4.1-15   2019-04-12 [1] CRAN (R 3.6.2)                             
##  Rsamtools                2.2.1    2019-11-06 [1] Bioconductor                               
##  RSQLite                  2.2.0    2020-01-07 [1] CRAN (R 3.6.0)                             
##  rstudioapi               0.11     2020-02-07 [1] CRAN (R 3.6.0)                             
##  rtracklayer              1.46.0   2019-10-29 [1] Bioconductor                               
##  S4Vectors                0.24.3   2020-01-18 [1] Bioconductor                               
##  scales                   1.1.0    2019-11-18 [1] CRAN (R 3.6.1)                             
##  sessioninfo              1.1.1    2018-11-05 [1] CRAN (R 3.6.0)                             
##  shiny                    1.4.0    2019-10-10 [1] CRAN (R 3.6.0)                             
##  stringi                  1.4.6    2020-02-17 [1] CRAN (R 3.6.0)                             
##  stringr                  1.4.0    2019-02-10 [1] CRAN (R 3.6.0)                             
##  SummarizedExperiment     1.16.1   2019-12-19 [1] Bioconductor                               
##  survival                 3.1-8    2019-12-03 [1] CRAN (R 3.6.2)                             
##  tibble                   2.1.3    2019-06-06 [1] CRAN (R 3.6.0)                             
##  tidyselect               1.0.0    2020-01-27 [1] CRAN (R 3.6.0)                             
##  VariantAnnotation        1.32.0   2019-10-29 [1] Bioconductor                               
##  vctrs                    0.2.3    2020-02-20 [1] CRAN (R 3.6.0)                             
##  withr                    2.1.2    2018-03-15 [1] CRAN (R 3.6.0)                             
##  xfun                     0.12     2020-01-13 [1] CRAN (R 3.6.0)                             
##  XML                      3.99-0.3 2020-01-20 [1] CRAN (R 3.6.0)                             
##  xtable                   1.8-4    2019-04-21 [1] CRAN (R 3.6.0)                             
##  XVector                  0.26.0   2019-10-29 [1] Bioconductor                               
##  yaml                     2.2.1    2020-02-01 [1] CRAN (R 3.6.0)                             
##  zlibbioc                 1.32.0   2019-10-29 [1] Bioconductor                               
## 
## [1] /Library/Frameworks/R.framework/Versions/3.6/Resources/library

  1. As always, thank you rOpenSci unconf 2018!

  2. Sponsors are listed on each year’s workshop announcement. If you would like to sponsor our 2020 workshop, please get in touch with us. Thanks!!

Leonardo Collado-Torres, PhD
Research Scientist

Brain genomics #rstats coder working w/ @andrewejaffe @LieberInstitute. @lcgunam @jhubiostat @jtleek alumni. @LIBDrstats @CDSBMexico co-founder.
