Retrieve the binding sites of a given transcription factor, together with their genomic locations.

get_binding_sites(regulondb, transcription_factor, output_format = "GRanges")

Arguments

regulondb

A regulondb() object.

transcription_factor

Name of the transcription factor, given as a character string (for example, "AraC").

output_format

The class of the output object. Either "GRanges" (default) or "Biostrings".

Value

Either a GRanges object or a Biostrings object, depending on output_format, summarizing information about the binding sites of the transcription factor.

Author

José Alquicira Hernández, Jacques van Helden, Joselyn Chávez

Examples

## Connect to the RegulonDB database if necessary
if (!exists("regulondb_conn")) regulondb_conn <- connect_database()
#> snapshotDate(): 2021-10-20

## Build the regulondb object
e_coli_regulondb <-
    regulondb(
        database_conn = regulondb_conn,
        organism = "E.coli",
        database_version = "1",
        genome_version = "1"
    )

## Get the binding sites for AraC
get_binding_sites(e_coli_regulondb, transcription_factor = "AraC")
#> GRanges object with 15 ranges and 1 metadata column:
#>                          seqnames          ranges strand |
#>                             <Rle>       <IRanges>  <Rle> |
#>   ECK120015742-araB-araC   E.coli     70110-70126      + |
#>   ECK120012328-araB-araC   E.coli     70131-70147      + |
#>   ECK120012320-araB-araC   E.coli     70184-70200      - |
#>   ECK120012323-araB-araC   E.coli     70205-70221      - |
#>   ECK120012603-araB-araC   E.coli     70342-70358      - |
#>                      ...      ...             ...    ... .
#>        ECK120012333-araF   E.coli 1986396-1986412      - |
#>        ECK120012915-araE   E.coli 2982244-2982260      - |
#>        ECK120012913-araE   E.coli 2982265-2982281      - |
#>        ECK125108641-xylA   E.coli 3730824-3730840      - |
#>        ECK125108643-xylA   E.coli 3730847-3730863      - |
#>                                        sequence
#>                                     <character>
#>   ECK120015742-araB-araC ataaaaagcgTCAGGTAGGA..
#>   ECK120012328-araB-araC ccgctaatctTATGGATAAA..
#>   ECK120012320-araB-araC tctataatcaCGGCAGAAAA..
#>   ECK120012323-araB-araC caaaaacgcgTAACAAAAGT..
#>   ECK120012603-araB-araC attcagagaaGAAACCAATT..
#>                      ...                    ...
#>        ECK120012333-araF ccaaagacaaCAAGGATTTC..
#>        ECK120012915-araE tccatatttaTGCTGTTTCC..
#>        ECK120012913-araE cgacatgtcgCAGCAATTTA..
#>        ECK125108641-xylA taacataattGAGCAACTGA..
#>        ECK125108643-xylA attatctcaaTAGCAGTGTG..
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths
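
## A minimal sketch of two follow-up steps, reusing the objects created above.
## The variable names `arac_sites` and `arac_seqs` are illustrative only.
## Output is not shown because it depends on the RegulonDB snapshot in use.

## Store the GRanges result and read its `sequence` metadata column
arac_sites <- get_binding_sites(e_coli_regulondb, transcription_factor = "AraC")
head(arac_sites$sequence)

## Request the Biostrings output instead of the default GRanges
## (assumes the value is passed as a string, like the "GRanges" default)
arac_seqs <- get_binding_sites(
    e_coli_regulondb,
    transcription_factor = "AraC",
    output_format = "Biostrings"
)
arac_seqs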