1 Introduction

The tRNA package provides access to tRNA feature information for subsetting and visualization. Visualization functions are implemented to compare feature parameters of multiple tRNA sets and to correlate them to additional data.

As input, the package expects a GRanges object with certain metadata columns. The following columns are required: tRNA_length, tRNA_type, tRNA_anticodon, tRNA_seq, tRNA_str, tRNA_CCA.end. The tRNA_str column must contain a valid dot-bracket annotation. For more details, see the vignette of the Structstrings package.
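To illustrate what a valid dot-bracket annotation means, the following minimal sketch checks that a string uses only the characters (, ) and . and that every opening bracket is closed again. The helper is_balanced_dot_bracket is hypothetical and not part of the tRNA or Structstrings packages. It only covers round brackets, so the Structstrings vignette remains the reference for what is actually accepted.

```r
# Hypothetical helper, not part of the tRNA or Structstrings API.
# Assumption: only round brackets and dots are considered here.
is_balanced_dot_bracket <- function(x) {
  chars <- strsplit(x, "", fixed = TRUE)[[1L]]
  if (!all(chars %in% c("(", ")", "."))) {
    return(FALSE)
  }
  # +1 for every opening bracket, -1 for every closing bracket
  depth <- cumsum((chars == "(") - (chars == ")"))
  # never close more than was opened, and end with everything closed
  length(depth) == 0L || (all(depth >= 0L) && depth[length(depth)] == 0L)
}

is_balanced_dot_bracket("((..))")  # TRUE
is_balanced_dot_bracket("((..)")   # FALSE, one bracket is left open
```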

2 Loading tRNA information

To work with the tRNA package, tRNA information can be retrieved or loaded into an R session in a number of ways:

  1. A GRanges object containing the required columns mentioned above can be constructed manually.
  2. A tRNAscan result file can be loaded using the function import.tRNAscanAsGRanges() from the tRNAscanImport package.
  3. Selected tRNA information can be retrieved using the function import.tRNAdb() from the tRNAdbImport package.
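When a GRanges object is constructed manually, it can help to confirm that no required column is missing before passing the object on. The sketch below compares a set of column names against the required columns listed in the introduction. The helper missing_trna_columns is hypothetical and not part of the tRNA package. It works on a plain character vector so that it runs without Bioconductor, and it checks names only, not the class of each column.

```r
# Column names listed as required in the introduction.
required_cols <- c("tRNA_length", "tRNA_type", "tRNA_anticodon",
                   "tRNA_seq", "tRNA_str", "tRNA_CCA.end")

# Hypothetical helper (not part of the tRNA package): returns the
# required column names that are absent from `col_names`.
missing_trna_columns <- function(col_names) {
  setdiff(required_cols, col_names)
}

# For a GRanges object `x`, the names would come from
# colnames(S4Vectors::mcols(x)); a character vector stands in here.
missing_trna_columns(c("tRNA_length", "tRNA_type", "tRNA_seq"))
# "tRNA_anticodon" "tRNA_str" "tRNA_CCA.end"
```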

For the examples in this vignette a number of predefined GRanges objects are loaded.

library(tRNA)
library(Structstrings)
data("gr", package = "tRNA")

3 tRNA sequences and structures

To retrieve the coordinates or sequences of individual tRNA structure elements, the functions gettRNAstructureGRanges or gettRNAstructureSeqs can be used. Several optional arguments can be used to modify the result (see ?gettRNAstructureSeqs).

# just get the coordinates of the anticodon loop
gettRNAstructureGRanges(gr, structure = "anticodonLoop")
## $anticodonLoop
## IRanges object with 299 ranges and 0 metadata columns:
##           start       end     width
##       <integer> <integer> <integer>
##   TGG        31        37         7
##   TGC        32        38         7
##   CAA        31        37         7
##   AGA        31        37         7
##   TAA        31        37         7
##   ...       ...       ...       ...
##   CAT        32        38         7
##   GAA        31        37         7
##   TTA        31        37         7
##   TAC        32        38         7
##   CAT        32        38         7
gettRNAstructureSeqs(gr, joinFeatures = TRUE, structure = "anticodonLoop")
## $anticodonLoop
## RNAStringSet object of length 299:
##       width seq                                             names               
##   [1]     7 UUUGGGU                                         TGG
##   [2]     7 CUUGCAA                                         TGC
##   [3]     7 UUCAAGC                                         CAA
##   [4]     7 UUAGAAA                                         AGA
##   [5]     7 CUUAAGA                                         TAA
##   ...   ... ...
## [295]     7 CUCAUAA                                         CAT
## [296]     7 UUGAAGA                                         GAA
## [297]     7 UUUUAGU                                         TTA
## [298]     7 UUUACAC                                         TAC
## [299]     7 GUCAUGA                                         CAT

In addition, the sequences can be returned completely joined, with each structure element padded by gap characters (-) so that all sequences have the same width. The boundaries of the individual structures are returned as metadata of the RNAStringSet object.

seqs <- gettRNAstructureSeqs(gr[1L:10L], joinCompletely = TRUE)
seqs
## RNAStringSet object of length 10:
##      width seq
##  [1]    85 GGGCGUGUGGUC-UAGU-GGUAU-GAUUCUCGC...------GCCUGGGUUCAAUUCCCAGCUCGCCCC
##  [2]    85 GGGCACAUGGCGCAGUU-GGU-AGCGCGCUUCC...------GCAUCGGUUCGAUUCCGGUUGCGUCCA
##  [3]    85 GGUUGUUUGGCC-GAGC-GGUAA-GGCGCCUGA...AA-GAUGCAAGAGUUCGAAUCUCUUAGCAACCA
##  [4]    85 GGCAACUUGGCC-GAGU-GGUAA-GGCGAAAGA...U-GCCCGCGCAGGUUCGAGUCCUGCAGUUGUCG
##  [5]    85 GGAGGGUUGGCC-GAGU-GGUAA-GGCGGCAGA...UUGUCCGCGCGAGUUCGAACCUCGCAUCCUUCA
##  [6]    85 GCGGAUUUAGCUCAGUU-GGG-AGAGCGCCAGA...------GCCUGUGUUCGAUCCACAGAAUUCGCA
##  [7]    85 GGUCUCUUGGCC-CAGUUGGUAA-GGCACCGUG...------ACAGCGGUUCGAUCCCGCUAGAGACCA
##  [8]    85 GCGCAAGUGGUUUAGU--GGU-AAAAUCCAACG...-------CCCCGGUUCGAUUCCGGGCUUGCGCA
##  [9]    85 GGCAACUUGGCC-GAGU-GGUAA-GGCGAAAGA...U-GCCCGCGCAGGUUCGAGUCCUGCAGUUGUCG
## [10]    85 GCUUCUAUGGCC-AAGUUGGUAA-GGCGCCACA...------ACAUCGGUUCAAAUCCGAUUGGAAGCA
# getting the tRNA structure boundaries
metadata(seqs)[["tRNA_structures"]]
## IRanges object with 15 ranges and 0 metadata columns:
##                           start       end     width
##                       <integer> <integer> <integer>
##   acceptorStem.prime5         1         7         7
##               Dprime5         8         9         2
##          DStem.prime5        10        13         4
##                 Dloop        14        23        10
##          DStem.prime3        24        27         4
##                   ...       ...       ...       ...
##          TStem.prime5        61        65         5
##                 Tloop        66        72         7
##          TStem.prime3        73        77         5
##   acceptorStem.prime3        78        84         7
##         discriminator        85        85         1
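Because every joined sequence has the same width, the boundaries apply to all sequences at once. The following sketch shows the idea in base R, using only the visible prefix of the first two sequences and the first five boundaries printed above. The helper extract_structure is hypothetical. With the actual RNAStringSet and IRanges objects, the same slicing would presumably be done with subseq() instead of substr().

```r
# First 33 characters of the first two padded sequences printed above
# (the printed output is truncated, so only the visible prefix is used).
padded <- c("GGGCGUGUGGUC-UAGU-GGUAU-GAUUCUCGC",
            "GGGCACAUGGCGCAGUU-GGU-AGCGCGCUUCC")

# First five structure boundaries as printed above.
bounds <- data.frame(
  start = c(1L, 8L, 10L, 14L, 24L),
  end   = c(7L, 9L, 13L, 23L, 27L),
  row.names = c("acceptorStem.prime5", "Dprime5", "DStem.prime5",
                "Dloop", "DStem.prime3")
)

# Hypothetical helper: cut one structure element out of every
# padded sequence using the shared boundaries.
extract_structure <- function(x, bounds, name) {
  substr(x, bounds[name, "start"], bounds[name, "end"])
}

dloop <- extract_structure(padded, bounds, "Dloop")
dloop
# "UAGU-GGUAU" "AGUU-GGU-A"

# dropping the padding gives the unaligned element
gsub("-", "", dloop, fixed = TRUE)
# "UAGUGGUAU" "AGUUGGUA"
```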

Be aware that gettRNAstructureGRanges and gettRNAstructureSeqs might not work as expected if the tRNA sequences in question are armless or deviate drastically from the canonical tRNA model. The functions in the tRNA package were thoroughly tested using human mitochondrial tRNAs and other tRNAs missing certain features. However, results may differ for fringe cases. If you encounter such a case, please report it with an example.

4 Subsetting tRNA sequences

Structure information of the tRNA can be queried for subsetting using several functions. For the following examples, the functions hasAcceptorStem and hasDloop are used.

gr[hasAcceptorStem(gr, unpaired = TRUE)]
# mismatches and bulged are subsets of unpaired
gr[hasAcceptorStem(gr, mismatches = TRUE)]
gr[hasAcceptorStem(gr, bulged = TRUE)]
# combination of different structure parameters
gr[hasAcceptorStem(gr, mismatches = TRUE) & 
     hasDloop(gr, length = 8L)]

Please have a look at the man page ?hasAcceptorStem for all available subsetting functions.

5 Visualization

To get an overview of tRNA features and compare different datasets, the function gettRNAFeaturePlots is used. It accepts a named GRangesList as input. Internally, it calculates a list of feature values based on the functions mentioned above and the data contained in the mcols of the GRanges objects.

# load tRNA data for E. coli and H. sapiens
data("gr_eco", package = "tRNA")
data("gr_human", package = "tRNA")

# get summary plots
grl <- GRangesList(Sce = gr,
                   Hsa = gr_human,
                   Eco = gr_eco)
plots <- gettRNAFeaturePlots(grl)
plots$length