raerdata 1.2.0
The raerdata package contains datasets and databases used to illustrate
functionality to characterize RNA editing using the raer package. Included in
the package are databases of known human and mouse RNA editing sites. Datasets
have been preprocessed to generate smaller examples suitable for quick
exploration of the data and demonstration of the raer package.
if (!require("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
}
# The following initializes usage of Bioc devel
BiocManager::install(version = "devel")
BiocManager::install("raerdata")
library(raerdata)
Atlases of known human and mouse A-to-I RNA editing sites formatted into
GRanges objects are provided.
The REDIportal is a collection of RNA editing sites identified from multiple
studies in multiple species (Picardi et al. 2017). The human (hg38) and
mouse (mm10) collections are provided as GRanges objects, either in
coordinate-only format or with additional metadata.
rediportal_coords_hg38()
## GRanges object with 15638648 ranges and 0 metadata columns:
## seqnames ranges strand
## <Rle> <IRanges> <Rle>
## [1] chr1 87158 -
## [2] chr1 87168 -
## [3] chr1 87171 -
## [4] chr1 87189 -
## [5] chr1 87218 -
## ... ... ... ...
## [15638644] chrY 56885715 +
## [15638645] chrY 56885716 +
## [15638646] chrY 56885728 +
## [15638647] chrY 56885841 +
## [15638648] chrY 56885850 +
## -------
## seqinfo: 44 sequences from hg38 genome; no seqlengths
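Because the atlas is a GRanges object, it can be subset with standard GenomicRanges operations. The following is a minimal sketch rather than part of raerdata: it uses a three-site toy object (coordinates copied from the printed output above) in place of the full atlas, and the `region` interval is a made-up example.

```r
library(GenomicRanges)

# Toy stand-in for the atlas; coordinates copied from the output shown above
sites <- GRanges(
    seqnames = c("chr1", "chr1", "chrY"),
    ranges = IRanges(start = c(87158, 87168, 56885715), width = 1),
    strand = c("-", "-", "+")
)

# Hypothetical region of interest (not taken from the package)
region <- GRanges("chr1", IRanges(start = 87000, end = 87165))

# Keep sites that fall inside the region, regardless of strand
region_sites <- subsetByOverlaps(sites, region, ignore.strand = TRUE)

# Keep all sites on a single chromosome
chr1_sites <- sites[seqnames(sites) == "chr1"]
```

The same calls should apply unchanged to the object returned by rediportal_coords_hg38(), although operations on the full atlas will take longer.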
Human CDS recoding RNA editing sites identified by Gabay et al. (2022) were
formatted into GRanges objects. These sites were also lifted over to the
mouse genome (mm10).
cds_sites <- gabay_sites_hg38()
cds_sites[1:4, 1:4]
## GRanges object with 4 ranges and 4 metadata columns:
## seqnames ranges strand | GeneName
## <Rle> <IRanges> <Rle> | <character>
## [1] chr1 999279 - | HES4
## [2] chr1 1014084 + | ISG15
## [3] chr1 1281229 + | SCNN1D
## [4] chr1 1281248 + | SCNN1D
## RefseqAccession_1,ExonNum_1,NucleotideSubstitution_1,AminoAcidSubstitution_1;…;RefseqAccession_N,ExonNum_N,NucleotideSubstitution_N,AminoAcidSubstitution_N
## <character>
## [1] NM_001142467.1,exon3..
## [2] NM_005101.3,exon2,c...
## [3] NM_001130413.3,exon2..
## [4] NM_001130413.3,exon2..
## Syn/NonSyn Diversifying/Restorative/Syn
## <character> <character>
## [1] nonsynonymous NA
## [2] nonsynonymous NA
## [3] synonymous NA
## [4] nonsynonymous NA
## -------
## seqinfo: 23 sequences from hg38 genome; no seqlengths
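Some metadata column names in this object, such as `Syn/NonSyn`, are not syntactic R names, so they are easier to reach with `[[` on `mcols()` than with `$`. The sketch below illustrates this on a toy object rebuilt from the four rows printed above; `cds_toy` and the other variable names are illustrative only.

```r
library(GenomicRanges)

# Toy object rebuilt from the four rows printed above
cds_toy <- GRanges(
    seqnames = rep("chr1", 4),
    ranges = IRanges(start = c(999279, 1014084, 1281229, 1281248), width = 1),
    strand = c("-", "+", "+", "+")
)
mcols(cds_toy) <- DataFrame(
    GeneName = c("HES4", "ISG15", "SCNN1D", "SCNN1D"),
    `Syn/NonSyn` = c("nonsynonymous", "nonsynonymous",
                     "synonymous", "nonsynonymous"),
    check.names = FALSE  # keep the non-syntactic column name as-is
)

# Select recoding (nonsynonymous) sites via the non-syntactic column
is_nonsyn <- mcols(cds_toy)[["Syn/NonSyn"]] == "nonsynonymous"
nonsyn_sites <- cds_toy[is_nonsyn]

# Genes with at least one nonsynonymous site
genes <- unique(nonsyn_sites$GeneName)
```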
WGS and RNA-seq BAM files and associated files, generated from a subset of chromosome 4, are provided. Paths to the files and related data objects are returned in a list.
NA12878()
## $bams
## BamFileList of length 2
## names(2): NA12878_RNASEQ NA12878_WGS
##
## $fasta
## [1] "/home/biocbuild/.cache/R/ExperimentHub/2728a97176a5a_8469"
##
## $snps
## GRanges object with 380175 ranges and 2 metadata columns:
## seqnames ranges strand | name score
## <Rle> <IRanges> <Rle> | <character> <numeric>
## [1] chr4 10001 * | rs1581341342 0
## [2] chr4 10002 * | rs1581341346 0
## [3] chr4 10004 * | rs1581341351 0
## [4] chr4 10005 * | rs1581341354 0
## [5] chr4 10006 * | rs1209159313 0
## ... ... ... ... . ... ...
## [380171] chr4 999987 * | rs1577536513 0
## [380172] chr4 999989 * | rs948695434 0
## [380173] chr4 999991 * | rs1044698628 0
## [380174] chr4 999996 * | rs1361920394 0
## [380175] chr4 999997 * | rs59206677 0
## -------
## seqinfo: 711 sequences (1 circular) from hg38 genome
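One plausible use of the `$snps` element, assumed here rather than stated by the package, is to discard candidate editing sites that coincide with known genomic variants. The sketch below shows the idea on toy data: the SNP positions are copied from the output above, while the `candidates` object is hypothetical.

```r
library(GenomicRanges)

# Toy SNP set; positions and names copied from the output shown above
snps <- GRanges(
    seqnames = rep("chr4", 3),
    ranges = IRanges(start = c(10001, 10002, 10004), width = 1),
    name = c("rs1581341342", "rs1581341346", "rs1581341351")
)

# Hypothetical candidate editing sites (not produced by raerdata)
candidates <- GRanges(
    seqnames = rep("chr4", 2),
    ranges = IRanges(start = c(10002, 10010), width = 1),
    strand = c("+", "+")
)

# invert = TRUE keeps only the candidates that do NOT overlap a SNP
filtered <- subsetByOverlaps(candidates, snps, invert = TRUE)
```

Because the SNP ranges are unstranded (`*`), they match candidates on either strand.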
RNA-seq BAM files from ADAR1KO and wild-type HEK293 cells, and associated reference files from chromosome 18, are provided (Chung et al. 2018).
GSE99249()
## $bams
## BamFileList of length 6
## names(6): SRR5564260 SRR5564261 SRR5564269 SRR5564270 SRR5564271 SRR5564277
##
## $fasta
## [1] "/home/biocbuild/.cache/R/ExperimentHub/2728a93a243bf8_8310"
##
## $sites
## GRanges object with 15638648 ranges and 0 metadata columns:
## seqnames ranges strand
## <Rle> <IRanges> <Rle>
## [1] chr1 87158 -
## [2] chr1 87168 -
## [3] chr1 87171 -
## [4] chr1 87189 -
## [5] chr1 87218 -
## ... ... ... ...
## [15638644] chrY 56885715 +
## [15638645] chrY 56885716 +
## [15638646] chrY 56885728 +
## [15638647] chrY 56885841 +
## [15638648] chrY 56885850 +
## -------
## seqinfo: 44 sequences from hg38 genome; no seqlengths
A 10x Genomics BAM file and RNA editing sites from chromosome 16 of a human PBMC scRNA-seq library are provided. Also included is a SingleCellExperiment object containing gene expression values, cluster annotations, cell-type annotations, and a UMAP projection.
pbmc_10x()
## $bam
## class: BamFile
## path: /home/biocbuild/.cache/R/ExperimentHub/2728a95c6f34ee_8311
## index: /home/biocbuild/.cache/R/ExperimentHub/2728a93bd8f51f_8312
## isOpen: FALSE
## yieldSize: NA
## obeyQname: FALSE
## asMates: FALSE
## qnamePrefixEnd: NA
## qnameSuffixStart: NA
##
## $sites
## GRanges object with 15638648 ranges and 0 metadata columns:
## seqnames ranges strand
## <Rle> <IRanges> <Rle>
## [1] chr1 87158 -
## [2] chr1 87168 -
## [3] chr1 87171 -
## [4] chr1 87189 -
## [5] chr1 87218 -
## ... ... ... ...
## [15638644] chrY 56885715 +
## [15638645] chrY 56885716 +
## [15638646] chrY 56885728 +
## [15638647] chrY 56885841 +
## [15638648] chrY 56885850 +
## -------
## seqinfo: 44 sequences from hg38 genome; no seqlengths
##
## $sce
## class: SingleCellExperiment
## dim: 36601 500
## metadata(2): Samples mkrs
## assays(2): counts logcounts
## rownames(36601): MIR1302-2HG FAM138A ... AC007325.4 AC007325.2
## rowData names(3): ID Symbol Type
## colnames(500): TGTTTGTCAGTTAGGG-1 ATCTCTACAAGCTACT-1 ...
## GGGCGTTTCAGGACGA-1 CTATAGGAGATTGTGA-1
## colData names(8): Sample Barcode ... r celltype
## reducedDimNames(2): PCA UMAP
## mainExpName: NULL
## altExpNames(0):
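The cell-type annotations and UMAP projection in the `$sce` element can be reached with the standard SingleCellExperiment accessors. The following sketch builds a small toy object so that it runs without downloading data; the counts, the UMAP coordinates, and the cell-type labels are invented, and only the `celltype` column name and the `UMAP` reduced dimension name are taken from the output above.

```r
library(SingleCellExperiment)

set.seed(1)

# Toy counts matrix: 3 genes x 4 cells (invented values)
counts <- matrix(
    rpois(12, lambda = 5), nrow = 3,
    dimnames = list(paste0("gene", 1:3), paste0("cell", 1:4))
)

# Toy UMAP coordinates, one row per cell (invented values)
umap <- matrix(
    rnorm(8), nrow = 4,
    dimnames = list(colnames(counts), c("UMAP1", "UMAP2"))
)

sce_toy <- SingleCellExperiment(
    assays = list(counts = counts),
    colData = DataFrame(
        celltype = c("T", "T", "B", "NK"),  # made-up labels
        row.names = colnames(counts)
    ),
    reducedDims = list(UMAP = umap)
)

# Number of cells per annotated cell type
cells_per_type <- table(sce_toy$celltype)

# UMAP coordinates as a cells x 2 matrix
umap_coords <- reducedDim(sce_toy, "UMAP")

# Subset to the cells of one type
t_cells <- sce_toy[, sce_toy$celltype == "T"]
```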
Alternatively, individual files can be accessed directly from ExperimentHub.
library(ExperimentHub)
eh <- ExperimentHub()
raerdata_files <- query(eh, "raerdata")
data.frame(
    id = raerdata_files$ah_id,
    title = raerdata_files$title,
    description = raerdata_files$description
)
Session info
sessionInfo()
## R version 4.4.0 beta (2024-04-15 r86425)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 22.04.4 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.19-bioc/R/lib/libRblas.so
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_GB LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: America/New_York
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] ExperimentHub_2.12.0 AnnotationHub_3.12.0
## [3] BiocFileCache_2.12.0 dbplyr_2.5.0
## [5] SingleCellExperiment_1.26.0 SummarizedExperiment_1.34.0
## [7] Biobase_2.64.0 MatrixGenerics_1.16.0
## [9] matrixStats_1.3.0 Rsamtools_2.20.0
## [11] BSgenome.Hsapiens.UCSC.hg38_1.4.5 BSgenome_1.72.0
## [13] BiocIO_1.14.0 Biostrings_2.72.0
## [15] XVector_0.44.0 rtracklayer_1.64.0
## [17] GenomicRanges_1.56.0 GenomeInfoDb_1.40.0
## [19] IRanges_2.38.0 S4Vectors_0.42.0
## [21] BiocGenerics_0.50.0 raerdata_1.2.0
## [23] BiocStyle_2.32.0
##
## loaded via a namespace (and not attached):
## [1] tidyselect_1.2.1 dplyr_1.1.4 blob_1.2.4
## [4] filelock_1.0.3 bitops_1.0-7 fastmap_1.1.1
## [7] RCurl_1.98-1.14 GenomicAlignments_1.40.0 XML_3.99-0.16.1
## [10] digest_0.6.35 mime_0.12 lifecycle_1.0.4
## [13] KEGGREST_1.44.0 RSQLite_2.3.6 magrittr_2.0.3
## [16] compiler_4.4.0 rlang_1.1.3 sass_0.4.9
## [19] tools_4.4.0 utf8_1.2.4 yaml_2.3.8
## [22] knitr_1.46 S4Arrays_1.4.0 bit_4.0.5
## [25] curl_5.2.1 DelayedArray_0.30.0 abind_1.4-5
## [28] BiocParallel_1.38.0 withr_3.0.0 purrr_1.0.2
## [31] grid_4.4.0 fansi_1.0.6 cli_3.6.2
## [34] rmarkdown_2.26 crayon_1.5.2 generics_0.1.3
## [37] httr_1.4.7 rjson_0.2.21 DBI_1.2.2
## [40] cachem_1.0.8 zlibbioc_1.50.0 parallel_4.4.0
## [43] AnnotationDbi_1.66.0 BiocManager_1.30.22 restfulr_0.0.15
## [46] vctrs_0.6.5 Matrix_1.7-0 jsonlite_1.8.8
## [49] bookdown_0.39 bit64_4.0.5 jquerylib_0.1.4
## [52] glue_1.7.0 codetools_0.2-20 BiocVersion_3.19.1
## [55] UCSC.utils_1.0.0 tibble_3.2.1 pillar_1.9.0
## [58] rappdirs_0.3.3 htmltools_0.5.8.1 GenomeInfoDbData_1.2.12
## [61] R6_2.5.1 evaluate_0.23 lattice_0.22-6
## [64] png_0.1-8 memoise_2.0.1 bslib_0.7.0
## [67] SparseArray_1.4.0 xfun_0.43 pkgconfig_2.0.3
Chung, Hachung, Jorg J A Calis, Xianfang Wu, Tony Sun, Yingpu Yu, Stephanie L Sarbanes, Viet Loan Dao Thi, et al. 2018. “Human ADAR1 Prevents Endogenous RNA from Triggering Translational Shutdown.” Cell 172 (4): 811–824.e14. https://doi.org/10.1016/j.cell.2017.12.038.
Gabay, Orshay, Yoav Shoshan, Eli Kopel, Udi Ben-Zvi, Tomer D Mann, Noam Bressler, Roni Cohen-Fultheim, et al. 2022. “Landscape of Adenosine-to-Inosine RNA Recoding Across Human Tissues.” Nat. Commun. 13 (1): 1184. https://doi.org/10.1038/s41467-022-28841-4.
Picardi, Ernesto, Anna Maria D’Erchia, Claudio Lo Giudice, and Graziano Pesole. 2017. “REDIportal: A Comprehensive Database of A-to-I RNA Editing Events in Humans.” Nucleic Acids Res. 45 (D1): D750–D757. https://doi.org/10.1093/nar/gkw767.