TENxVisiumData 1.0.2
The TENxVisiumData package provides an R/Bioconductor resource for Visium spatial gene expression datasets by 10X Genomics. The package currently includes 13 datasets from 23 samples across two organisms (human and mouse) and 13 tissues.
A list of currently available datasets can be obtained using the ExperimentHub interface:
library(ExperimentHub)
eh <- ExperimentHub()
(q <- query(eh, "TENxVisium"))
## ExperimentHub with 13 records
## # snapshotDate(): 2021-05-18
## # $dataprovider: 10X Genomics
## # $species: Homo sapiens, Mus musculus
## # $rdataclass: SpatialExperiment
## # additional mcols(): taxonomyid, genome, description,
## # coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
## # rdatapath, sourceurl, sourcetype
## # retrieve records with, e.g., 'object[["EH6731"]]'
##
## title
## EH6731 | HumanBreastCancerIDC_v3.13
## EH6732 | HumanBreastCancerILC_v3.13
## EH6733 | HumanCerebellum_v3.13
## EH6734 | HumanColorectalCancer_v3.13
## EH6735 | HumanGlioblastoma_v3.13
## ... ...
## EH6739 | HumanSpinalCord_v3.13
## EH6740 | MouseBrainCoronal_v3.13
## EH6741 | MouseBrainSagittalPosterior_v3.13
## EH6742 | MouseBrainSagittalAnterior_v3.13
## EH6743 | MouseKidneyCoronal_v3.13
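The printed listing above abbreviates the middle records with `...`. As a minimal sketch, assuming the same hub query as above and an available network connection, the full set of accession IDs and titles can be collected into a plain data frame. The variable name `listing` is illustrative only, and the number of records returned may differ with the hub snapshot in use.

```r
library(ExperimentHub)

eh <- ExperimentHub()          # initialize hub instance
q <- query(eh, "TENxVisium")   # same query as in the text

# collect record metadata into a data frame so that no rows are elided
listing <- data.frame(
    ah_id = q$ah_id,
    title = q$title,
    species = q$species)
print(listing)

# number of records per organism
table(listing$species)
```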
To retrieve a dataset, we can use the dataset’s corresponding named function <id>(), where <id> should correspond to a valid dataset identifier (see ?TENxVisiumData). E.g.:
library(TENxVisiumData)
spe <- HumanHeart_v3.13()
Alternatively, data can be loaded directly from Bioconductor’s ExperimentHub as follows. First, we initialize a hub instance and store the complete list of records in a variable eh. Using query(), we then identify any records made available by the TENxVisiumData package, as well as their accession IDs (of the form EH1234). Finally, we can load the data into R via eh[[id]], where id corresponds to the identifier of the data entry we’d like to load. E.g.:
library(ExperimentHub)
eh <- ExperimentHub() # initialize hub instance
q <- query(eh, "TENxVisium") # retrieve 'TENxVisiumData' records
id <- q$ah_id[1] # specify dataset ID to load
spe <- eh[[id]] # load specified dataset
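Selecting a record by position, as above, depends on the order in which the hub returns its records. A hedged alternative is to look up the accession ID by title. The sketch below assumes the title `MouseKidneyCoronal_v3.13` from the listing shown earlier is still present in the hub snapshot being queried; the variable `title` is introduced here for illustration.

```r
library(ExperimentHub)

eh <- ExperimentHub()
q <- query(eh, "TENxVisium")

# title taken from the record listing shown earlier (assumed to still exist)
title <- "MouseKidneyCoronal_v3.13"

# look up the accession ID by title instead of by position
id <- q$ah_id[q$title == title]
stopifnot(length(id) == 1)   # fail early if the title is missing or ambiguous

spe <- eh[[id]]              # load the matching dataset
```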
Each dataset is provided as a SpatialExperiment (SPE), which extends the SingleCellExperiment (SCE) class with features specific to spatially resolved data:
spe
## class: SpatialExperiment
## dim: 36601 7785
## metadata(0):
## assays(1): counts
## rownames(36601): ENSG00000243485 ENSG00000237613 ... ENSG00000278817
## ENSG00000277196
## rowData names(1): symbol
## colnames(7785): AAACAAGTATCTCCCA-1 AAACACCAATAACTGC-1 ...
## TTGTTTGTATTACACG-1 TTGTTTGTGTAAATTC-1
## colData names(1): sample_id
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
## spatialData names(3) : in_tissue array_row array_col
## spatialCoords names(2) : pxl_col_in_fullres pxl_row_in_fullres
## imgData names(4): sample_id image_id data scaleFactor
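Because the SPE extends the SCE class, the usual SCE accessors apply to it. The following is a minimal sketch, assuming record `EH6731` from the listing above is retrievable and that `counts` and the `symbol` column are present as in the printed object; `libsize` is an illustrative variable name.

```r
library(SpatialExperiment)   # also attaches SingleCellExperiment accessors
library(ExperimentHub)

eh <- ExperimentHub()
spe <- eh[["EH6731"]]        # first record in the listing above

dim(spe)                     # features x spots
counts(spe)[1:3, 1:3]        # corner of the count matrix
head(rowData(spe)$symbol)    # gene symbols stored alongside the Ensembl IDs

# total counts per spot
libsize <- colSums(counts(spe))
summary(libsize)
```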
For details on the SPE class, we refer to the package’s vignette. Briefly, the SPE harbors the following data in addition to that stored in a SCE:
spatialCoords; a numeric matrix of spatial coordinates, stored inside the object’s int_colData:
head(spatialCoords(spe))
## pxl_col_in_fullres pxl_row_in_fullres
## AAACAAGTATCTCCCA-1 15937 17428
## AAACACCAATAACTGC-1 18054 6092
## AAACAGAGCGACTCCT-1 7383 16351
## AAACAGGGTCTATATT-1 15202 5278
## AAACAGTGTTCCTGGG-1 21386 9363
## AAACATTTCCCGGATT-1 18549 16740
spatialData; a DFrame of spatially-related sample metadata, stored as part of the object’s colData. This colData subset is in turn determined by the int_metadata field spatialDataNames:
head(spatialData(spe))
## DataFrame with 6 rows and 3 columns
## in_tissue array_row array_col
## <logical> <integer> <integer>
## AAACAAGTATCTCCCA-1 TRUE 50 102
## AAACACCAATAACTGC-1 TRUE 59 19
## AAACAGAGCGACTCCT-1 TRUE 14 94
## AAACAGGGTCTATATT-1 TRUE 47 13
## AAACAGTGTTCCTGGG-1 TRUE 73 43
## AAACATTTCCCGGATT-1 TRUE 61 97
imgData; a DFrame containing image-related data, stored inside the int_metadata:
imgData(spe)
## DataFrame with 2 rows and 4 columns
## sample_id image_id data scaleFactor
## <character> <character> <list> <numeric>
## 1 HumanBreastCancerIDC.. lowres #### 0.0247525
## 2 HumanBreastCancerIDC.. lowres #### 0.0247525
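As a small worked example of how the scaleFactor column might be used, the sketch below rescales two of the full-resolution coordinates printed earlier. It assumes, and the text above does not state this, that scaleFactor maps full-resolution pixel coordinates onto the pixel grid of the corresponding low-resolution image. The values are copied from the outputs shown above, and the names `xy`, `sf` and `xy_lowres` are illustrative.

```r
# full-resolution spot coordinates, copied from the spatialCoords() output above
xy <- rbind(
    "AAACAAGTATCTCCCA-1" = c(15937, 17428),
    "AAACACCAATAACTGC-1" = c(18054, 6092))
colnames(xy) <- c("pxl_col_in_fullres", "pxl_row_in_fullres")

# scale factor of the 'lowres' image, copied from the imgData() output above
sf <- 0.0247525

# assumption: multiplying by the scale factor gives low-resolution image coordinates
xy_lowres <- xy * sf
round(xy_lowres, 1)
```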
Datasets with multiple sections are consolidated into a single SPE with colData field sample_id indicating each spot’s sample of origin. E.g.:
spe <- MouseBrainSagittalAnterior_v3.13()
table(spe$sample_id)
##
## MouseBrainSagittalAnterior_v3.131 MouseBrainSagittalAnterior_v3.132
## 2695 2825
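Since sample_id is an ordinary colData field, a consolidated object can be split back into one SPE per section by column subsetting. This is a sketch only, assuming record `EH6742` (listed above as MouseBrainSagittalAnterior_v3.13) is retrievable; `ids` and `by_sample` are illustrative names.

```r
library(SpatialExperiment)
library(ExperimentHub)

eh <- ExperimentHub()
spe <- eh[["EH6742"]]   # MouseBrainSagittalAnterior_v3.13 in the listing above

# one SPE per section, selected by the spots' sample of origin
ids <- unique(spe$sample_id)
by_sample <- lapply(ids, function(i) spe[, spe$sample_id == i])
names(by_sample) <- ids

# number of spots retained in each per-sample object
sapply(by_sample, ncol)
```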
Datasets of targeted analyses are provided as a nested SPE, with whole transcriptome measurements as primary data, and those obtained from targeted panels as altExps. E.g.:
spe <- HumanOvarianCancer_v3.13()
altExpNames(spe)
## [1] "TargetedImmunology" "TargetedPanCancer"
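The targeted panels can then be extracted by name with `altExp()` from SingleCellExperiment. The sketch below assumes the loading function and the panel name `TargetedImmunology` exactly as they appear above; `tgt` is an illustrative variable name.

```r
library(TENxVisiumData)
library(SpatialExperiment)

spe <- HumanOvarianCancer_v3.13()   # loading function as used in the text

# extract one targeted panel by name
tgt <- altExp(spe, "TargetedImmunology")

dim(tgt)              # panel features x spots
head(rownames(tgt))   # features measured by the panel

# alternative experiments cover the same spots as the main experiment
stopifnot(ncol(tgt) == ncol(spe))
```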
sessionInfo()
## R version 4.1.1 (2021-08-10)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.2 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.13-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.13-bioc/R/lib/libRlapack.so
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats4 parallel stats graphics grDevices utils datasets
## [8] methods base
##
## other attached packages:
## [1] TENxVisiumData_1.0.2 SpatialExperiment_1.2.1
## [3] SingleCellExperiment_1.14.1 SummarizedExperiment_1.22.0
## [5] Biobase_2.52.0 GenomicRanges_1.44.0
## [7] GenomeInfoDb_1.28.1 IRanges_2.26.0
## [9] S4Vectors_0.30.0 MatrixGenerics_1.4.2
## [11] matrixStats_0.60.0 ExperimentHub_2.0.0
## [13] AnnotationHub_3.0.1 BiocFileCache_2.0.0
## [15] dbplyr_2.1.1 BiocGenerics_0.38.0
## [17] BiocStyle_2.20.2
##
## loaded via a namespace (and not attached):
## [1] bitops_1.0-7 bit64_4.0.5
## [3] filelock_1.0.2 httr_1.4.2
## [5] tools_4.1.1 bslib_0.2.5.1
## [7] utf8_1.2.2 R6_2.5.1
## [9] HDF5Array_1.20.0 DBI_1.1.1
## [11] rhdf5filters_1.4.0 withr_2.4.2
## [13] tidyselect_1.1.1 bit_4.0.4
## [15] curl_4.3.2 compiler_4.1.1
## [17] DelayedArray_0.18.0 bookdown_0.23
## [19] sass_0.4.0 rappdirs_0.3.3
## [21] stringr_1.4.0 digest_0.6.27
## [23] rmarkdown_2.10 R.utils_2.10.1
## [25] XVector_0.32.0 pkgconfig_2.0.3
## [27] htmltools_0.5.1.1 sparseMatrixStats_1.4.2
## [29] limma_3.48.3 fastmap_1.1.0
## [31] rlang_0.4.11 RSQLite_2.2.7
## [33] shiny_1.6.0 DelayedMatrixStats_1.14.2
## [35] jquerylib_0.1.4 generics_0.1.0
## [37] jsonlite_1.7.2 BiocParallel_1.26.2
## [39] dplyr_1.0.7 R.oo_1.24.0
## [41] RCurl_1.98-1.4 magrittr_2.0.1
## [43] scuttle_1.2.1 GenomeInfoDbData_1.2.6
## [45] Matrix_1.3-4 Rcpp_1.0.7
## [47] Rhdf5lib_1.14.2 fansi_0.5.0
## [49] lifecycle_1.0.0 R.methodsS3_1.8.1
## [51] edgeR_3.34.0 stringi_1.7.3
## [53] yaml_2.2.1 zlibbioc_1.38.0
## [55] rhdf5_2.36.0 grid_4.1.1
## [57] blob_1.2.2 promises_1.2.0.1
## [59] dqrng_0.3.0 crayon_1.4.1
## [61] lattice_0.20-44 Biostrings_2.60.2
## [63] beachmat_2.8.1 KEGGREST_1.32.0
## [65] magick_2.7.3 locfit_1.5-9.4
## [67] knitr_1.33 pillar_1.6.2
## [69] rjson_0.2.20 glue_1.4.2
## [71] BiocVersion_3.13.1 evaluate_0.14
## [73] BiocManager_1.30.16 png_0.1-7
## [75] vctrs_0.3.8 httpuv_1.6.2
## [77] purrr_0.3.4 assertthat_0.2.1
## [79] cachem_1.0.6 xfun_0.25
## [81] DropletUtils_1.12.2 mime_0.11
## [83] xtable_1.8-4 later_1.3.0
## [85] tibble_3.1.3 AnnotationDbi_1.54.1
## [87] memoise_2.0.0 ellipsis_0.3.2
## [89] interactiveDisplayBase_1.30.0