Including more detailed vignettes inside the package would make it exceed the tarball size limit, so they are hosted on an external website. This is a simplified vignette.
This package can be installed from Bioconductor:
if (!requireNamespace("BiocManager")) install.packages("BiocManager")
BiocManager::install("Voyager")
# Devel version
# install.packages("remotes")
remotes::install_github("pachterlab/Voyager")
In non-spatial scRNA-seq, the SingleCellExperiment (SCE) package implements a data structure, and other packages such as scater implement methods for quality control (QC), basic exploratory data analysis (EDA), and plotting, using SCE to organize the data and results. Voyager aims to be to SpatialFeatureExperiment (SFE) what scater is to SCE, implementing basic exploratory spatial data analysis (ESDA) and plotting. SFE inherits from SCE and SpatialExperiment (SPE), so all methods written for SCE and SPE can be used for SFE as well.
In this first version, ESDA is based on the classic geospatial package spdep, but future versions will incorporate methods from GWmodel, adespatial, and other packages.
These are the main functionalities of Voyager at present:
Plotting gene expression and colData along with annotation geometries, with colorblind-friendly default palettes. The actual geometries are plotted, not just centroids as in Seurat.
Future versions will add bivariate and multivariate spatial statistics, as well as user-friendly wrappers of some successful spatial transcriptomics data analysis packages (for spatially variable genes, cell type deconvolution, and spatial regions) from CRAN, Bioconductor, pip, and conda. The wrappers will provide a uniform syntax and avoid object conversion, as Seurat does for some non-spatial scRNA-seq methods.
Here we use a mouse skeletal muscle Visium dataset from the paper "Large-scale integration of single-cell transcriptomic data captures transitional progenitor states in mouse skeletal muscle regeneration". It is in the SFEData package as an SFE object, which contains Visium spot polygons, myofiber and nuclei segmentations, and myofiber and nuclei morphological metrics.
library(SFEData)
library(SpatialFeatureExperiment)
library(Voyager)
library(scater)
#> Loading required package: SingleCellExperiment
#> Loading required package: SummarizedExperiment
#> Loading required package: MatrixGenerics
#> Loading required package: matrixStats
#>
#> Attaching package: 'MatrixGenerics'
#> The following objects are masked from 'package:matrixStats':
#>
#> colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse,
#> colCounts, colCummaxs, colCummins, colCumprods, colCumsums,
#> colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs,
#> colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats,
#> colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds,
#> colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads,
#> colWeightedMeans, colWeightedMedians, colWeightedSds,
#> colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
#> rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods,
#> rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
#> rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins,
#> rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks,
#> rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars,
#> rowWeightedMads, rowWeightedMeans, rowWeightedMedians,
#> rowWeightedSds, rowWeightedVars
#> Loading required package: GenomicRanges
#> Loading required package: stats4
#> Loading required package: BiocGenerics
#>
#> Attaching package: 'BiocGenerics'
#> The following objects are masked from 'package:stats':
#>
#> IQR, mad, sd, var, xtabs
#> The following objects are masked from 'package:base':
#>
#> Filter, Find, Map, Position, Reduce, anyDuplicated, aperm, append,
#> as.data.frame, basename, cbind, colnames, dirname, do.call,
#> duplicated, eval, evalq, get, grep, grepl, intersect, is.unsorted,
#> lapply, mapply, match, mget, order, paste, pmax, pmax.int, pmin,
#> pmin.int, rank, rbind, rownames, sapply, setdiff, sort, table,
#> tapply, union, unique, unsplit, which.max, which.min
#> Loading required package: S4Vectors
#>
#> Attaching package: 'S4Vectors'
#> The following objects are masked from 'package:base':
#>
#> I, expand.grid, unname
#> Loading required package: IRanges
#> Loading required package: GenomeInfoDb
#> Loading required package: Biobase
#> Welcome to Bioconductor
#>
#> Vignettes contain introductory material; view with
#> 'browseVignettes()'. To cite Bioconductor, see
#> 'citation("Biobase")', and for packages 'citation("pkgname")'.
#>
#> Attaching package: 'Biobase'
#> The following object is masked from 'package:MatrixGenerics':
#>
#> rowMedians
#> The following objects are masked from 'package:matrixStats':
#>
#> anyMissing, rowMedians
#> Loading required package: scuttle
#> Loading required package: ggplot2
sfe <- McKellarMuscleData()
#> snapshotDate(): 2022-10-31
#> see ?SFEData and browseVignettes('SFEData') for documentation
#> loading from cache
# Only use spots in tissue here
sfe <- sfe[,colData(sfe)$in_tissue]
sfe <- logNormCounts(sfe)
sfe
#> class: SpatialFeatureExperiment
#> dim: 15123 932
#> metadata(0):
#> assays(2): counts logcounts
#> rownames(15123): ENSMUSG00000025902 ENSMUSG00000096126 ...
#> ENSMUSG00000064368 ENSMUSG00000064370
#> rowData names(6): Ensembl symbol ... vars cv2
#> colnames(932): AAACATTTCCCGGATT AAACCTAAGCAGCCGG ... TTGTGTTTCCCGAAAG
#> TTGTTGTGTGTCAAGA
#> colData names(13): barcode col ... in_tissue sizeFactor
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):
#> spatialCoords names(2) : imageX imageY
#> imgData names(1): sample_id
#>
#> Geometries:
#> colGeometries: spotPoly (POLYGON)
#> annotGeometries: tissueBoundary (POLYGON), myofiber_full (POLYGON), myofiber_simplified (POLYGON), nuclei (POLYGON), nuclei_centroid (POINT)
#>
#> Graphs:
#> Vis5A:
A spatial neighborhood graph is required for all spdep analyses.
colGraph(sfe, "visium") <- findVisiumGraph(sfe)
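To illustrate what such a graph encodes, here is a minimal base R sketch on a toy hexagonal lattice resembling the Visium layout. The coordinates, the 5 by 5 size, and the distance cutoff are assumptions made for this illustration; this is not how findVisiumGraph() is implemented.

```r
# Toy hexagonal lattice: odd rows are shifted by half a spot spacing
grid <- expand.grid(col = 0:4, row = 0:4)
x <- grid$col + 0.5 * (grid$row %% 2)
y <- grid$row * sqrt(3) / 2

# Pairwise distances between spot centers; nearest neighbors are 1 unit apart
d <- as.matrix(dist(cbind(x, y)))

# Adjacency matrix: spots are neighbors when their centers are ~1 unit apart
adj <- d > 0 & d < 1.01

# Number of neighbors per spot: 6 in the interior, fewer on the edges
n_neighbors <- rowSums(adj)
table(n_neighbors)
```

In this toy lattice, the adjacency matrix plays the role of the graph stored in colGraph(sfe): each row lists which other spots count as neighbors of a given spot.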
All of the numerous univariate methods can be used with runUnivariate(), which stores global results in rowData(sfe) and local results in localResults(sfe). Here we compute Moran’s I for two genes. While Ensembl IDs are used internally, the user can specify more human-readable gene symbols. A warning is given if a gene symbol matches multiple Ensembl IDs.
features_use <- c("Myh1", "Myh2")
sfe <- runUnivariate(sfe, type = "moran", features = features_use,
                     colGraphName = "visium")
# Look at the results
rowData(sfe)[rowData(sfe)$symbol %in% features_use,]
#> DataFrame with 2 rows and 8 columns
#> Ensembl symbol type means
#> <character> <character> <character> <numeric>
#> ENSMUSG00000033196 ENSMUSG00000033196 Myh2 Gene Expression 0.97476
#> ENSMUSG00000056328 ENSMUSG00000056328 Myh1 Gene Expression 4.82572
#> vars cv2 moran_Vis5A K_Vis5A
#> <numeric> <numeric> <numeric> <numeric>
#> ENSMUSG00000033196 24.0374 25.2984 0.625500 2.21641
#> ENSMUSG00000056328 302.2385 12.9785 0.635718 2.68736
Since Moran’s I is very commonly used, one can call runMoransI() rather than runUnivariate().
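As a reminder of what the statistic measures, the sketch below computes Moran’s I by hand in base R for four locations on a line with row-standardized weights. The toy values and the helper name morans_i are hypothetical, chosen for illustration; Voyager itself relies on spdep for the actual computation.

```r
# Hypothetical helper: Moran's I for values x and spatial weights matrix w
morans_i <- function(x, w) {
  z <- x - mean(x)            # center the values
  s0 <- sum(w)                # sum of all spatial weights
  (length(x) / s0) * sum(w * outer(z, z)) / sum(z^2)
}

# Four locations on a line; each is adjacent to its immediate neighbors
adj <- matrix(0, 4, 4)
adj[cbind(1:3, 2:4)] <- 1
adj <- adj + t(adj)
w <- adj / rowSums(adj)       # row-standardize so each row sums to 1

morans_i(c(1, 2, 3, 4), w)    # smooth gradient: positive (0.4)
morans_i(c(1, 0, 1, 0), w)    # alternating values: negative (-1)
```

Positive values indicate that neighboring locations tend to have similar values, and negative values indicate that neighbors tend to differ.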
Next, compute a local spatial statistic, Getis-Ord Gi*, which is commonly used to detect hotspots and coldspots. The include_self argument applies only to Getis-Ord Gi*: when it is set to TRUE, the spatial graph includes self-directed edges and Gi* is computed; otherwise Gi is computed.
sfe <- runUnivariate(sfe, type = "localG", features = features_use,
                     colGraphName = "visium", include_self = TRUE)
# Look at the results
DataFrame(localResults(sfe, name = "localG"))
#> DataFrame with 932 rows and 2 columns
#> ENSMUSG00000056328 ENSMUSG00000033196
#> <numeric> <numeric>
#> 1 0.953965 3.526243
#> 2 2.219643 4.718390
#> 3 -1.450029 -1.086702
#> 4 3.545740 5.322070
#> 5 1.194738 0.282678
#> ... ... ...
#> 928 1.635331 0.204077
#> 929 1.821201 5.413792
#> 930 -2.799465 -2.537571
#> 931 -2.708677 -1.905788
#> 932 0.982869 -1.231641
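To make the role of the self edge concrete, the following base R sketch computes Gi* from its textbook z-score formula on a toy line of seven locations with binary weights. The data and variable names are assumptions for illustration, and the numbers are not meant to reproduce the spdep-based output above.

```r
# Toy values on a line: a high-value cluster around locations 4 to 6
x <- c(2, 1, 2, 9, 10, 9, 1)
n <- length(x)

# Binary adjacency between immediate neighbors on the line
w <- matrix(0, n, n)
w[cbind(1:(n - 1), 2:n)] <- 1
w <- w + t(w)
diag(w) <- 1                  # self edges: this is what makes it Gi*

x_bar <- mean(x)
s <- sqrt(sum(x^2) / n - x_bar^2)
w_sum <- rowSums(w)           # sum of weights per location
w_sq_sum <- rowSums(w^2)      # sum of squared weights per location

# Gi* as a z-score: local weighted sum compared with its expectation
gi_star <- (as.vector(w %*% x) - x_bar * w_sum) /
  (s * sqrt((n * w_sq_sum - w_sum^2) / (n - 1)))
round(gi_star, 2)
```

Large positive values mark hotspots (here the center of the high-value cluster), and large negative values mark coldspots.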
Spatial statistics can also be computed for numeric columns of colData(sfe) with colDataUnivariate(), and for numeric attributes of the geometries with colGeometryUnivariate() and annotGeometryUnivariate(), all with very similar arguments.
Plot gene expression and colData(sfe) columns together with an annotation geometry. Here nCounts is the total UMI count per spot, which is stored in colData.
plotSpatialFeature(sfe, c("nCounts", "Myh1"), colGeometryName = "spotPoly",
                   annotGeometryName = "myofiber_simplified",
                   aes_use = "color", size = 0.4, fill = NA,
                   annot_aes = list(fill = "area"))
Plot the local Getis-Ord Gi* results in space.
plotLocalResult(sfe, "localG", features = features_use,
                colGeometryName = "spotPoly", divergent = TRUE,
                diverge_center = 0)
sessionInfo()
#> R version 4.2.2 (2022-10-31)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 20.04.5 LTS
#>
#> Matrix products: default
#> BLAS: /home/biocbuild/bbs-3.16-bioc/R/lib/libRblas.so
#> LAPACK: /home/biocbuild/bbs-3.16-bioc/R/lib/libRlapack.so
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_GB LC_COLLATE=C
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> attached base packages:
#> [1] stats4 stats graphics grDevices utils datasets methods
#> [8] base
#>
#> other attached packages:
#> [1] scater_1.26.1 ggplot2_3.4.1
#> [3] scuttle_1.8.4 SingleCellExperiment_1.20.0
#> [5] SummarizedExperiment_1.28.0 Biobase_2.58.0
#> [7] GenomicRanges_1.50.2 GenomeInfoDb_1.34.9
#> [9] IRanges_2.32.0 S4Vectors_0.36.1
#> [11] BiocGenerics_0.44.0 MatrixGenerics_1.10.0
#> [13] matrixStats_0.63.0 Voyager_1.0.10
#> [15] SpatialFeatureExperiment_1.0.3 SFEData_1.0.2
#> [17] BiocStyle_2.26.0
#>
#> loaded via a namespace (and not attached):
#> [1] AnnotationHub_3.6.0 BiocFileCache_2.6.1
#> [3] igraph_1.4.0 sp_1.6-0
#> [5] BiocParallel_1.32.5 digest_0.6.31
#> [7] htmltools_0.5.4 viridis_0.6.2
#> [9] magick_2.7.3 fansi_1.0.4
#> [11] magrittr_2.0.3 memoise_2.0.1
#> [13] ScaledMatrix_1.6.0 SpatialExperiment_1.8.0
#> [15] cluster_2.1.4 limma_3.54.1
#> [17] Biostrings_2.66.0 R.utils_2.12.2
#> [19] colorspace_2.1-0 ggrepel_0.9.3
#> [21] blob_1.2.3 rappdirs_0.3.3
#> [23] xfun_0.37 dplyr_1.1.0
#> [25] crayon_1.5.2 RCurl_1.98-1.10
#> [27] jsonlite_1.8.4 glue_1.6.2
#> [29] gtable_0.3.1 zlibbioc_1.44.0
#> [31] XVector_0.38.0 DelayedArray_0.24.0
#> [33] scico_1.3.1 BiocSingular_1.14.0
#> [35] DropletUtils_1.18.1 Rhdf5lib_1.20.0
#> [37] HDF5Array_1.26.0 scales_1.2.1
#> [39] DBI_1.1.3 edgeR_3.40.2
#> [41] Rcpp_1.0.10 viridisLite_0.4.1
#> [43] xtable_1.8-4 spData_2.2.1
#> [45] units_0.8-1 dqrng_0.3.0
#> [47] rsvd_1.0.5 bit_4.0.5
#> [49] spdep_1.2-7 proxy_0.4-27
#> [51] httr_1.4.4 RColorBrewer_1.1-3
#> [53] wk_0.7.1 ellipsis_0.3.2
#> [55] farver_2.1.1 pkgconfig_2.0.3
#> [57] R.methodsS3_1.8.2 sass_0.4.5
#> [59] dbplyr_2.3.0 deldir_1.0-6
#> [61] locfit_1.5-9.7 utf8_1.2.3
#> [63] labeling_0.4.2 tidyselect_1.2.0
#> [65] rlang_1.0.6 later_1.3.0
#> [67] AnnotationDbi_1.60.0 munsell_0.5.0
#> [69] BiocVersion_3.16.0 tools_4.2.2
#> [71] cachem_1.0.6 cli_3.6.0
#> [73] dbscan_1.1-11 generics_0.1.3
#> [75] RSQLite_2.3.0 ExperimentHub_2.6.0
#> [77] evaluate_0.20 fastmap_1.1.0
#> [79] yaml_2.3.7 knitr_1.42
#> [81] bit64_4.0.5 purrr_1.0.1
#> [83] s2_1.1.2 KEGGREST_1.38.0
#> [85] sparseMatrixStats_1.10.0 mime_0.12
#> [87] R.oo_1.25.0 compiler_4.2.2
#> [89] beeswarm_0.4.0 filelock_1.0.2
#> [91] curl_5.0.0 png_0.1-8
#> [93] interactiveDisplayBase_1.36.0 e1071_1.7-13
#> [95] tibble_3.1.8 bslib_0.4.2
#> [97] highr_0.10 lattice_0.20-45
#> [99] bluster_1.8.0 Matrix_1.5-3
#> [101] classInt_0.4-8 vctrs_0.5.2
#> [103] pillar_1.8.1 lifecycle_1.0.3
#> [105] rhdf5filters_1.10.0 BiocManager_1.30.19
#> [107] jquerylib_0.1.4 BiocNeighbors_1.16.0
#> [109] irlba_2.3.5.1 bitops_1.0-7
#> [111] httpuv_1.6.9 patchwork_1.1.2
#> [113] R6_2.5.1 bookdown_0.32
#> [115] promises_1.2.0.1 gridExtra_2.3
#> [117] KernSmooth_2.23-20 vipor_0.4.5
#> [119] codetools_0.2-19 boot_1.3-28.1
#> [121] assertthat_0.2.1 rhdf5_2.42.0
#> [123] rjson_0.2.21 withr_2.5.0
#> [125] GenomeInfoDbData_1.2.9 parallel_4.2.2
#> [127] grid_4.2.2 beachmat_2.14.0
#> [129] class_7.3-21 rmarkdown_2.20
#> [131] DelayedMatrixStats_1.20.0 ggnewscale_0.4.8
#> [133] sf_1.0-9 shiny_1.7.4
#> [135] ggbeeswarm_0.7.1