1 Introduction

1.1 Overview

The primary utility of the spatialHeatmap package is the generation of spatial heatmaps (SHM) for visualizing cell-, tissue- and organ-specific abundance patterns of biological molecules (e.g. RNAs) in anatomical images (Zhang et al. 2022). This is useful for identifying molecules with spatially enriched (SE) abundance patterns as well as clusters and/or network modules composed of molecules sharing similar abundance patterns such as similar gene expression patterns. These functionalities are introduced in the main vignette of the spatialHeatmap package. The following describes extended functionalities for integrating tissue with single cell data by co-visualizing them in composite plots that combine spatial heatmaps with embedding plots of high-dimensional data. The resulting spatial context information is important for gaining insights into the tissue-level organization of single cell data.

The required quantitative assay data, such as gene expression values, can be provided in a variety of widely used tabular data structures (e.g. data.frame, SummarizedExperiment or SingleCellExperiment). The corresponding anatomic images need to be supplied as annotated SVG (aSVG) images and can be stored in a specific S4 class SVG. The creation of aSVGs is described in the main vignette of this package. For the embedding plots of single cell data, several dimensionality reduction algorithms (e.g. PCA, UMAP or tSNE) are supported. To associate single cells with their source tissues, the user can choose among three major methods including annotation-based, manual and automated methods (Figure 1). Similar to other functionalities in spatialHeatmap, these functionalities are available within R as well as the corresponding Shiny app (Chang et al. 2021).

1.2 Methods for Associating Cells and Bulk Tissues

To co-visualize single cell data with tissue features (Figure 1), the individual cells of the single cell data are mapped via their group labels to the corresponding tissue features in an aSVG image. If the feature labels in an aSVG are different than the corresponding group labels used for the single cell data, e.g. due to variable terminologies, a translation map can be used to avoid manual relabelling. Throughout this vignette the usage of the term feature is a generalization referring in most cases to tissues or organs. For the implementation of the co-visualization tool, spatialHeatmap takes advantage of efficient and reusable S4 classes for both assay data and aSVGs respectively. The former includes the Bioconductor core data structures such as the widely used SingleCellExperiment (SCE) container illustrated in Figure 1.1 (Amezquita et al. 2020). The slots assays, colData, rowData and reducedDims in an SCE contain expression data, cell metadata, molecule metadata and reduced dimensionality embedding results, respectively. The cell group labels are stored in the colData slot as shown in Figure 1.1. The S4 class SVG (Figure 1.3) is developed specifically in spatialHeatmap for storing aSVG instances. The two most important slots coordinate and attribute stores the aSVG feature coordinates and respective attributes such as fill color, line withs, etc. respectively, while other slots dimension, svg, and raster stores image dimension, aSVG file paths, and raster image paths respectively. For handling cell-to-tissue grouping information, three general methods are available including (a) annotation-based, (b) manual and (c) automated. The annotation-based and manual methods are similar by using known cell group labels. The main difference is how the cell labels are provided. In the annotation-based method, existing group labels are available and can be uploaded and/or stored in the SCE object, as is the case in some of the SCE instances provided by the scRNAseq package (Risso and Cole 2022). The manual method allows users to create the cell to tissue associations one-by-one or import them from a tabular file. In contrast to this, the automated method aims to assign single cells to the corresponding source tissues computationally by a co-clustering algorithm (Figure 8). This co-clustering is experimental and requires bulk expression data that are obtained from the tissues represented in the single cell data. The grouping information is visualized by using for each group the same color in both the single cell embedding plot and the tissue spatial heatmap plot (Figure 1.5). The colors can represent any type of custom or numeric information. In a typical use case, either fixed tissue-specific colors or a heat color gradient is used that is proportional to the numeric expression information obtained from the single cell or bulk expression data of a chosen gene. When the expression values among groups are very similar, toggling between the two coloring option is important to track the tissue origin in the single cell data. To color by single cell data, one often wants to first summarize the expression of a given gene across the cells within each group via a meaningful summary statistics, such as mean or median. Cells and tissues with the same group label will be colored the same. When coloring by tissues the color used for each tissue feature will be applied to the corresponding cell groups represented in the embedding plot.

Overview of single cell and bulk tissue co-visualization. (1) Single cell data are organized in a `SingleCellExperiment` object where cell grouping information is stored in the `colData` slot. (2) Bulk tissue expression data are stored in a `SummarizedExperiment` object, where tissues are labelled by aSVG feature identifiers (3) in the `colData` slot. (3) The aSVG instance is stored in an `SVG` class. The `coordinate` slot defines spatial feature shapes in spatial heatmap (5) while the `attribute` slot defines styling of these features such as color, line with, etc. (4) Bi-directional mapping between cell or tissue groups are supported. Cell groups and tissues can be colored by fixed colors or by expression values of selected genes. (5) Cells in (1) and bulk tissues in (2) are co-visualized in and embedding plot and spatial heatmap plot respectively, where mapped cell groups and respective tissues are indicated by same colors.

Figure 1: Overview of single cell and bulk tissue co-visualization
(1) Single cell data are organized in a SingleCellExperiment object where cell grouping information is stored in the colData slot. (2) Bulk tissue expression data are stored in a SummarizedExperiment object, where tissues are labelled by aSVG feature identifiers (3) in the colData slot. (3) The aSVG instance is stored in an SVG class. The coordinate slot defines spatial feature shapes in spatial heatmap (5) while the attribute slot defines styling of these features such as color, line with, etc. (4) Bi-directional mapping between cell or tissue groups are supported. Cell groups and tissues can be colored by fixed colors or by expression values of selected genes. (5) Cells in (1) and bulk tissues in (2) are co-visualized in and embedding plot and spatial heatmap plot respectively, where mapped cell groups and respective tissues are indicated by same colors.

2 Getting Started

2.1 Installation

The spatialHeatmap package can be installed with the BiocManager::install command.

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("spatialHeatmap")

2.2 Packages and Documentation

Next, the packages required for running the sample code in this vignette need to be loaded.

library(spatialHeatmap); library(SummarizedExperiment); library(scran); library(scater); library(igraph); library(SingleCellExperiment); library(BiocParallel)
library(kableExtra)

The following lists the vignette(s) of this package in an HTML browser. Clicking the name of the corresponding vignette will open it.

browseVignettes('spatialHeatmap')

To reduce runtime, intermediate results can be cached under ~/.cache/shm.

cache.pa <- '~/.cache/shm' # Set path of the cache directory

3 Quick Start

To obtain for examples with randomized data or parameters always the same results, a fixed seed is set.

set.seed(10)

This quick start example is demonstrated on cell-to-tissue mapping by using a single cell data set from oligodendrocytes of mouse brain (Marques et al. 2016). This data set is obtained from the scRNAseq (Risso and Cole 2022) package with minor modificatons, which is included in spatialHeatmap.

Prior to co-visualization the single cell data is pre-processed by the process_cell_meta function that applies common QC, normalization and dimensionality reduction routines. The details of these pre-processing methods are described in the corresponding help file. Additional background information on these topics can be found in the OSCA tutorial.

sce.pa <- system.file("extdata/shinyApp/example", "cell_mouse_brain.rds", package="spatialHeatmap")
sce <- readRDS(sce.pa)
sce.dimred.quick <- process_cell_meta(sce, qc.metric=list(subsets=list(Mt=rowData(sce)$featureType=='mito'), threshold=1))
colData(sce.dimred.quick)[1:3, 1:2]
## DataFrame with 3 rows and 2 columns
##                           label         age
##                     <character> <character>
## C1.1772078.029.F11 hypothalamus         p22
## C1.1772089.202.E04       SN.VTA         p22
## C1.1772099.091.D10  dorsal.horn         p20

To visualize single cell data in spatial heatmaps, the single cell expression values are summarized using common summary statistics such as mean or median by their source tissue grouping, which is in the label column of colData slot.

sce.aggr.quick <- aggr_rep(sce.dimred.quick, assay.na='logcounts', sam.factor='label', aggr='mean')

The summarized values are then used to color the tissue features of the corresponding aSVG file of mouse brain that is included in the spatialHeatmap package. The function read_svg is used for parsing the aSVG and storing it in an SVG object svg.mus.brain.

svg.mus.brain.pa <- system.file("extdata/shinyApp/example", "mus_musculus.brain.svg", package="spatialHeatmap")
svg.mus.brain <- read_svg(svg.mus.brain.pa)

A subset of features and related attributes are returned from svg.mus.brain, where fill and stroke refer to color and line width respectively.

tail(attribute(svg.mus.brain)[[1]])[, 1:4]
## # A tibble: 6 × 4
##   feature                      id             fill  stroke
##   <chr>                        <chr>          <chr>  <dbl>
## 1 brainstem                    UBERON_0002298 none    0.05
## 2 midbrain                     UBERON_0001891 none    0.05
## 3 dorsal.plus.ventral.thalamus UBERON_0001897 none    0.05
## 4 hypothalamus                 UBERON_0001898 none    0.05
## 5 nose                         UBERON_0000004 none    0.05
## 6 corpora.quadrigemina         UBERON_0002259 none    0.05

To map cell group labels to aSVG features, a list with named components is used, where cell labels are in name slots. Note, each cell label can be matched to multiple aSVG features but not vice versa.

lis.match.quick <- list(hypothalamus=c('hypothalamus'), cortex.S1=c('cerebral.cortex', 'nose'))

The co-visualization is created on gene Eif5b using the function covis. The embedding plot includes all cells before aggregation. The hypothalamus and cortex.S1 cells are colored according to their respecitive aggregated expression values in data=sce.aggr.quick. In the nearby spatial heatmap plot, aSVG features are filled by the same color as the matching cells defined in lis.match.quick. The cell.group argument indicates cell group labels in the colData slot of sce.aggr.quick, tar.cell suggests the target cell groups to visualize, and dimred specifies the embedding plot.

shm.lis.quick <- covis(svg=svg.mus.brain, data=sce.aggr.quick, ID=c('Eif5b'), sce.dimred=sce.dimred.quick, dimred='PCA', cell.group='label', tar.cell=names(lis.match.quick), lis.rematch=lis.match.quick, assay.na='logcounts', bar.width=0.11, dim.lgd.nrow=1, height=0.7, legend.r=1.5, legend.key.size=0.02, legend.text.size=12, legend.nrow=3) 
Co-visualization of cell-to-tissue mapping. The co-visualization is created on gene `Eif5b`. Single cells in the embedding plot and their matching aSVG features in the spatial heatmap are filled by the same color according to aggregated expression values of `Eif5b` in single cell data.

Figure 2: Co-visualization of cell-to-tissue mapping
The co-visualization is created on gene Eif5b. Single cells in the embedding plot and their matching aSVG features in the spatial heatmap are filled by the same color according to aggregated expression values of Eif5b in single cell data.

4 Mappings between cells and tissues

4.1 Annotation-based

The annotation-based mapping uses existing cell-to-tissue labels. Those can be imported (e.g. from a tabular file) and then stored in the colData slot of an SCE object. Alternatively, an SCE object might already contain this information. This is the case in some of the SCE objects provided by the scRNAseq package (Risso and Cole 2022).

The Quick Start is domenstrated using the annotation-based method, since the single cell data are downloaded from scRNAseq package. In this example, the tissue-to-cell mapping will be showcased. The single cell data and aSVG file are the same as the Quick Start, and the bulk data are modified from a research on mouse cerebellar development(Vacher et al. 2021).

The bulk count data are imported in a SummarizedExperiment and partial are shown. Note, replicates are indicated by the same sample names (e.g. cerebral.cortex).

blk.mus.pa <- system.file("extdata/shinyApp/example", "bulk_mouse_cocluster.rds", package="spatialHeatmap") 
blk.mus <- readRDS(blk.mus.pa)
assay(blk.mus)[1:3, 1:5]
##          cerebral.cortex hippocampus hypothalamus cerebellum cerebral.cortex
## AI593442             177         256           50         24             285
## Actr3b               513        1465          228        244             666
## Adcy1                701        1243           57       1910             836
colData(blk.mus)[1:3, , drop=FALSE]
## DataFrame with 3 rows and 1 column
##                          tissue
##                     <character>
## cerebral.cortex cerebral.cortex
## hippocampus         hippocampus
## hypothalamus       hypothalamus

The bulk count data are normalized using the ESF option, which refers to estimateSizeFactors from DESeq2 (Love, Huber, and Anders 2014). The expression values for each gene are summarized by mean across replicates (here aggr='mean').

# Normalization.
blk.mus.nor <- norm_data(data=blk.mus, norm.fun='ESF', log2.trans=TRUE)
# Aggregation.
blk.mus.aggr <- aggr_rep(blk.mus.nor, sam.factor='tissue', aggr='mean')
assay(blk.mus.aggr)[1:2, ]

The single cell data from the Quick Start are used in this example. Its metadata in colData are partially shown.

colData(sce.dimred.quick)[1:3, 1:2]
## DataFrame with 3 rows and 2 columns
##                           label         age
##                     <character> <character>
## C1.1772078.029.F11 hypothalamus         p22
## C1.1772089.202.E04       SN.VTA         p22
## C1.1772099.091.D10  dorsal.horn         p20

The single cell data are plotted at the UMAP dimensions using plot_dim, where cells are represented by dots and are colored by the grouping information stored in the colData slot of the SCE object, here color.by="label".

plot_dim(sce.dimred.quick, color.by="label", dim='UMAP')
Embedding plot of single cell data. The cells (dots) are colored by the grouping information stored in the `colData` slot of the corresponding `SCE` object

Figure 3: Embedding plot of single cell data
The cells (dots) are colored by the grouping information stored in the colData slot of the corresponding SCE object

The aSVG instance of mouse brain from the Quick Start are used. Partial of the aSVG features are shown.

tail(attribute(svg.mus.brain)[[1]])[1:3, 1:4]
## # A tibble: 3 × 4
##   feature                      id             fill  stroke
##   <chr>                        <chr>          <chr>  <dbl>
## 1 brainstem                    UBERON_0002298 none    0.05
## 2 midbrain                     UBERON_0001891 none    0.05
## 3 dorsal.plus.ventral.thalamus UBERON_0001897 none    0.05

Same with conventions in the main vignette, at least one bulk tissue identifier should match with aSVG features so as to successfully plot spatial heatmaps.

colnames(blk.mus) %in% attribute(svg.mus.brain)[[1]]$feature
##  [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE

In practice the labels used for features in the aSVG image and the cell group labels of the single cell data may not be the same. To resolve this without manual relabeling, a translation list is used to make them match. In tissue-to-cell mapping, the feature and cell labels are expected to be the names and elements of the list components, respectively.

lis.match.blk <- list(cerebral.cortex=c('cortex.S1'), hypothalamus=c('corpus.callosum', 'hypothalamus'))

The following plots the corresponding co-visualization for sample gene Actr3b. The legend under the embedding plot shows the cell labels in the matching list (lis.match.blk). The source tissue information is indicated by using the same colors in the embedding and tissue plots on the left and right, respectively. In contrast to the Quick Start, the tar.bulk indicates target bulk tissues to show.

covis(svg=svg.mus.brain, data=blk.mus.aggr, ID=c('Actr3b'), sce.dimred=sce.dimred.quick, dimred='UMAP', cell.group='label', tar.bulk=names(lis.match.blk), lis.rematch=lis.match.blk, bar.width=0.09, dim.lgd.nrow=2, dim.lgd.text.size=12, height=0.7, legend.r=1.5, legend.key.size=0.02, legend.text.size=12, legend.nrow=2)
Co-visualization plot through tissue-to-cell mapping. In this plot, `Actr3b` is used as an example. Each cell population is colored by its summarized expression value in the embedding plot on the left, and the corresponding tissue shares the same color in the spatial heatmap on the right.

Figure 4: Co-visualization plot through tissue-to-cell mapping
In this plot, Actr3b is used as an example. Each cell population is colored by its summarized expression value in the embedding plot on the left, and the corresponding tissue shares the same color in the spatial heatmap on the right.

In scenarios where expression values are similar across tissues, the mapping between cells and tissues can be indicated by fixed contrasting colors by setting profile=FALSE.

covis(svg=svg.mus.brain, data=blk.mus.aggr, ID=c('Actr3b'), profile=FALSE, sce.dimred=sce.dimred.quick, dimred='UMAP', cell.group='label', lis.rematch=lis.match.blk, tar.bulk=names(lis.match.blk), bar.width=0.09, dim.lgd.nrow=2, dim.lgd.text.size=12, height=0.7, legend.r=1.5, legend.key.size=0.02, legend.text.size=12, legend.nrow=2)
Co-visualization plot through tissue-to-cell mapping without expression values. In this plot, mapping beween cell groups and tissues are indicated by fixed colors instead of expression values.

Figure 5: Co-visualization plot through tissue-to-cell mapping without expression values
In this plot, mapping beween cell groups and tissues are indicated by fixed colors instead of expression values.

4.2 Manual Method

To provide additional flexibility for defining cell groupings, several manual options are provided. Here users can assign cell groups manually or by clustering methods for single cell embedding data that are often used in the analysis of single cell data. The resulting cell grouping or cluster information needs to be stored in a tabular file, that will be imported into an SCE object (here manual_group function). The following demonstration uses the same single cell and aSVG instance as the annotation example above. The only difference is an additional clustering step. For demonstration purposes a small example of a cluster file is included in the spatialHeatmap package. In this case the group labels were created by the cluster_cell function. The details of this function are available in its help file. The cluster file contains at least two columns: a column (here cell) with single cell identifiers used under colData and a column (here cluster) with the cell group labels. For practical reasons of building this vignette a pure manual example could not be used here. However, the chosen clustering example can be easily adapted to manual or hybrid grouping approaches since the underlying tabular data structure is the same for both that can be generated in most text or spreadsheet programs.

manual.clus.mus.sc.pa <- system.file("extdata/shinyApp/example", "manual_cluster_mouse_brain.txt", package="spatialHeatmap")
manual.clus.mus.sc <- read.table(manual.clus.mus.sc.pa, header=TRUE, sep='\t')
manual.clus.mus.sc[1:3, ]
##                 cell cluster
## 1 C1.1772078.029.F11   clus7
## 2 C1.1772089.202.E04   clus7
## 3 C1.1772099.091.D10   clus1

The manual_group function can be used to append the imported group labels to the colData slot of an SCE object without interfering with other functions and methods operating on SCE objects.

sce.clus <- manual_group(sce=sce.dimred.quick, df.group=manual.clus.mus.sc, cell='cell', cell.group='cluster')
colData(sce.clus)[1:3, c('cluster', 'label', 'expVar')]
## DataFrame with 3 rows and 3 columns
##                        cluster        label      expVar
##                    <character>  <character> <character>
## C1.1772078.029.F11       clus7 hypothalamus     control
## C1.1772089.202.E04       clus7       SN.VTA     control
## C1.1772099.091.D10       clus1  dorsal.horn     control

An embedding plot of single cell data is created. The cells represented as dots are colored by the grouping information stored in the cluster column of the colData slot of SCE.

plot_dim(sce.clus, color.by="cluster", dim='UMAP')
Embedding plot of single cells. The cells (dots) are colored by the grouping information stored in the `colData` slot of the corresponding `SCE` object .

Figure 6: Embedding plot of single cells
The cells (dots) are colored by the grouping information stored in the colData slot of the corresponding SCE object .

The same mouse brain aSVG as above is used here.

tail(attribute(svg.mus.brain)[[1]])[1:3, 1:4]
## # A tibble: 3 × 4
##   feature                      id             fill  stroke
##   <chr>                        <chr>          <chr>  <dbl>
## 1 brainstem                    UBERON_0002298 none    0.05
## 2 midbrain                     UBERON_0001891 none    0.05
## 3 dorsal.plus.ventral.thalamus UBERON_0001897 none    0.05

Similarly as above, a mapping list is used to match the cell clusters with aSVG features.

lis.match.clus <- list(clus1=c('cerebral.cortex'), clus3=c('brainstem', 'medulla.oblongata'))

This example is demonstrated through cell-to-tissue mapping, so gene expression values need to be summarized for the cells within each group label. Any grouping column in colData can be used as labels for summarizing. In this manual method, the cluster labels are chosen.

If additional experimental variables are provided, the summarizing will consider them as well (here expVar). The following example uses the clsuter and expVar columns as group labels and experimental variables, respectively.

sce.clus.aggr <- aggr_rep(sce.clus, assay.na='logcounts', sam.factor='cluster', con.factor='expVar', aggr='mean')
colData(sce.clus.aggr)[1:3, c('cluster', 'label', 'expVar')]
## DataFrame with 3 rows and 3 columns
##                    cluster           label      expVar
##                <character>     <character> <character>
## clus7__control       clus7    hypothalamus     control
## clus1__control       clus1     dorsal.horn     control
## clus5__control       clus5 corpus.callosum     control

The co-visualization is plotted for gene Tcea1. In this example the coloring is based on the gene expression summary for each cell group obtained by clustering. Completely manual groupings can be provided the same way.

covis(svg=svg.mus.brain, data=sce.clus.aggr, ID=c('Tcea1'), sce.dimred=sce.clus, dimred='UMAP', cell.group='cluster', assay.na='logcounts', tar.cell=names(lis.match.clus), lis.rematch=lis.match.clus, bar.width=0.09, dim.lgd.nrow=1, height=0.7, legend.r=1.5, legend.key.size=0.02, legend.text.size=12, legend.nrow=3)
Co-visualization of single cell data and tissue features with cluster groupings. Gene `Tcea1` is used as an example and the cell groupings were obtained by clustering.

Figure 7: Co-visualization of single cell data and tissue features with cluster groupings
Gene Tcea1 is used as an example and the cell groupings were obtained by clustering.

4.3 Automated Method

If both single cell and bulk gene expression data are available for the same or overlapping tissues then co-clustering can be used to assign cells to tissues automatically (Figure 8). Subsequently, the predicted cell-to-tissue assignments can be used for co-visualizing single cell embedding data with tissue level bulk data. This approach is useful for predicting the source tissues of unassigned cells without prior knowledge as is required for the annotation and manual approaches introduced above. While attractive there are various challenges to overcome to reliably co-cluster single cell data with the corresponding tissue-level bulk data. This is due to the different properties of single cell and bulk gene expression data, such as lower sensitivity and higher sparsity in single cell compared to bulk data. This section introduces a co-clustering method that is largely based on parameter optimization including three major steps. First, both data are preprocessed to retain the most reliable expression values (Figure 8.1a-b). Second, the genes in the bulk data are reduced to those robustly expressed in the single cell data (Figure 8.1c). Third, bulk and cell data are co-clustered by using optimal default settings (Table 1) that are obtained through optimization on real data with known cell-to-tissue assignments. The following introduces the three steps of this method in more detail using the example of RNA-Seq data.

  1. The raw count matrices of bulk and single cells are column-wise combined for joint normalization (Figure 8.1a). After separated from bulk data, the single cell data are reduced to genes with robust expression across X% cells and to cells with robust expression across Y% genes (Figure 8.1b). In the bulk data, genes are filtered according to expression values \(\ge\) A at a proportion of \(\ge\) p across bulk samples and a coefficient of variance (CV) between CV1 and CV2 (Figure 8.1b).

  2. The bulk data are subsetted to the same genes as the single cell data (Figure 8.1c). This and the previous filtering steps in single cell data reduce the sparsity in the single cell data and the bulk data are made more compareable to the single cell data by subsetting it to the same genes.

  3. Bulk and single cell data are column-wise combined for joint embedding using PCA (UMAP, or other). Co-clustering is performed on the embedding data and three types of clusters are produced. First, only one bulk tissue is co-clustered with cells ((Figure 8.2a). This bulk is assigned to all cells in the same cocluster. Second, multiple bulk tissues are co-clustered with multiple cells (Figure 8.2b). The bulk tissues are assigned according to the nearest neighbor bulk of each cell, which is measured by Spearman’s correlation coefficients. Third, no bulk tissue is co-clustered with cells (Figure 8.2c). All these cells are un-labeled, which are candidates for discovering novel cell types.

After co-clustering, cells are labeled by bulk tissues or un-labeled (Figure 8.3) and these labels are used for co-visualization (Figure 8.4), where cells in embedding plot and matching tissues in spatial heatmap are filled by same colors.

Overview of co-clustering. (1) Raw count data are pre-processed. (2) Co-clustering is performed on embedding dimensions of combined bulk and single cell data. (3) Cells are labeled by bulk assignments. (4) Cells and bulk tissues are co-visualized according to (3).

Figure 8: Overview of co-clustering
(1) Raw count data are pre-processed. (2) Co-clustering is performed on embedding dimensions of combined bulk and single cell data. (3) Cells are labeled by bulk assignments. (4) Cells and bulk tissues are co-visualized according to (3).

After revising the method outline for the co-clustering please revise the figure 7 above accordingly. The legend can just refer to the main text to avoid duplication. PLEASE keep it simple and clear. If a figure raises more questions than it answers then it defeats the purpose of guiding the reader and providing clarity.

To obtain reasonably robust default settings for co-clustering, four parameters shown in Table 1 are optimized, where bold text indicates optimal settings that are treated as robust default settings. The reason to choose these parameters is they are most relevant to the co-clustering step. The details of this optimization are given here. The following demonstration applies the default settings (bold in Table 1) using single cell and bulk data from mouse brain (Vacher et al. 2021; Ortiz et al. 2020). Both data sets have been simplified for demonstraton purposes.

Table 1: Parameter settings to optimize for co-clustering. Optimal settings are indicated by bold text.
Parameter Settings Description
dimensionReduction denoisePCA (PCA, scran), runUMAP (UMAP, scater) Dimension reduction methods
topDimensions 5 to 80 (11, 19, 33, 22) Number of top dimensions selected for co-clustering
graphBuilding buildKNNGraph (knn), buildSNNGraph (snn) (scran) Methods for building a graph where nodes are cells and edges are connections between nearest neighbors
clusterDetection cluster_walktrap (wt), cluster_fast_greedy (fg), cluster_leading_eigen (le) (igraph) Methods for partitioning the graph to generate clusters

4.3.1 Pre-processing

To obtain reproducible results, a fixed seed is set for generating random numbers.

set.seed(10)

The bulk data (blk.mus) is the same with the tissue-to-cell mapping in the Annotation-based method. The following imports example single cell from the spatialHeatmap package and shows its partial metadata in colData slot.

sc.mus.pa <- system.file("extdata/shinyApp/example", "cell_mouse_cocluster.rds", package="spatialHeatmap") 
sc.mus <- readRDS(sc.mus.pa)
colData(sc.mus)[1:3, , drop=FALSE]
## DataFrame with 3 rows and 1 column
##                cell
##         <character>
## isocort     isocort
## isocort     isocort
## isocort     isocort

Bulk and single cell raw count data are jointly normalized by the function norm_cell, which is a wrapper of computeSumFactors from scran (Lun, McCarthy, and Marioni 2016). com=FALSE means bulk and single cells are separated after normalization for subsequent separate filtering.

#mus.lis.nor <- read_cache(cache.pa, 'mus.lis.nor') 
#if (is.null(mus.lis.nor)) {
  mus.lis.nor <- norm_cell(sce=sc.mus, bulk=blk.mus, com=FALSE)
#  save_cache(dir=cache.pa, overwrite=TRUE, mus.lis.nor)
#}

The normalized single cell and bulk data (log2-scale) are filtered to reduce sparsity and low expression values. In the bulk data, replicates are first aggregated by taking means using the function aggr_rep. Then the filtering retains genes in the bulk data to have expression values of \(\ge\) 1 at a proportion of \(\ge 10\%\) (pOA) across bulk samples and a coefficient of variance (CV) between \(0.1-50\) (Gentleman et al. 2018).

In the single cell data the filtering is set to retain genes with expression values of \(\ge 1\) (cutoff=1) in \(\ge 1\%\) (p.in.gen=0.01) of cells. In addition, only those cells are retained that have expression values \(\ge 1\) (cutoff=1) in \(\ge 10\%\) (p.in.cell=0.1) of their genes.

# Aggregate bulk replicates
blk.mus.aggr <- aggr_rep(data=mus.lis.nor$bulk, assay.na='logcounts', sam.factor='sample', aggr='mean')
# Filter bulk
blk.mus.fil <- filter_data(data=blk.mus.aggr, pOA=c(0.1, 1), CV=c(0.1, 50), verbose=FALSE) 
# Filter cell and subset bulk to genes in cell
blk.sc.mus.fil <- filter_cell(sce=mus.lis.nor$cell, bulk=blk.mus.fil, cutoff=1, p.in.cell=0.1, p.in.gen=0.01, verbose=FALSE) 

Compared to bulk RNA-Seq data, single cell data has a much higher level of sparsity. This difference is reduced by the above filtering and then subsetting the bulk data to the genes remaining in the filtered single cell data. This entire process is accomplished by the filter_cell function.

The same aSVG instance of mouse brain as in the Quick Start section above is used here and the aSVG importing is omitted for brevity.

tail(attribute(svg.mus.brain)[[1]])[1:3, 1:4] # Partial features are shown.
## # A tibble: 3 × 4
##   feature                      id             fill  stroke
##   <chr>                        <chr>          <chr>  <dbl>
## 1 brainstem                    UBERON_0002298 none    0.05
## 2 midbrain                     UBERON_0001891 none    0.05
## 3 dorsal.plus.ventral.thalamus UBERON_0001897 none    0.05

4.3.2 Co-clustering

The co-clustering process is performed by calling the function cocluster. In the following, default settings obtained from optimization are shown, where min.dim, dimred, graph.meth, and cluster refers to topDimensions, dimensionReduction, graphBuilding, and clusterDetection in Table 1 respectively.

#coclus.mus <- read_cache(cache.pa, 'coclus.mus')
#if (is.null(coclus.mus)) {
  coclus.mus <- cocluster(bulk=blk.sc.mus.fil$bulk, cell=blk.sc.mus.fil$cell, min.dim=11, dimred='PCA', graph.meth='knn', cluster='fg')
#  save_cache(dir=cache.pa, overwrite=TRUE, coclus.mus)
#}

The co-clustering results are stored in the colData slot of coclus.mus. The cluster indicates cluster labels, the bulkCell indicates bulk tissues or single cells, the sample suggests original labels of bulk and cells, the assignedBulk refers to bulk tissues assigned to cells with none suggesting un-assigned, and the similarity refers to Spearman’s correlation coefficients for assignments between bulk and cells, which is a measure of assignment strigency.

colData(coclus.mus)
## DataFrame with 4458 rows and 7 columns
##                     cluster    bulkCell          sample    assignedBulk
##                 <character> <character>     <character>     <character>
## cerebral.cortex       clus8        bulk cerebral.cortex            none
## hippocampus           clus5        bulk     hippocampus            none
## hypothalamus          clus7        bulk    hypothalamus            none
## cerebellum            clus7        bulk      cerebellum            none
## isocort               clus8        cell         isocort cerebral.cortex
## ...                     ...         ...             ...             ...
## retrohipp             clus8        cell       retrohipp cerebral.cortex
## retrohipp             clus5        cell       retrohipp     hippocampus
## retrohipp             clus5        cell       retrohipp     hippocampus
## retrohipp             clus5        cell       retrohipp     hippocampus
## retrohipp             clus5        cell       retrohipp     hippocampus
##                  similarity     index sizeFactor
##                 <character> <integer>  <numeric>
## cerebral.cortex        none         1  45.811788
## hippocampus            none         2 100.736625
## hypothalamus           none         3  28.249392
## cerebellum             none         4  44.415396
## isocort               0.545         5   0.711862
## ...                     ...       ...        ...
## retrohipp             0.373      4454   1.006589
## retrohipp              -0.1      4455   0.674636
## retrohipp             0.136      4456   0.646310
## retrohipp             0.127      4457   0.564467
## retrohipp             0.327      4458   0.636357

The strigency of assignments between bulk and cells can be controled by filtering the similarity. This utility is impletmented in function filter_asg, where min.sim is the similarity cutoff. Utilities are also developed to manually tailor the assignments and are explained in the Supplementary Section.

coclus.mus <- filter_asg(coclus.mus, min.sim=0.1)

The co-clusters including bulk and cells can be visualized with the function plot_dim. The dim argument specifies embedding methods and cocluster.only=TRUE sets cells without bulk assignments in gray. In the embedding plot, bulk tissues and cells are indicated by large and small circles respectively.

plot_dim(coclus.mus, dim='PCA', color.by='cluster', cocluster.only=TRUE)
Embedding plot of co-clusters. Large and small circles refer to bulk tissues and single cells respectively.

Figure 9: Embedding plot of co-clusters
Large and small circles refer to bulk tissues and single cells respectively.

4.3.3 Co-visualization

Bulk assignments are treated as group labels for cells in co-visualization. Similar with annotation-based and manual methods, co-visualization options include cell-to-bulk mapping, bulk-to-cell mapping, inclusion or exclusion of expression values.

In cell-to-bulk mapping, single cell data are separated from bulk and expression values are summarized by means within each cell group, i.e. bulk assignement.

# Separate single cell data.
coclus.sc <- subset(coclus.mus, , bulkCell=='cell')
# Summarize expression values in each cell group.
sc.aggr.coclus <- aggr_rep(data=coclus.sc, assay.na='logcounts', sam.factor='assignedBulk', aggr='mean')
colData(sc.aggr.coclus)
## DataFrame with 5 rows and 7 columns
##                     cluster    bulkCell      sample    assignedBulk  similarity
##                 <character> <character> <character>     <character> <character>
## cerebral.cortex       clus8        cell     isocort cerebral.cortex       0.545
## none                  clus2        cell        olfa            none        none
## cerebellum            clus7        cell    striatum      cerebellum         0.3
## hypothalamus          clus7        cell    pallidum    hypothalamus       0.418
## hippocampus           clus5        cell        olfa     hippocampus       0.609
##                     index sizeFactor
##                 <integer>  <numeric>
## cerebral.cortex         5   0.711862
## none                    9   0.687270
## cerebellum            301   0.690039
## hypothalamus          403   0.715932
## hippocampus           500   1.055465

The co-visualization through cell-to-bulk mapping is built on gene Adcy1. Cells in the embedding plot and respective assigned bulk tissues in spatial heatmap are revealed by same colors.

covis(svg=svg.mus.brain, data=sc.aggr.coclus, ID=c('Adcy1'), sce.dimred=coclus.sc, dimred='UMAP', tar.cell=c('hippocampus', 'hypothalamus', 'cerebellum', 'cerebral.cortex'), dim.lgd.text.size=10, dim.lgd.nrow=2, bar.width=0.1, legend.nrow=4)
Co-visualization through cell-to-bulk mapping in automated method. This plot is created on gene Adcy1. Colors between the embedding plot and spatial heatmap indicate matching of cells with bulk tissues.

Figure 10: Co-visualization through cell-to-bulk mapping in automated method
This plot is created on gene Adcy1. Colors between the embedding plot and spatial heatmap indicate matching of cells with bulk tissues.

In bulk-to-cell mapping, bulk data are separated from cell, where replicates are already aggregated in preprocessing.

coclus.blk <- subset(coclus.mus, , bulkCell=='bulk')

Same with conventions in the main vignette, at least one bulk tissue identifier should match with aSVG features so as to successfully plot spatial heatmaps.

colnames(coclus.blk) %in% attribute(svg.mus.brain)[[1]]$feature
## [1] TRUE TRUE TRUE TRUE

The co-visualization through bulk-to-cell mapping is built on gene Adcy1. Cells in the embedding plot and respective assigned bulk tissues in spatial heatmap are revealed by same colors.

covis(svg=svg.mus.brain, data=coclus.blk, ID=c('Adcy1'), sce.dimred=coclus.sc, dimred='UMAP', tar.bulk=colnames(coclus.blk), assay.na='logcounts', legend.nrow=4, dim.lgd.text.size=10, dim.lgd.nrow=2, bar.width=0.08)
Co-visualization through bulk-to-cell mapping in automated method. This plot is created on gene Adcy1. Colors between the embedding plot and spatial heatmap indicate matching of cells with bulk tissues.

Figure 11: Co-visualization through bulk-to-cell mapping in automated method
This plot is created on gene Adcy1. Colors between the embedding plot and spatial heatmap indicate matching of cells with bulk tissues.

By setting profile=FALSE, the co-visualization is created without expression values, where colors in the embedding plot and spatial heatmap only indicate matching between cells and bulk tissues.

covis(svg=svg.mus.brain, data=coclus.blk, ID=c('Adcy1'), sce.dimred=coclus.sc, dimred='UMAP', profile=FALSE, assay.na='logcounts', legend.nrow=4, dim.lgd.text.size=10, dim.lgd.nrow=2, bar.width=0.1)
Co-visualization without expression values in automated method. Colors between the embedding plot and spatial heatmap only indicate matching of cells with bulk tissues.

Figure 12: Co-visualization without expression values in automated method
Colors between the embedding plot and spatial heatmap only indicate matching of cells with bulk tissues.

5 Shiny App

The co-visualization feature is included in the integrated Shiny app that is an GUI implementation of spatialHeatmap, including annotation-based, manual, and automated methods. To start this app, simply call shiny_shm() in R. Below is a screenshot of the co-visulization output.

Screenshot of the co-visualization output in Shiny app. The co-visualization plot is generated by the automatic method.

Figure 13: Screenshot of the co-visualization output in Shiny app
The co-visualization plot is generated by the automatic method.

When using the Shiny app, if bulk and single cell raw count data are provided, column-wise combine them in a SCE object, and format the metadata in the colData slot according to the following rules:

  1. In the bulkCell column, use bulk and cell to indicate bulk and cell samples respectively. If no bulk is included in this column, all samples are considered cells.

  2. If multiple cell group labels (annotation-based, or manually created) are provided, include them in columns of label, label1, label2, and so on respectively. In each of these label columns include corresponding aSVG features as bulk tissue labels.

After formatting the metadata, save the SCE object as an .rds file using saveRDS, then upload the .rds and aSVG file to the app. An example of bulk and single cell data for use in the Shiny app are included in spatialHeatmap and shown below.

shiny.dat.pa <- system.file("extdata/shinyApp/example", "shiny_covis_bulk_cell_mouse_brain.rds", package="spatialHeatmap")
shiny.dat <- readRDS(shiny.dat.pa)
colData(shiny.dat)
## DataFrame with 1061 rows and 4 columns
##                           label          label1    bulkCell      expVar
##                     <character>     <character> <character> <character>
## cerebral.cortex cerebral.cortex cerebral.cortex        bulk     control
## hippocampus         hippocampus     hippocampus        bulk     control
## hypothalamus       hypothalamus    hypothalamus        bulk     control
## cerebellum           cerebellum      cerebellum        bulk     control
## cerebral.cortex cerebral.cortex cerebral.cortex        bulk     control
## ...                         ...             ...         ...         ...
## retrohipp             retrohipp           clus4        cell     control
## retrohipp             retrohipp           clus4        cell     control
## cere                       cere           clus2        cell     control
## cere                       cere           clus2        cell     control
## midbrain               midbrain           clus2        cell     control

6 Supplementary Section

6.1 Tailoring Co-clustering Results

6.1.1 Assigning Desired Bulk in R

The bulk-cell assignments in the automated method can be manually tailored, which is optional. The tailoring can be performed in command line or on a Shiny app. This section illustrates the command-line based tailoring. First visualize single cells in an embedding plot as shown below. In order to define more accurate coordinates in the next step, tune the x-y axis breaks (x.break, y.break) and set panel.grid=TRUE.

plot_dim(coclus.mus, dim='UMAP', color.by='sample', x.break=seq(-10, 10, 1), y.break=seq(-10, 10, 1), panel.grid=TRUE)
UMAP embedding plot of mouse brain bulk tissues and single cells. Bulk tissues and single cells are indicated by large and small circles respectively.

Figure 14: UMAP embedding plot of mouse brain bulk tissues and single cells
Bulk tissues and single cells are indicated by large and small circles respectively.

Define desired bulk tissues (desiredBulk) for cells selected by x-y coordinate ranges (x.min, x.max, y.min, y.max) in the embedding plot in form of a data.frame (df.desired.bulk). The dimred reveals where the coordinates come from and is required. In this example, cell populations near the bulk hippocampus are selected and hippocampus is the desired bulk tissue.

df.desired.bulk <- data.frame(x.min=c(6), x.max=c(8), y.min=c(-1), y.max=c(0.5), desiredBulk=c('hippocampus'), dimred='UMAP')
df.desired.bulk
##   x.min x.max y.min y.max desiredBulk dimred
## 1     6     8    -1   0.5 hippocampus   UMAP

Incorporate desired bulk assignments to co-clustering results by calling refine_asg. The similarities corresponding to desired bulk are internally set at the maximum of 1. After inclusion of desired bulk, separate single cells from bulk for co-visualization.

# Incorporate desired bulk
coclus.mus.tailor <- refine_asg(sce.all=coclus.mus, df.desired.bulk=df.desired.bulk)
# Separate cells from bulk
coclus.sc.tailor <- subset(coclus.mus.tailor, , bulkCell=='cell')

After tailoring, the co-visualization through bulk-to-cell mapping is built on gene Adcy1. To reveal the tailoring result, only hippocampus is shown through the argument tar.bulk in the plot, where cells defined in df.desired.bulk have the same color as the desired bulk hippocampus. hippocampus cells in the embedding plot include tailored cells in df.desired.bulk and those labeled hippocampus in coclustering.

covis(svg=svg.mus.brain, data=coclus.blk, ID=c('Adcy1'), sce.dimred=coclus.sc.tailor, dimred='UMAP', tar.bulk=c('hippocampus'), assay.na='logcounts', legend.nrow=4, dim.lgd.text.size=10, dim.lgd.nrow=2, bar.width=0.08)
Co-visualization through bulk-to-cell mapping after tailoring. This plot is created on gene Adcy1. Only hippocampus cells and tissue are shown to display the tailoring result.

Figure 15: Co-visualization through bulk-to-cell mapping after tailoring
This plot is created on gene Adcy1. Only hippocampus cells and tissue are shown to display the tailoring result.

6.1.2 Assigning Desired Bulk on Shiny App

This section describes tailoring co-clustering results on the convenience Shiny app, which is lauched by calling desired_bulk_shiny.

Figure 16 is the screenshot of the Shiny app. The file to upload is the co-clustering result returned by cocluster, here coclus.mus. It should be saved in an .rds file by using saveRDS before uploaded to the app. On the left embedding plot, cells are selected with the “Lasso Select” tool. On the right, selected cells and their coordinates are listed in a table, and the desired bulk tissues (aSVG features) can be selected from the dropdown list, here hippocampus. To download the table just click the “Download” button. The “Help” button gives more instructions.

Screenshot of the Shiny app for selecting desired bulk tissues. On the left is the embedding plot of single cells, where target cells are selected with the "Lasso Select" tool. On the right, desired bulk tissues are assigned for selected cell.

Figure 16: Screenshot of the Shiny app for selecting desired bulk tissues
On the left is the embedding plot of single cells, where target cells are selected with the “Lasso Select” tool. On the right, desired bulk tissues are assigned for selected cell.

An example of desired bulk downloaded from the convenience Shiny app is shown below. The x-y coordinates refer to single cells in embbeding plots (dimred). The df.desired.bulk is ready to use in the tailoring section.

desired.blk.pa <- system.file("extdata/shinyApp/example", "selected_cells_with_desired_bulk.txt", package="spatialHeatmap")
df.desired.bulk <- read.table(desired.blk.pa, header=TRUE, row.names=1, sep='\t')
df.desired.bulk[1:3, ]
##           x          y  key desiredBulk dimred
## 1 -7.564745 0.07702301 1039 hippocampus   UMAP
## 2 -7.541277 0.07503040 1040 hippocampus   UMAP
## 3 -7.554182 0.05069406 1046 hippocampus   UMAP


7 Version Informaion

sessionInfo()
## R version 4.2.1 (2022-06-23)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.5 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.16-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.16-bioc/R/lib/libRlapack.so
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_GB              LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] kableExtra_1.3.4            BiocParallel_1.32.0        
##  [3] igraph_1.3.5                scater_1.26.0              
##  [5] ggplot2_3.3.6               scran_1.26.0               
##  [7] scuttle_1.8.0               SingleCellExperiment_1.20.0
##  [9] SummarizedExperiment_1.28.0 Biobase_2.58.0             
## [11] GenomicRanges_1.50.0        GenomeInfoDb_1.34.0        
## [13] IRanges_2.32.0              S4Vectors_0.36.0           
## [15] BiocGenerics_0.44.0         MatrixGenerics_1.10.0      
## [17] matrixStats_0.62.0          spatialHeatmap_2.4.0       
## [19] knitr_1.40                  BiocStyle_2.26.0           
## 
## loaded via a namespace (and not attached):
##   [1] utf8_1.2.2                shinydashboard_0.7.2     
##   [3] tidyselect_1.2.0          RSQLite_2.2.18           
##   [5] AnnotationDbi_1.60.0      htmlwidgets_1.5.4        
##   [7] grid_4.2.1                Rtsne_0.16               
##   [9] munsell_0.5.0             ScaledMatrix_1.6.0       
##  [11] codetools_0.2-18          preprocessCore_1.60.0    
##  [13] interp_1.1-3              statmod_1.4.37           
##  [15] withr_2.5.0               colorspace_2.0-3         
##  [17] filelock_1.0.2            highr_0.9                
##  [19] rstudioapi_0.14           labeling_0.4.2           
##  [21] GenomeInfoDbData_1.2.9    farver_2.1.1             
##  [23] bit64_4.0.5               vctrs_0.5.0              
##  [25] generics_0.1.3            xfun_0.34                
##  [27] BiocFileCache_2.6.0       fastcluster_1.2.3        
##  [29] R6_2.5.1                  doParallel_1.0.17        
##  [31] ggbeeswarm_0.6.0          rsvd_1.0.5               
##  [33] locfit_1.5-9.6            rsvg_2.3.2               
##  [35] bitops_1.0-7              cachem_1.0.6             
##  [37] gridGraphics_0.5-1        DelayedArray_0.24.0      
##  [39] assertthat_0.2.1          promises_1.2.0.1         
##  [41] scales_1.2.1              nnet_7.3-18              
##  [43] beeswarm_0.4.0            gtable_0.3.1             
##  [45] beachmat_2.14.0           WGCNA_1.71               
##  [47] rlang_1.0.6               genefilter_1.80.0        
##  [49] systemfonts_1.0.4         splines_4.2.1            
##  [51] lazyeval_0.2.2            impute_1.72.0            
##  [53] checkmate_2.1.0           BiocManager_1.30.19      
##  [55] yaml_2.3.6                reshape2_1.4.4           
##  [57] backports_1.4.1           httpuv_1.6.6             
##  [59] Hmisc_4.7-1               tools_4.2.1              
##  [61] bookdown_0.29             ggplotify_0.1.0          
##  [63] ellipsis_0.3.2            gplots_3.1.3             
##  [65] jquerylib_0.1.4           RColorBrewer_1.1-3       
##  [67] ggdendro_0.1.23           dynamicTreeCut_1.63-1    
##  [69] Rcpp_1.0.9                plyr_1.8.7               
##  [71] base64enc_0.1-3           sparseMatrixStats_1.10.0 
##  [73] visNetwork_2.1.2          zlibbioc_1.44.0          
##  [75] purrr_0.3.5               RCurl_1.98-1.9           
##  [77] rpart_4.1.19              deldir_1.0-6             
##  [79] viridis_0.6.2             ggrepel_0.9.1            
##  [81] cluster_2.1.4             magrittr_2.0.3           
##  [83] magick_2.7.3              data.table_1.14.4        
##  [85] grImport_0.9-5            mime_0.12                
##  [87] evaluate_0.17             xtable_1.8-4             
##  [89] XML_3.99-0.12             jpeg_0.1-9               
##  [91] gridExtra_2.3             compiler_4.2.1           
##  [93] tibble_3.1.8              KernSmooth_2.23-20       
##  [95] crayon_1.5.2              htmltools_0.5.3          
##  [97] later_1.3.0               Formula_1.2-4            
##  [99] geneplotter_1.76.0        tidyr_1.2.1              
## [101] DBI_1.1.3                 dbplyr_2.2.1             
## [103] MASS_7.3-58.1             rappdirs_0.3.3           
## [105] Matrix_1.5-1              cli_3.4.1                
## [107] parallel_4.2.1            metapod_1.6.0            
## [109] pkgconfig_2.0.3           flashClust_1.01-2        
## [111] foreign_0.8-83            plotly_4.10.0            
## [113] xml2_1.3.3                foreach_1.5.2            
## [115] svglite_2.1.0             annotate_1.76.0          
## [117] vipor_0.4.5               bslib_0.4.0              
## [119] dqrng_0.3.0               webshot_0.5.4            
## [121] XVector_0.38.0            rvest_1.0.3              
## [123] yulab.utils_0.0.5         stringr_1.4.1            
## [125] digest_0.6.30             RcppAnnoy_0.0.20         
## [127] Biostrings_2.66.0         rmarkdown_2.17           
## [129] htmlTable_2.4.1           uwot_0.1.14              
## [131] edgeR_3.40.0              DelayedMatrixStats_1.20.0
## [133] curl_4.3.3                shiny_1.7.3              
## [135] gtools_3.9.3              lifecycle_1.0.3          
## [137] jsonlite_1.8.3            BiocNeighbors_1.16.0     
## [139] viridisLite_0.4.1         limma_3.54.0             
## [141] fansi_1.0.3               pillar_1.8.1             
## [143] lattice_0.20-45           KEGGREST_1.38.0          
## [145] fastmap_1.1.0             httr_1.4.4               
## [147] survival_3.4-0            GO.db_3.16.0             
## [149] glue_1.6.2                FNN_1.1.3.1              
## [151] UpSetR_1.4.0              png_0.1-7                
## [153] iterators_1.0.14          bluster_1.8.0            
## [155] bit_4.0.4                 stringi_1.7.8            
## [157] sass_0.4.2                blob_1.2.3               
## [159] DESeq2_1.38.0             BiocSingular_1.14.0      
## [161] latticeExtra_0.6-30       caTools_1.18.2           
## [163] memoise_2.0.1             dplyr_1.0.10             
## [165] irlba_2.3.5.1

8 Funding

This project has been funded by NSF awards: PGRP-1546879, PGRP-1810468, PGRP-1936492.

9 References

Amezquita, Robert, Aaron Lun, Etienne Becht, Vince Carey, Lindsay Carpp, Ludwig Geistlinger, Federico Marini, et al. 2020. “Orchestrating Single-Cell Analysis with Bioconductor.” Nature Methods 17: 137–45. https://www.nature.com/articles/s41592-019-0654-x.

Chang, Winston, Joe Cheng, JJ Allaire, Cars on Sievert, Barret Schloerke, Yihui Xie, Jeff Allen, Jonathan McPherson, Alan Dipert, and Barbara Borges. 2021. Shiny: Web Application Framework for R. https://CRAN.R-project.org/package=shiny.

Gentleman, R, V Carey, W Huber, and F Hahne. 2018. “Genefilter: Methods for Filtering Genes from High-Throughput Experiments.” http://bioconductor.uib.no/2.7/bioc/html/genefilter.html.

Love, Michael I., Wolfgang Huber, and Simon Anders. 2014. “Moderated Estimation of Fold Change and Dispersion for Rna-Seq Data with Deseq2.” Genome Biology 15 (12): 550. https://doi.org/10.1186/s13059-014-0550-8.

Lun, Aaron T. L., Davis J. McCarthy, and John C. Marioni. 2016. “A Step-by-Step Workflow for Low-Level Analysis of Single-Cell Rna-Seq Data with Bioconductor.” F1000Res. 5: 2122. https://doi.org/10.12688/f1000research.9501.2.

Marques, Sueli, Amit Zeisel, Simone Codeluppi, David van Bruggen, Ana Mendanha Falcão, Lin Xiao, Huiliang Li, et al. 2016. “Oligodendrocyte Heterogeneity in the Mouse Juvenile and Adult Central Nervous System.” Science 352 (6291): 1326–9.

Ortiz, Cantin, Jose Fernandez Navarro, Aleksandra Jurek, Antje Märtin, Joakim Lundeberg, and Konstantinos Meletis. 2020. “Molecular Atlas of the Adult Mouse Brain.” Science Advances 6 (26): eabb3446.

Risso, Davide, and Michael Cole. 2022. ScRNAseq: Collection of Public Single-Cell Rna-Seq Datasets.

Vacher, Claire-Marie, Helene Lacaille, Jiaqi J O’Reilly, Jacquelyn Salzbank, Dana Bakalar, Sonia Sebaoui, Philippe Liere, et al. 2021. “Placental Endocrine Function Shapes Cerebellar Development and Social Behavior.” Nat. Neurosci. 24 (10): 1392–1401.

Zhang, Jianhai, Jordan Hayes, Le Zhang, Bing Yang, Wolf Frommer, Julia Bailey-Serres, and Thomas Girke. 2022. SpatialHeatmap: SpatialHeatmap. https://github.com/jianhaizhang/spatialHeatmap.