spatialHeatmap 1.2.0
The spatialHeatmap package provides functionalities for visualizing cell-,
tissue- and organ-specific data of biological assays by coloring the
corresponding spatial features defined in anatomical images according to a
numeric color key. The color scheme used to represent the assay values can be
customized by the user. This core functionality of the package is called a
spatial heatmap (SHM) plot. It is enhanced with visualization tools for groups
of measured items (e.g. gene modules) sharing related abundance profiles, including
matrix heatmaps combined with hierarchical clustering dendrograms and network representations.
The functionalities of spatialHeatmap can be used either in a command-driven mode
from within R or through a graphical user interface (GUI) provided by a Shiny App that
is also part of this package. While the R-based mode provides flexibility to
customize and automate analysis routines, the Shiny App includes a variety of
convenience features that will appeal to experimentalists and other users less
familiar with R. Moreover, the Shiny App can be used both on local computers and
in centralized server-based deployments (e.g. cloud-based or custom
servers) that can be accessed remotely as a public web service for using
spatialHeatmap’s functionalities with community and/or private data. The
functionalities of the spatialHeatmap package are illustrated in Figure 1.
Figure 1: Overview of spatialHeatmap
(A) The spatialHeatmap package plots numeric assay data onto spatially annotated images. A wide range of omics technologies is supported, including genomic, transcriptomic, proteomic and metabolomic profiling data. The assay data can be provided as numeric vectors, tabular data, or SummarizedExperiment objects. The latter is a widely used data container for organizing both assay data and associated annotation and experimental design data. (B) Anatomical and other spatial images need to be provided as annotated SVG (aSVG) files where the spatial features and the corresponding data components of the assay data have matching labels (e.g. tissue labels). (C) The assay data are used to color the matching spatial features in one or more aSVG images according to a color key. The result is called a spatial heatmap (SHM) or spatiotemporal heatmap (STHM) plot. Multiple measurements can be visualized in the same plot, such as several factors (e.g. genes, proteins, metabolites), treatment conditions, growth stages and more. (D) Data mining graphics, such as matrix heatmaps and network graphs, are integrated to facilitate the identification of factors with similar assay profiles. The functionalities of spatialHeatmap can be accessed from local computers via the R console or a graphical user interface based on Shiny. In addition, the latter can be deployed as a web service on custom servers or cloud-based systems.
As anatomical images, the package supports both tissue maps from public repositories and custom images provided by the user. In general, any type of image can be used as long as it can be provided in SVG (Scalable Vector Graphics) format, where the corresponding spatial features have been defined (see aSVG below). The numeric values plotted onto an SHM are usually quantitative measurements from a wide range of profiling technologies, such as microarrays, next generation sequencing (e.g. RNA-Seq and scRNA-Seq), proteomics, metabolomics, or many other small- or large-scale experiments. For convenience, several preprocessing and normalization methods for the most common use cases are included that support raw and/or preprocessed data. Currently, the main application domains of the spatialHeatmap package are numeric data sets and spatially mapped images from biological, agricultural and biomedical areas. Moreover, the package has been designed to also work with many other spatial data types, such as population data plotted onto geographic maps. This high level of flexibility is one of the unique features of spatialHeatmap. Related software tools for biological applications in this field are largely based on pure web applications (Lekschas et al. 2015; Papatheodorou et al. 2018; Winter et al. 2007; Waese et al. 2017) or local tools (Maag 2018; Muschelli, Sweeney, and Crainiceanu 2014) that typically lack customization functionalities. These restrictions limit users to pre-existing expression data and/or fixed sets of anatomical image collections. To close this gap for biological use cases, we have developed spatialHeatmap as a generic R/Bioconductor package for plotting quantitative values onto any type of spatially mapped image in a programmable environment and/or in an intuitive-to-use GUI application.
The core feature of spatialHeatmap is to map assay values (e.g.
gene expression data) of one or many items (e.g. genes) measured under
different conditions in the form of numerically graded colors onto the
corresponding cell types or tissues represented in a chosen SVG image. In the
gene profiling field, this feature supports comparisons of the expression
values among multiple genes by plotting their SHMs next to each
other. Similarly, one can display the expression values of one or more
genes across multiple conditions in the same plot (Figure 4). This level of flexibility is
effective for visualizing complicated expression patterns across genes,
cell types and conditions. In the case of more complex anatomical images with
multiple overlapping tissue layers, it is important to visually expose the
tissue layer of interest in the plots. To address this, several default and
customizable layer viewing options are provided. They allow features in
the top layers to be hidden by making them transparent in order to expose features below
them. This transparency viewing feature is highlighted below in the mouse
example (Figure 5). In addition to spatial data, this package also works with spatiotemporal data and generates spatiotemporal heatmaps (STHMs, Figure 9). Moreover, one can plot multiple distinct
aSVGs in a single SHM plot as shown in Figure 11. This is
particularly useful for displaying abundance trends across multiple developmental
stages, where each is represented by its own aSVG image. In addition to
static representations, SHMs can be visualized as dynamic animations
embedded in interactive HTML files or exported as videos.
To maximize reusability and extensibility, the package organizes large-scale
omics assay data along with the associated experimental design information in a
SummarizedExperiment object (Figure 1A). The latter is one of the core S4 classes within
the Bioconductor ecosystem that has been widely adopted by many other software
packages dealing with gene-, protein- and metabolite-level profiling data
(Morgan et al. 2018). In the case of gene expression data, the assays slot of
the SummarizedExperiment container is populated with a gene expression
matrix, where the rows and columns represent the genes and tissues/conditions,
respectively, while the colData slot contains sample data including replicate
information. The tissue and/or cell type information in the object maps via
colData to the corresponding features in the SVG images using unique
identifiers for the spatial features (e.g. tissues or cell types). This
allows the features of interest in an SVG image to be colored according to the
numeric data stored in a SummarizedExperiment object. For simplicity, the
numeric data can also be provided as numeric vectors or data.frames. This
can be useful for testing purposes and/or for simple data sets that
may not require the more advanced features of the SummarizedExperiment class,
such as measurements with only one or a few data points. The details about how to
access the SVG images and properly format the associated expression data are
provided in the Supplementary Section of this vignette.
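The container layout described above can be illustrated with a minimal sketch. It builds a small SummarizedExperiment by hand, with genes in rows and tissue/condition samples in columns. The gene names, sample labels and the colData column names (tissue, condition) are made up for illustration and are not necessarily the names that spatialHeatmap expects.

```r
library(SummarizedExperiment)

# Toy expression matrix: rows are genes, columns are tissue/condition samples.
# All names and values are hypothetical.
expr <- matrix(
  c(12, 30,  5,  8,
    50, 45, 60, 55,
     3,  9, 20, 25),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    paste0("gene", 1:3),
    c("brain_ctrl", "brain_trt", "liver_ctrl", "liver_trt")
  )
)

# Sample annotation: one row per column of the assay matrix.
# The column names 'tissue' and 'condition' are illustrative assumptions.
sample_data <- DataFrame(
  tissue    = c("brain", "brain", "liver", "liver"),
  condition = c("ctrl", "trt", "ctrl", "trt"),
  row.names = colnames(expr)
)

se <- SummarizedExperiment(assays = list(expr = expr), colData = sample_data)

assay(se)    # the numeric matrix stored in the assays slot
colData(se)  # the sample annotation stored in the colData slot
```

The tissue labels in colData are the values that would need to match the feature identifiers in an aSVG image.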
SHMs are images where colors encode numeric values in features of any shape. For plotting SHMs, Scalable Vector Graphics (SVG) has been chosen as the image format since it is a flexible and widely adopted vector graphics format that provides many advantages for computationally embedding numerical and other information in images. SVG is based on XML-formatted text describing all components present in images, including lines, shapes and colors. In the case of biological images suitable for SHMs, the shapes often represent anatomical or cell structures. To assign colors to specific features in SHMs, annotated SVG (aSVG) files are used where the shapes of interest are labeled according to certain conventions so that they can be addressed and colored programmatically. SVGs and aSVGs of anatomical structures can be downloaded from many sources including the repositories described below. Alternatively, users can generate them themselves with vector graphics software such as Inkscape. Typically, in aSVGs one or more shapes of a feature of interest, such as the cell shapes of an organ, are grouped together by a common feature identifier. Via these group identifiers, one or many feature types can be colored simultaneously in an aSVG according to biological experiments assaying the corresponding feature types with the required spatial resolution. Correct assignment of image features and assay results is assured by using the same feature identifiers for both. The color gradient used to visually represent the numeric assay values is controlled by a color gradient parameter. To visually interpret the meaning of the colors, the corresponding color key is included in the SHM plots. Additional details for properly formatting and annotating both aSVG images and assay data are provided in the Supplementary Section of this vignette.
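The idea of coloring labeled shapes by identifier can be sketched in a few lines of base R. This is only a conceptual illustration under stated assumptions, not the implementation used by spatialHeatmap: the tiny SVG, the feature identifiers (cortex, cerebellum), the assay values and the yellow-to-red gradient are all made up for the example.

```r
# A minimal hand-written SVG with two labeled features (hypothetical ids).
svg <- c(
  '<svg xmlns="http://www.w3.org/2000/svg" width="120" height="60">',
  '  <g id="cortex"><rect x="5" y="5" width="50" height="50" fill="none"/></g>',
  '  <g id="cerebellum"><circle cx="90" cy="30" r="25" fill="none"/></g>',
  '</svg>'
)

# Assay values named by the same feature identifiers used in the image.
values <- c(cortex = 2.5, cerebellum = 8.0)

# Color key: 100 colors along a yellow-orange-red gradient.
pal <- colorRampPalette(c("yellow", "orange", "red"))(100)

# Scale each value to an index between 1 and 100 in the color key.
rng <- range(values)
idx <- 1 + round((values - rng[1]) / diff(rng) * 99)
cols <- setNames(pal[idx], names(values))

# Set the fill color of each shape whose group id matches a value name.
for (id in names(cols)) {
  hit <- grepl(sprintf('id="%s"', id), svg, fixed = TRUE)
  svg[hit] <- sub('fill="[^"]*"', sprintf('fill="%s"', cols[[id]]), svg[hit])
}

# Write the colored image to a temporary file for viewing in a browser.
out <- tempfile(fileext = ".svg")
writeLines(svg, out)
```

Because the assay values and the image shapes share the same identifiers, no additional lookup table is required. A value without a matching identifier simply leaves the image unchanged.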
SHMs can be generated with user-provided data or with data downloaded from
various public repositories. The latter include gene, protein and metabolic
profiling data from databases such as GEO, BAR and Expression
Atlas from EMBL-EBI (Papatheodorou et al. 2018). A
particularly useful resource when working with spatialHeatmap is the EBI
Expression Atlas. This online service contains both assay data and anatomical
images. Its assay data include mRNA and protein profiling experiments for
different species, tissues and conditions. The corresponding anatomical image
collections are also provided for a wide range of species including animals and
plants. In spatialHeatmap, several import functions are provided to work with
the expression and aSVG repository from the Expression Atlas
directly. The aSVG images developed by the spatialHeatmap project are
available in its own repository called spatialHeatmap aSVG Repository,
where users can contribute their aSVG images that are formatted according to
our guidelines.
The following sections of this vignette showcase the most important
functionalities of the spatialHeatmap package using as an initial example a
simple-to-understand toy data set, and then more complex mRNA profiling data from the
Expression Atlas and GEO databases. First, SHM plots are generated for both the toy
and mRNA expression data. The latter include gene expression data sets from
RNA-Seq and microarray experiments of Human Brain, Mouse
Organs, Chicken Organs, and Arabidopsis Shoots. The
first three are RNA-Seq data from the Expression
Atlas, while the last one is a microarray data
set from GEO. Second, gene context
analysis tools are introduced, which facilitate the visualization of
gene modules sharing similar expression patterns. This includes the
visualization of hierarchical clustering results with traditional matrix
heatmaps (Matrix Heatmap) as well as co-expression network plots
(Network). Third, an overview of the corresponding Shiny App
is presented that provides access to the same functionalities as the R
functions, but executes them in an interactive GUI environment (Chang et al., n.d.; Chang and Borges Ribeiro 2018). Fourth, more advanced features for plotting customized
SHMs are covered using the Human Brain data set as an example.
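The gene context analysis mentioned above can be sketched with base R alone. The example below is a generic illustration and does not use the package's own functions: it clusters a toy expression matrix hierarchically with a correlation-based distance, cuts the tree into modules, draws a matrix heatmap, and derives a simple co-expression adjacency matrix. The gene names, the values and the correlation cutoff of 0.8 are arbitrary assumptions.

```r
# Toy expression matrix: geneA/geneB rise across samples, geneC/geneD fall.
expr <- rbind(
  geneA = c(1.0, 2.0, 3.0, 4.0, 5.0, 6.0),
  geneB = c(1.1, 2.3, 2.9, 4.2, 4.8, 6.1),
  geneC = c(6.0, 5.0, 4.0, 3.0, 2.0, 1.0),
  geneD = c(5.8, 5.1, 4.2, 2.7, 2.2, 0.9)
)
colnames(expr) <- paste0("sample", 1:6)

# Pearson correlation between gene profiles.
cor_mat <- cor(t(expr))

# Hierarchical clustering with 1 - correlation as the distance.
hc <- hclust(as.dist(1 - cor_mat), method = "average")

# Cut the dendrogram into two modules of co-expressed genes.
modules <- cutree(hc, k = 2)

# Matrix heatmap with the row dendrogram computed above.
heatmap(expr, Rowv = as.dendrogram(hc), Colv = NA, scale = "row")

# Simple co-expression network: connect genes with correlation above 0.8.
adj <- cor_mat > 0.8
diag(adj) <- FALSE  # remove self-connections
```

Here `modules` assigns geneA and geneB to one module and geneC and geneD to the other, and `adj` contains edges only within these pairs.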
The spatialHeatmap package should be installed from an R (version ≥ 3.6)
session with the BiocManager::install command.
# Install BiocManager first if it is not yet available
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
# Install spatialHeatmap from Bioconductor
BiocManager::install("spatialHeatmap")
Next, the packages required for running the sample code in this vignette need to be loaded.
library(spatialHeatmap); library(SummarizedExperiment); library(ExpressionAtlas); library(GEOquery)
The following lists the vignette(s) of this package in an HTML browser. Clicking the corresponding name will open this vignette.
browseVignettes('spatialHeatmap')
SHMs are plotted with the spatial_hm function. To provide a quick
and intuitive overview of how these plots are generated, the following uses a
generalized toy example where a small vector of random numeric values is
generated and used to color features in an aSVG image. The image chosen
for this example is an aSVG depicting the human brain. The corresponding image
file ‘homo_sapiens.brain.svg’ is included in this package fo