1 Introduction

dittoSeq is a tool built to enable analysis and visualization of single-cell and bulk RNA-sequencing data by novice, experienced, and color blind coders. Thus, it provides many useful visualizations, which all utilize red-green color blindness-optimized colors by default, and which allow sufficient customizations, via discrete inputs, for out-of-the-box creation of publication-ready figures.

For single-cell data, dittoSeq works directly with data pre-processed in other popular packages (Seurat, scater, scran, …). For bulk RNAseq data, dittoSeq’s import functions will convert bulk RNAseq data of various different structures into a set structure that dittoSeq helper and visualization functions can work with. So ultimately, dittoSeq includes universal plotting and helper functions for working with (sc)RNAseq data processed and stored in these formats:

Single-Cell:

Seurat (versions 2 & 3)
SingleCellExperiment

Bulk:

SummarizedExperiment (the general Bioconductor Seq-data storage system)
DESeqDataSet (DESeq2 package output)
DGEList (edgeR package output)

For bulk data, or if your data is currently not analyzed, or simply not in one of these structures, you can still pull it in to the SingleCellExperiment structure that dittoSeq works with using the importDittoBulk function.

1.1 Color blindness friendliness:

The default colors of this package are red-green color blindness friendly. To make it so, I used the suggested colors from (Wong 2011) and adapted them slightly by appending darker and lighter versions to create a 24 color vector. All plotting functions use these colors, stored in dittoColors(), by default.

Additionally:

Shapes displayed in the legends are generally enlarged as this can be almost as helpful as the actual color choice for colorblind individuals.
When sensible, dittoSeq funcitons have a shape.by input for having groups displayed through shapes rather than color. (But note: even as a red-green color impaired individual myself writing this vignette, I recommend using color and I generally only use shapes for showing additional groupings.)
dittoDimPlots can be generated with letters overlaid (set do.letter = TRUE)
The Simulate function allows a cone-typical individual to see what their dittoSeq plots might look like to a colorblind individual.

1.2 Disclaimer

Code used here for dataset processing and normalization should not be seen as a suggestion of the proper methods for performing such steps. dittoSeq is a visualization tool, and my focus while developing this vignette has been simply creating values required for providing visualization examples.

2 Installation

dittoSeq is available through Bioconductor.

# Install BiocManager if needed
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

# Install dittoSeq
BiocManager::install("dittoSeq")

2.1 Some setup for this vignette that can be ignored

Here, we will need to do some prep as the dataset we will use from Baron et al. (2016) is not normalized nor dimensionality reduced.

library(dittoSeq)
library(scRNAseq)
library(SingleCellExperiment)
library(Seurat)
# Download data
sce <- BaronPancreasData()
# Trim to only 5 of the celltypes for simplicity of vignette
sce <- sce[,meta("label",sce) %in% c(
    "acinar", "endothelial", "gamma", "delta", "ductal")]

Now that we have a single-cell dataset loaded, we are ready to go. All functions work for either Seurat or SCE encapsulated single-cell data.

But to make full use of dittoSeq, we should reaally have this data log-normalized, and dimensionality reductions and clustering run.

# Make Seurat and grab metadata
seurat <- CreateSeuratObject(counts(sce))
seurat <- AddMetaData(seurat, sce$label, col.name = "celltype")
seurat <- AddMetaData(seurat, sce$donor, col.name = "Sample")
seurat <- AddMetaData(seurat,
                      PercentageFeatureSet(seurat, pattern = "^MT"),
                      col.name = "percent.mt")
# Basic Seurat workflow (possibly outdated, but fine for this vignette)
seurat <- NormalizeData(seurat, verbose = FALSE)
seurat <- FindVariableFeatures(object = seurat, verbose = FALSE)
seurat <- ScaleData(object = seurat, verbose = FALSE)
seurat <- RunPCA(object = seurat, verbose = FALSE)
seurat <- RunTSNE(object = seurat)
seurat <- FindNeighbors(object = seurat, verbose = FALSE)
seurat <- FindClusters(object = seurat, verbose = FALSE)

# Grab PCA, TSNE, clustering, log-norm data, and metadata to sce

# sce <- as.SingleCellExperiment(seurat)
# At the time this vignette was made, the above function gave warnings

# So... manual method
sce <- addDimReduction(
  sce, embeddings = Embeddings(seurat, reduction = "pca"), name = "PCA")
sce <- addDimReduction(
  sce, embeddings = Embeddings(seurat, reduction = "tsne"), name = "TSNE")
sce$idents <- seurat$seurat_clusters
assay(sce, "logcounts") <- GetAssayData(seurat)
sce$percent.mt <- seurat$percent.mt
sce$celltype <- seurat$celltype
sce$Sample <- seurat$Sample

Now that we have a single-cell dataset loaded and analyzed in Seurat, let’s convert it to an SCE for examples purposes.

All functions will work the same for either the Seurat or SCE version.

3 Getting started

3.1 Single-cell RNAseq data

dittoSeq works directly with Seurat and SingleCellExperiment objects. Nothing special is needed. Just load in your data if it isn’t already loaded, then go!

dittoDimPlot(seurat, "Sample")

dittoPlot(seurat, "ENO1", group.by = "celltype")

dittoBarPlot(sce, "celltype", group.by = "Sample")

3.2 Bulk RNAseq data

Bulk RNAseq data is handled by dittoSeq using the SingleCellExperiment structure (as of version 0.99). This structure is essentially very similar to the Bioconductor standard SummarizedExperiment, just with room added for storing calculated dimensionality reductions.

# First, lets make some mock expression and conditions data
exp <- matrix(rpois(20000, 5), ncol=20)
colnames(exp) <- paste0("sample", seq_len(ncol(exp)))
rownames(exp) <- paste0("gene", seq_len(nrow(exp)))
logexp <- logexp <- log2(exp + 1)

conditions <- factor(rep(1:4, 5))
sex <- c(rep("M", 9), rep("F", 11))

3.2.1 Standard bulk data import workflow

Importing bulk data can be accomplished with just the importDittoBulk() function, but metadata and dimensionality reductions can also be added after.

# Import
myRNA <- importDittoBulk(
    # x can be a DGEList, a DESeqDataSet, a SummarizedExperiment,
    #   or a list of data matrices
    x = list(counts = exp,
             logcounts = logexp),
    # Optional inputs:
    #   For adding metadata
    metadata = data.frame(conditions = conditions,
                          sex = sex),
    #   For adding dimensionality reductions
    reductions = list(pca = matrix(rnorm(20000), nrow=20)))

# Add metadata (metadata can alternatively be added in this way)
myRNA$conditions <- conditions
myRNA$sex <- sex

# Add dimensionality reductions (can alternatively be added this way)
# (We aren't actually calculating PCA here.)
myRNA <- addDimReduction(
    object = myRNA,
    embeddings = matrix(rnorm(20000), nrow=20),
    name = "pca",
    key = "PC")

Additional details:

By default, provided metadata is added to (or replaces) any metadata already inside of x, but the combine_metadata input can additionally be used to turn retention of previous metadata slots off.

When providing expression data as a list of a single or multiple matrices, it is recommended that matrices containing raw feature counts data be named counts, and log-normalized counts data be named logcounts.

DGEList note: The import function attempts to pull in all information stored in common DGEList slots ($counts, $samples, $genes, $AveLogCPM, $common.dispersion, $trended.dispersion, $tagwise.dispersion, and $offset), but any other slots are ignored.

# Now making plots operates the exact same way as for single-cell data
dittoDimPlot(myRNA, "sex", size = 3, do.ellipse = TRUE)

dittoBarPlot(myRNA, "sex", group.by = "conditions")

dittoBoxPlot(myRNA, "gene1", group.by = "sex")

dittoHeatmap(myRNA, getGenes(myRNA)[1:10],
    annot.by = c("conditions", "sex"))

4 Helper Functions

dittoSeq’s helper functions make it easy to determine the metadata, gene, and dimensionality reduction options for plotting.

4.1 Metadata

# Retrieve all metadata slot names
getMetas(seurat)

## [1] "orig.ident"      "nCount_RNA"      "nFeature_RNA"    "celltype"       
## [5] "Sample"          "percent.mt"      "RNA_snn_res.0.8" "seurat_clusters"

# Query for the presence of a metadata slot
isMeta("nCount_RNA", seurat)

## [1] TRUE

# Retrieve metadata values:
meta("celltype", seurat)[1:10]

## human1_lib1.final_cell_0001 human1_lib1.final_cell_0002 
##                    "acinar"                    "acinar" 
## human1_lib1.final_cell_0003 human1_lib1.final_cell_0004 
##                    "acinar"                    "acinar" 
## human1_lib1.final_cell_0005 human1_lib1.final_cell_0006 
##                    "acinar"                    "acinar" 
## human1_lib1.final_cell_0008 human1_lib1.final_cell_0009 
##                    "acinar"                    "acinar" 
## human1_lib1.final_cell_0010 human1_lib1.final_cell_0011 
##                    "acinar"                    "acinar"

# Retrieve unique values of a metadata
metaLevels("celltype", seurat)

## [1] "acinar"      "delta"       "ductal"      "endothelial" "gamma"

4.2 Genes/Features

# Retrieve all gene names
getGenes(seurat)[1:10]

##  [1] "A1BG"   "A1CF"   "A2M"    "A2ML1"  "A4GALT" "A4GNT"  "AA06"   "AAAS"  
##  [9] "AACS"   "AACSP1"

# Query for the presence of a gene(s)
isGene("CD3E", seurat)

## [1] TRUE

isGene(c("CD3E","ENO1","INS","non-gene"), seurat, return.values = TRUE)

## [1] "CD3E" "ENO1" "INS"

# Retrieve gene expression values:
gene("ENO1", seurat)[1:10]

## human1_lib1.final_cell_0001 human1_lib1.final_cell_0002 
##                   1.8491277                   1.7317730 
## human1_lib1.final_cell_0003 human1_lib1.final_cell_0004 
##                   1.3761065                   1.1225042 
## human1_lib1.final_cell_0005 human1_lib1.final_cell_0006 
##                   2.5082110                   1.5707490 
## human1_lib1.final_cell_0008 human1_lib1.final_cell_0009 
##                   0.9840699                   2.0208192 
## human1_lib1.final_cell_0010 human1_lib1.final_cell_0011 
##                   1.2962655                   1.5031189

4.3 Reductions

# Retrieve all dimensionality reductions
getReductions(seurat)

## [1] "pca"  "tsne"

These are what can be provided to reduction.use for dittoDimPlot().

4.4 Bulk versus single-cell

Because dittoSeq utilizes the SingleCellExperiment structure to handle bulk RNAseq data, there is a getter and setter for the internal metadata which tells dittoSeq functions which resolution of data a target SCE holds.

# Getter
isBulk(sce)

## [1] FALSE

isBulk(myRNA)

## [1] TRUE

# Setter
mock_bulk <- setBulk(sce) # to bulk
mock_sc <- setBulk(myRNA, set = FALSE) # to single-cell

NOTE: for any non-SCE objects, isBulk() will always return FALSE

5 Visualizations

There are many different types of dittoSeq visualizations. Each has intuitive defaults which allow creation of immediately useable plots. Each also has many additional tweaks avaiolable through discrete input that can help ensure you can create publication-quality plots out-of-the-box.

5.1 dittoDimPlot & dittoScatterPlot

These show cells/samples data overlayed on a scatter plot, with the axes of dittoScatterPlot() being gene expression or metadata data and with the axes of dittoDimPlot() being dimensionality reductions like tsne, pca, umap or similar.

dittoDimPlot(seurat, "celltype")

dittoDimPlot(sce, "ENO1")

dittoScatterPlot(
    object = sce,
    x.var = "ENO1", y.var = "INS",
    color.var = "celltype", shape.by = "Sample",
    size = 3)

dittoScatterPlot(
    object = seurat,
    x.var = "nCount_RNA", y.var = "nFeature_RNA",
    color.var = "percent.mt",
    size = 3)

5.1.1 Many additional dittoDimPlot features

dittoDimPlot has various additional features which can be overlayed on top. Adding each is controlled by an input that starts with add. or do. such as do.label or add.trajectory.lineages. Additional inputs that apply to these features will then start with the XXXX part that comes after add.XXXX or do.XXXX, as exemplified below.

dittoDimPlot(seurat, "ident",
             do.label = TRUE, labels.repel = FALSE)

dittoDimPlot(seurat, "ident",
             add.trajectory.lineages = list(
                 c("3","4","11","8","2","10"),
                 c("3","9","6","12"),
                 c("3","9","7", "1"),
                 c("3","9","7","5")),
             trajectory.cluster.meta = "ident")

5.2 dittoPlot (and dittoRidgePlot + dittoBoxPlot wrappers)

These display continuous cells/samples’ data on a y-axis (or x-axis for ridgeplots) grouped on the x-axis by sample, age, condition, or any discrete grouping metadata. Data can be represented with violin plots, box plots, individual points for each cell/sample, and/or ridge plots. The plots input controls which data representations are used. The group.by input controls how the data are grouped in the x-axis. And the color.by input controls the color that fills in violin, box, and ridge plots.

dittoPlot() is the main function, but dittoRidgePlot() and dittoBoxPlot() are wrappers which essentially just adjust the default for the plots input from c(“jitter”, “vlnplot”) to c(“ridgeplot”) or c(“boxplot”,“jitter”), respectively.

dittoPlot(seurat, "ENO1", group.by = "celltype",
    plots = c("vlnplot", "jitter"))

dittoRidgePlot(sce, "ENO1", group.by = "celltype")

dittoBoxPlot(seurat, "ENO1", group.by = "celltype")

Tweaks to the individual data representation types can be made with discrete inputs, all of which start with the representation types’ name. For example…

dittoPlot(seurat, "ENO1", group.by = "celltype",
    plots = c("vlnplot", "jitter", "boxplot"),
    # change the color and size of jitter points
    jitter.color = "blue", jitter.size = 0.5,
    # change the outline color and width, and remove the fill of boxplots
    boxplot.color = "white", boxplot.width = 0.1,
    boxplot.fill = FALSE,
    # change how the violinplot widths are normalized across groups
    vlnplot.scaling = "count"
    )

5.3 dittoBarPlot

This function displays discrete cells/samples’ data on a y-axis, grouped on the x-axis by sample, age, condition, or any discrete grouping metadata. Data can be represented as percentages or counts, and this is controlled by the scale input.

dittoBarPlot(seurat, "celltype", group.by = "Sample")

dittoBarPlot(seurat, "ident", group.by = "Sample",
    scale = "count")

5.4 dittoHeatmap

This function is essentially a wrapper for generating heatmaps with pheatmap, but with the same automatic, user-friendly, data extraction, (subsetting,) and metadata integrations common to other dittoSeq functions.

For large, many cell, single-cell datasets, it can be necessary to turn off clustering by cells in generating the heatmap because the process is very memory intensive. As an alternative, dittoHeatmap offers the ability to order columns in functional ways using the order.by input. This input will default to the first annotation provided to annot.by for single cell datasets, but can also be controlled separately.

# Pick Genes
genes <- c("SST", "REG1A", "PPY", "INS", "CELA3A", "PRSS2", "CTRB1",
    "CPA1", "CTRB2" , "REG3A", "REG1B", "PRSS1", "GCG", "CPB1",
    "SPINK1", "CELA3B", "CLPS", "OLFM4", "ACTG1", "FTL")

# Annotating and ordering cells by some meaningful feature(s):
dittoHeatmap(seurat, genes,
    annot.by = c("celltype", "Sample"))

dittoHeatmap(seurat, genes,
    annot.by = c("celltype", "Sample"),
    order.by = "Sample")

scaled.to.max = TRUE will normalize all expression data to the max expression of each gene [0,1], which is often useful for zero-enriched single-cell data.

show_colnames/show_rownames control whether cell/gene names will be shown. (show.colnames default is TRUE for bulk, and FALSE for single-cell.)

# Add annotations
dittoHeatmap(seurat, genes,
    annot.by = c("celltype", "Sample"),
    scaled.to.max = TRUE,
    show_colnames = FALSE,
    show_rownames = FALSE)

A subset of the supplied genes can be given to the highlight.genes input to have names shown for just these genes.

# Highlight certain genes
dittoHeatmap(seurat, genes,
    annot.by = c("celltype", "Sample"),
    highlight.genes = genes[1:3])

Additional tweaks can be added through other built in inputs or by providing additional inputs that get passed along to pheatmap (see ?pheatmap).

5.5 Multi-Plotters

These create either multiple plots or create plots that summarize data for multiple variables all in one plot. They make it easier to create sumarzies for many genes or many celltypes without the need for writing loops.

Some setup for these, let’s roughly pick out the markers of delta cells in this dataset

# Idents(seurat) <- "celltype"
# delta.marker.table <- FindMarkers(seurat, ident.1 = "delta")
# delta.genes <- rownames(delta.marker.table)[1:20]
# Idents(seurat) <- "seurat_clusters"

delta.genes <- c("SST", "RBP4", "PCSK1", "CPE", "GPX3",
    "NLRP1", "PPP1R1A", "PCP4", "CHGB", "DHRS2", "LEPR", 
    "PTPRN", "BEX1", "SCGN", "PCSK1N", "SCG5", "UCHL1",
    "CHGA", "GAD2", "SEC11C")

5.5.1 multi_dittoPlot & dittoPlot_VarsAcrossGroups

multi_dittoPlot() creates dittoPlots for multiple genes or metadata, one plot each.

dittoPlotVarsAcrossGroups() creates a dittoPlot-like representation where instead of representing samples/cells as in typical dittoPlots, each data point instead represents the average expression, across each x-grouping, of a gene (or value of a metadata).

multi_dittoPlot(seurat, delta.genes[1:6], group.by = "celltype",
    vlnplot.lineweight = 0.2, jitter.size = 0.3)

dittoPlotVarsAcrossGroups(seurat, delta.genes, group.by = "celltype",
    main = "Delta-cell Markers")

5.5.2 multi_dittoDimPlot & multi_dittoDimPlotVaryCells

multi_dittoDimPlot() creates dittoDimPlots for multiple genes or metadata, one plot each.

multi_dittoDimPlotVaryCells() creates dittoDimPlots for a single gene or metadata, but where distinct cells are highlighted in each plot. The vary.cells.meta input sets the discrete metadata to be used for breaking up cells/samples over distinct plots. This can be useful for checking/highlighting when a gene may be differentially expressed within multiple cell types or accross all samples.

The output of multi_dittoDimPlotVaryCells() is similar to that of faceting using dittoDimPlot’s split.by input, but with added capability of showing an “AllCells” plot as well, or of outputing the individual plots for making manually customized plot arrangements when data.out = TRUE.

multi_dittoDimPlot(seurat, delta.genes[1:6])

multi_dittoDimPlotVaryCells(seurat, delta.genes[1],
    vary.cells.meta = "celltype")

multi_dittoDimPlotVaryCells(seurat, "celltype",
    vary.cells.meta = "celltype")

6 Customization via Simple Inputs

Many adjustments can be made with simple additional inputs. Here, we go through a few that are consistent across most dittoSeq functions, but there are many more. Be sure to check the function documentation (e.g. ?dittoDimPlot) to explore more!

6.1 Subsetting to certain cells/samples

The cells/samples shown in a given plot can be adjusted with the cells.use input. This can be provided as either a list of cells’ / samples’ names to include, or as a logical vector that states whether each cell / sample should be included.

# Original
dittoBarPlot(seurat, "celltype", group.by = "Sample", scale = "count")

# String method, first 10 cells
dittoBarPlot(seurat, "celltype", group.by = "Sample", scale = "count",
    cells.use = colnames(seurat)[1:10])

# Logical method, only acinar cells
dittoBarPlot(seurat, "celltype", group.by = "Sample", scale = "count",
    cells.use = meta("celltype", seurat) == "acinar")

6.2 Faceting with split.by

dittoPlot, dittoDimPlot, and dittoScatterPlots can be split into separate plots for distinct groups of cells with the split.by input.

dittoDimPlot(seurat, "celltype", split.by = "Sample")

dittoDimPlot(seurat, "ENO1", split.by = c("Sample", "celltype"))

6.3 All titles are adjustable.

Relevant inputs are generally main, sub, xlab, ylab, x.labels, and legend.title.

dittoBarPlot(seurat, "celltype", group.by = "Sample",
    main = "Encounters",
    sub = "By Type",
    xlab = NULL, # NULL = remove
    ylab = "Generation 1",
    x.labels = c("Ash", "Misty", "Jessie", "James"),
    legend.title = "Types",
    var.labels.rename = c("Fire", "Water", "Grass", "Electric", "Psychic"),
    x.labels.rotate = FALSE)

As exemplified above, in some functions, the displayed data can be renamed too.

6.4 Colors can be adjusted easily.

Colors are normally set with color.panel or max.color and min.color. When color.panel is used (discrete data), an additional input called colors sets the order in which those are actually used to make swapping around colors easy when nearby clusters appear too similar in tSNE/umap plots!

# original - discrete
dittoDimPlot(seurat, "celltype")

# swapped colors
dittoDimPlot(seurat, "celltype",
    colors = 5:1)

# different colors
dittoDimPlot(seurat, "celltype",
    color.panel = c("red", "orange", "purple", "yellow", "skyblue"))

# original - expression
dittoDimPlot(seurat, "INS")

# different colors
dittoDimPlot(seurat, "INS",
    max.color = "red", min.color = "gray90")

6.5 Underlying data can be output.

Simply add data.out = TRUE to any of the individual plotters and a representation of the underlying data will be output.

dittoBarPlot(seurat, "celltype", group.by = "Sample",
    data.out = TRUE)

## $p

## 
## $data
##          label   grouping count label.count.total     percent
## 1       acinar GSM2230757   110               644 0.170807453
## 2        delta GSM2230757   214               644 0.332298137
## 3       ductal GSM2230757   120               644 0.186335404
## 4  endothelial GSM2230757   130               644 0.201863354
## 5        gamma GSM2230757    70               644 0.108695652
## 6       acinar GSM2230758     3               538 0.005576208
## 7        delta GSM2230758   125               538 0.232342007
## 8       ductal GSM2230758   301               538 0.559479554
## 9  endothelial GSM2230758    23               538 0.042750929
## 10       gamma GSM2230758    86               538 0.159851301
## 11      acinar GSM2230759   843              1508 0.559018568
## 12       delta GSM2230759   161              1508 0.106763926
## 13      ductal GSM2230759   376              1508 0.249336870
## 14 endothelial GSM2230759    92              1508 0.061007958
## 15       gamma GSM2230759    36              1508 0.023872679
## 16      acinar GSM2230760     2               453 0.004415011
## 17       delta GSM2230760   101               453 0.222958057
## 18      ductal GSM2230760   280               453 0.618101545
## 19 endothelial GSM2230760     7               453 0.015452539
## 20       gamma GSM2230760    63               453 0.139072848

For dittoHeatmap, a list of all the arguments that would be supplied to pheatmap are output. This allows users to make their own tweaks to how the expression matrix is represented before plotting, or even to use a different heatmap creator from pheatmap altogether.

dittoHeatmap(seurat, c("SST","CPE","GPX3"), cells.use = colnames(seurat)[1:5],
    data.out = TRUE)

## $mat
##      human1_lib1.final_cell_0001 human1_lib1.final_cell_0002
## SST                     2.726418                   2.7048445
## CPE                     0.000000                   0.5396968
## GPX3                    0.000000                   0.3058240
##      human1_lib1.final_cell_0003 human1_lib1.final_cell_0004
## SST                    2.0163423                    2.474926
## CPE                    0.0000000                    0.000000
## GPX3                   0.4649227                    0.000000
##      human1_lib1.final_cell_0005
## SST                    2.7890247
## CPE                    0.8447536
## GPX3                   0.0000000
## 
## $main
## [1] NA
## 
## $show_colnames
## [1] FALSE
## 
## $show_rownames
## [1] TRUE
## 
## $color
##  [1] "#0000FF" "#0A0AFF" "#1414FF" "#1F1FFF" "#2929FF" "#3434FF" "#3E3EFF"
##  [8] "#4848FF" "#5353FF" "#5D5DFF" "#6868FF" "#7272FF" "#7C7CFF" "#8787FF"
## [15] "#9191FF" "#9C9CFF" "#A6A6FF" "#B0B0FF" "#BBBBFF" "#C5C5FF" "#D0D0FF"
## [22] "#DADAFF" "#E4E4FF" "#EFEFFF" "#F9F9FF" "#FFF9F9" "#FFEFEF" "#FFE4E4"
## [29] "#FFDADA" "#FFD0D0" "#FFC5C5" "#FFBBBB" "#FFB0B0" "#FFA6A6" "#FF9C9C"
## [36] "#FF9191" "#FF8787" "#FF7C7C" "#FF7272" "#FF6868" "#FF5D5D" "#FF5353"
## [43] "#FF4848" "#FF3E3E" "#FF3434" "#FF2929" "#FF1F1F" "#FF1414" "#FF0A0A"
## [50] "#FF0000"
## 
## $cluster_cols
## [1] FALSE
## 
## $border_color
## [1] NA
## 
## $scale
## [1] "row"
## 
## $breaks
## [1] NA
## 
## $legend_breaks
## [1] NA

6.6 plotly hovering can be added.

Any dittoSeq function that normally outputs a ggplot (dittoDimPlot, dittoPlot, dittoBarPlot, dittoPlotVarsAcrossGroups) can be supplied do.hover = TRUE to have it be converted into a plotly object that will display additional data about each data point when the user hovers their cursor on top.

Generally, a second input, hover.data, is used to tell dittoSeq qhat extra data to display. This input takes in a vector of gene or metadata names (or “ident” for seurat object clustering) in the order you wish for them to be displayed.

# These can be finicky to render in knitting, but still, example code:
dittoDimPlot(seurat, "INS",
    do.hover = TRUE,
    hover.data = c("celltype", "Sample", "ENO1", "ident", "nCount_RNA"))
dittoPlot(seurat, "INS", group.by = "celltype", plots = c("vlnplot", "jitter"),
    do.hover = TRUE,
    hover.data = c("celltype", "Sample", "ENO1", "ident", "nCount_RNA"))

When the types of underlying data possible to be shown are constrained because the plot pieces represent summary data (dittoBarPlot and dittoPlotVarsAcrossGroups), just do.hover is enough:

# These can be finicky to render in knitting, but still, example code:
dittoBarPlot(seurat, "celltype", group.by = "Sample",
    do.hover = TRUE)
dittoPlotVarsAcrossGroups(seurat, delta.genes, group.by = "celltype",
    do.hover = TRUE)

7 Session information

sessionInfo()

## R version 4.0.0 (2020-04-24)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.4 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.11-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.11-bioc/R/lib/libRlapack.so
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] parallel  stats4    stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
##  [1] Seurat_3.1.5                scRNAseq_2.2.0             
##  [3] SingleCellExperiment_1.10.1 SummarizedExperiment_1.18.1
##  [5] DelayedArray_0.14.0         matrixStats_0.56.0         
##  [7] Biobase_2.48.0              GenomicRanges_1.40.0       
##  [9] GenomeInfoDb_1.24.0         IRanges_2.22.2             
## [11] S4Vectors_0.26.1            BiocGenerics_0.34.0        
## [13] dittoSeq_1.0.2              ggplot2_3.3.1              
## [15] BiocStyle_2.16.0           
## 
## loaded via a namespace (and not attached):
##   [1] Rtsne_0.15                    colorspace_1.4-1             
##   [3] ellipsis_0.3.1                ggridges_0.5.2               
##   [5] XVector_0.28.0                farver_2.0.3                 
##   [7] leiden_0.3.3                  listenv_0.8.0                
##   [9] ggrepel_0.8.2                 bit64_0.9-7                  
##  [11] interactiveDisplayBase_1.26.3 AnnotationDbi_1.50.0         
##  [13] codetools_0.2-16              splines_4.0.0                
##  [15] knitr_1.28                    jsonlite_1.6.1               
##  [17] ica_1.0-2                     cluster_2.1.0                
##  [19] dbplyr_1.4.4                  png_0.1-7                    
##  [21] uwot_0.1.8                    pheatmap_1.0.12              
##  [23] sctransform_0.2.1             shiny_1.4.0.2                
##  [25] BiocManager_1.30.10           compiler_4.0.0               
##  [27] httr_1.4.1                    lazyeval_0.2.2               
##  [29] assertthat_0.2.1              Matrix_1.2-18                
##  [31] fastmap_1.0.1                 limma_3.44.1                 
##  [33] later_1.1.0.1                 htmltools_0.4.0              
##  [35] tools_4.0.0                   rsvd_1.0.3                   
##  [37] igraph_1.2.5                  gtable_0.3.0                 
##  [39] glue_1.4.1                    GenomeInfoDbData_1.2.3       
##  [41] reshape2_1.4.4                RANN_2.6.1                   
##  [43] dplyr_1.0.0                   rappdirs_0.3.1               
##  [45] Rcpp_1.0.4.6                  vctrs_0.3.1                  
##  [47] ape_5.4                       nlme_3.1-148                 
##  [49] ExperimentHub_1.14.0          lmtest_0.9-37                
##  [51] xfun_0.14                     stringr_1.4.0                
##  [53] globals_0.12.5                mime_0.9                     
##  [55] lifecycle_0.2.0               irlba_2.3.3                  
##  [57] future_1.17.0                 AnnotationHub_2.20.0         
##  [59] edgeR_3.30.3                  zoo_1.8-8                    
##  [61] zlibbioc_1.34.0               MASS_7.3-51.6                
##  [63] scales_1.1.1                  promises_1.1.0               
##  [65] RColorBrewer_1.1-2            yaml_2.2.1                   
##  [67] curl_4.3                      pbapply_1.4-2                
##  [69] memoise_1.1.0                 reticulate_1.16              
##  [71] gridExtra_2.3                 stringi_1.4.6                
##  [73] RSQLite_2.2.0                 BiocVersion_3.11.1           
##  [75] rlang_0.4.6                   pkgconfig_2.0.3              
##  [77] bitops_1.0-6                  evaluate_0.14                
##  [79] lattice_0.20-41               ROCR_1.0-11                  
##  [81] purrr_0.3.4                   labeling_0.3                 
##  [83] htmlwidgets_1.5.1             patchwork_1.0.0              
##  [85] cowplot_1.0.0                 bit_1.1-15.2                 
##  [87] tidyselect_1.1.0              RcppAnnoy_0.0.16             
##  [89] plyr_1.8.6                    magrittr_1.5                 
##  [91] bookdown_0.19                 R6_2.4.1                     
##  [93] magick_2.3                    generics_0.0.2               
##  [95] DBI_1.1.0                     pillar_1.4.4                 
##  [97] withr_2.2.0                   fitdistrplus_1.1-1           
##  [99] survival_3.1-12               RCurl_1.98-1.2               
## [101] tsne_0.1-3                    tibble_3.0.1                 
## [103] future.apply_1.5.0            crayon_1.3.4                 
## [105] KernSmooth_2.23-17            plotly_4.9.2.1               
## [107] BiocFileCache_1.12.0          rmarkdown_2.2                
## [109] locfit_1.5-9.4                grid_4.0.0                   
## [111] data.table_1.12.8             blob_1.2.1                   
## [113] digest_0.6.25                 xtable_1.8-4                 
## [115] tidyr_1.1.0                   httpuv_1.5.3.1               
## [117] munsell_0.5.0                 viridisLite_0.3.0

References

Baron, Maayan, Adrian Veres, Samuel L. Wolock, Aubrey L. Faust, Renaud Gaujoux, Amedeo Vetere, Jennifer Hyoje Ryu, et al. 2016. “A Single-Cell Transcriptomic Map of the Human and Mouse Pancreas Reveals Inter- and Intra-Cell Population Structure.” Cell Systems 3 (4):346–360.e4. https://doi.org/10.1016/j.cels.2016.08.011.

Wong, Bang. 2011. “Points of View: Color Blindness.” Nature Methods 8 (6):441–41. https://doi.org/10.1038/nmeth.1618.

Using dittoSeq to visualize (sc)RNAseq data

March 2nd, 2020

Package