Making Complex Heatmaps

Author: Zuguang Gu ( z.gu@dkfz.de )

Date: 2015-04-16

Complex heatmaps are efficient to visualize associations between different sources of data sets and reveal potential features. Here the ComplexHeatmap package provides a highly flexible way to arrange multiple heatmaps and supports self-defined annotation graphics.

General design

Generally, a heatmap list contains several heatmaps and row annotations.

plot of chunk design

There are several classes that are abstracted from the heatmap. Surrounding the heatmap list, there are legends for heatmaps themselves and annotations, also there are titles which are placed on the four sides of the heatmap list. For each heatmap, there are also different components standing surrounding the heatmap body.

In ComplexHeatmap package, components of heatmap lists are abstracted into several classes.

Heatmap class: a single heatmap containing heatmap body, row/column names, titles, dendrograms and column annotations.
HeatmapList class: a list of heatmaps and row annotations.
HeatmapAnnotation class: defines a list of row annotations and column annotations.

There are also several internal classes:

SingleAnnotation class: defines a single row annotation or column annotation.
ColorMapping class: mapping from values to colors.

ComplexHeatmap is implemented under grid system, so users should know basic grid functionality to get full use of ComplexHeatmap package.

A single heatmap

A single heatmap is mostly used for a quick view of the data. It is a special case of a heatmap list which only contains one heatmap. In following examples, we will demonstrate how to set parameters for visualization a single heatmap.

First let's load packages and generate a random matrix:

library(ComplexHeatmap)
library(circlize)
library(colorspace)
library(GetoptLong)

set.seed(123)

mat = cbind(rbind(matrix(rnorm(16, -1), 4), matrix(rnorm(32, 1), 8)),
            rbind(matrix(rnorm(24, 1), 4), matrix(rnorm(48, -1), 8)))

rownames(mat) = paste0("R", 1:12)
colnames(mat) = paste0("C", 1:10)

Plot the heatmap with default settings. The default style of the heatmap is quite the same as that generated by other similar heatmap functions.

Heatmap(mat)

plot of chunk default

Colors

In most cases, the heatmap visualizes a matrix with continuous values. In this case, user should provide a color mapping function. A color mapping function should accept a vector of values and return a vector of corresponding colors. The colorRamp2() from the circlize package is helpful for generating such functions. The two arguments for colorRamp2() is a vector of breaks values and corresponding colors.

Heatmap(mat, col = colorRamp2(c(-3, 0, 3), c("green", "white", "red")))

plot of chunk color_fun

If the matrix contains discrete values (either numeric or character), colors should be specified as a named vector to make it possible for the mapping from discrete values to colors.

discrete_mat = matrix(sample(1:4, 100, replace = TRUE), 10, 10)
colors = structure(rainbow_hcl(4), names = c("1", "2", "3", "4"))
Heatmap(discrete_mat, col = colors)

plot of chunk discrete_matrix

Or a character matrix:

discrete_mat = matrix(sample(letters[1:4], 100, replace = TRUE), 10, 10)
colors = structure(rainbow_hcl(4), names = letters[1:4])
Heatmap(discrete_mat, col = colors)

plot of chunk discrete_character_matrix

As you see, for the numeric matrix, by default clustering is applied on both dimensions while for character matrix, clustering is suppressed.

Titles

The name of the heatmap is used as the title of the heatmap legend. The name also plays as a unique id if you plot more than one heatmaps together (introduced in later sections).

Heatmap(mat, name = "foo")

plot of chunk with_matrix_name

You can set heatmap titles either by the rows or by the columns. Note at a same time you can only put e.g. column title either on the top or at the bottom of the heatmap. The graphic parameters can be set by row_title_gp and column_title_gp respectively. Please remember you should use gpar() to specify graphic parameters.

Heatmap(mat, name = "foo", column_title = "I am a column title", 
    row_title = "I am a row title")

plot of chunk row_column_title

Heatmap(mat, name = "foo", column_title = "I am a column title at the bottom", 
    column_title_side = "bottom")

plot of chunk row_column_title

Heatmap(mat, name = "foo", column_title = "I am a big column title", 
    column_title_gp = gpar(fontsize = 20, fontface = "bold"))

plot of chunk row_column_title

Clustering

Clustering may be the key feature of the heatmap visualization. In ComplexHeatmap package, clustering is supported with high flexibility. For different levels, you can specify the clustering either by a pre-defined method, or by a distance function, or by a object that already contains clustering, or directly by a clustering function. It is also possible to render your dendrograms with different colors and styles for different branches for better revealing structures of your data.

First there are general settings for the clustering, e.g. whether show dendrograms, side of the dendrograms and size of the dendrograms.

Heatmap(mat, name = "foo", cluster_rows = FALSE)

plot of chunk cluster_basic

Heatmap(mat, name = "foo", show_column_hclust = FALSE)

plot of chunk cluster_basic

Heatmap(mat, name = "foo", row_hclust_side = "right")

plot of chunk cluster_basic

Heatmap(mat, name = "foo", column_hclust_height = unit(2, "cm"))

plot of chunk cluster_basic

There are three ways to specify distance metric for clustering:

specify distance as a pre-defined option. The valid values are the supported methods in dist() function and within pearson, spearman and kendall.
a self-defined function which calculates distance from a matrix. The function should only contain one argument. Please note for clustering on columns, the matrix will be transposed automatically.
a self-defined function which calculates distance from two vectors. The function should only contain two arguments.

Heatmap(mat, name = "foo", clustering_distance_rows = "pearson")

plot of chunk cluster_distance

Heatmap(mat, name = "foo", clustering_distance_rows = function(m) dist(m))

plot of chunk cluster_distance

Heatmap(mat, name = "foo", clustering_distance_rows = function(x, y) 1 - cor(x, y))

plot of chunk cluster_distance

Maybe we can apply a robust clustering based on the pair-wise distance. Also with self-defining col by colorRamp2(), we can make a outlier-unsensitive heatmap.

mat_with_outliers = mat
for(i in  1:10) mat_with_outliers[i, i] = 1000
robust_dist = function(x, y) {
    qx = quantile(x, c(0.1, 0.9))
    qy = quantile(y, c(0.1, 0.9))
    l = x > qx[1] & x < qx[2] & y > qy[1] & y < qy[2]
    x = x[l]
    y = y[l]
    sqrt(sum((x - y)^2))
}
Heatmap(mat_with_outliers, name = "foo", 
    col = colorRamp2(c(-3, 0, 3), c("green", "white", "red")),
    clustering_distance_rows = robust_dist,
    clustering_distance_columns = robust_dist)

plot of chunk cluster_distance_advanced

Method to make hierarchical clustering can be specified by clustering_method_rows and clustering_method_columns. Possible methods are those supported in hclust() function.

Heatmap(mat, name = "foo", clustering_method_rows = "single")

plot of chunk cluster_method

By default, clustering are performed by hclust(). But you can also utilize clustering results which are generated by other methods by specifying cluster_rows or cluster_columns to a hclust or dendrogram object. In following examples, we use diana() and agnes() methods which are from the cluster package to perform clusterings.

library(cluster)
Heatmap(mat, name = "foo", cluster_rows = as.dendrogram(diana(mat)),
   cluster_columns = as.dendrogram(agnes(t(mat))))

plot of chunk cluster_object

You can render your dendrogram object by the dendextend package and make a more customized visualization of the dendrogram.

library(dendextend)
dend = hclust(dist(mat))
dend = color_branches(dend, k = 2)
Heatmap(mat, name = "foo", cluster_rows = dend)

plot of chunk cluster_dendextend

More generally, cluster_rows and cluster_columns can be functions which calculate the clusterings. The input argument should be a matrix and returned value of the function should be a hclust or dendrogram object. Please note, when cluster_rows is executed internally, the argument m is the input mat itself while m is the transpose of mat when executing cluster_columns.

Heatmap(mat, name = "foo", cluster_rows = function(m) hclust(dist(m)),
    cluster_columns = function(m) hclust(dist(m)))

plot of chunk cluster_function

Dimension names

Side, visibility and graphics parameter for dimension names can be set.

Heatmap(mat, name = "foo", row_names_side = "left", row_hclust_side = "right", 
    column_names_side = "top", column_hclust_side = "bottom")

plot of chunk dimension_name

Heatmap(mat, name = "foo", show_row_names = FALSE)

plot of chunk dimension_name

Heatmap(mat, name = "foo", row_names_gp = gpar(fontsize = 20))

plot of chunk dimension_name

Heatmap(mat, name = "foo", row_names_gp = gpar(col = c(rep("red", 4), rep("blue", 8))))

plot of chunk dimension_name

Split heatmap by rows

A heatmap can be split by rows. The km argument with a value larger than 1 means applying a k-means clustering on rows.

Heatmap(mat, name = "foo", km = 2)

plot of chunk k_means

More generally, split can be set to a vector or a data frame in which different combination of levels split the rows of the heatmap. Actually, k-means clustering just generates a vector of row classes and appends split with one additional column. The combined row titles can be controlled by combined_name_fun argument.

Heatmap(mat, name = "foo", split = rep(c("A", "B"), 6))

plot of chunk split

Heatmap(mat, name = "foo", split = data.frame(rep(c("A", "B"), 6), rep(c("C", "D"), each = 6)))

plot of chunk split

Heatmap(mat, name = "foo", split = data.frame(rep(c("A", "B"), 6), rep(c("C", "D"), each = 6)), 
    combined_name_fun = function(x) paste(x, collapse = "\n"))

plot of chunk split

Heatmap(mat, name = "foo", km = 2, split = rep(c("A", "B"), 6), 
    combined_name_fun = function(x) paste(x, collapse = "\n"))

plot of chunk split

Heatmap(mat, name = "foo", km = 2, split = rep(c("A", "B"), 6), combined_name_fun = NULL)

plot of chunk split

If you are not happy with the default k-means partitioning method, it is easy to use other partitioning methods by just assigning the partitioning vector to split.

pa = pam(mat, k = 3)
Heatmap(mat, name = "foo", split = paste0("pam", pa$clustering))

plot of chunk pam

Height of gaps between every row slices can be controlled by gap.

Heatmap(mat, name = "foo", split = paste0("pam", pa$clustering), gap = unit(5, "mm"))

plot of chunk split_gap

Character matrix can only be split by split argument.

Heatmap(discrete_mat, name = "foo", split = rep(letters[1:2], each = 5))

plot of chunk split_discrete_matrix

Self define the heatmap body

The heatmap body can also be self-defined. By default the heatmap body is composed by an array of rectangles with different filled colors. If type in rect_gp is set to none, the array for rectangles is initialized but no graphics are put in. Then, users can define their own graphic function by cell_fun. cell_fun is applied on every cell in the heatmap and provides following information on the 'current' cell:

i: row index in the matrix.
j: column index in the matrix.
x: x coordinate of middle point of the cell which is measured in the viewport of the heatmap body.
y: y coordinate of middle point of the cell which is measured in the viewport of the heatmap body.
width: width of the cell.
height: height of the cell.
fill: color of the cell.

In following example, we draw round rectangles and add text. We assign the heatmap to a variable and make the plot explicitly by calling draw(), just in order to suppress the heatmap legend. This functionality will be introduced in later sections.

mat2 = matrix(sample(1:10, 12, replace = TRUE), 4, 3)

ht = Heatmap(mat2, rect_gp = gpar(col = "white", lwd = 2, type = "none"), 
    cell_fun = function(j, i, x, y, width, height, fill) {
        grid.roundrect(x, y, width*0.8, height*0.8, gp = gpar(fill = fill))
        if(mat2[i, j] <= 4) {
            grid.text("cold", x = x, y = y)
        } else if(mat2[i, j] >= 7) {
            grid.text("hot", x = x, y = y)
        } else {
            grid.text("normal", x = x, y = y)
        }
    },
    cluster_rows = FALSE, cluster_columns = FALSE)
draw(ht, show_heatmap_legend = FALSE)

plot of chunk cell_fun

The HeatmapAnnotation class

In this section, we only describe how to generate a column annotations. Row annotations will be described in later sections.

The annotation graphics actually are quite general. The only common characteristic for column annotation is that they are aligned to the columns of the heatmap. Here there is a HeatmapAnnotation class which is used to define annotations on columns.

Simple annotation

A simple annotation is defined as a vector which contains discrete classes or continuous values corresponding to the columns. Since the simple annotation is represented as a vector, multiple simple annotations can be specified as a data frame. Colors for the simple annotations can be specified by col with discrete values or color mapping functions, depending on whether the simple annotations are discrete or continuous.

When plotting the heatmap, simple annotations will be represented as rows of grids.

There is a draw() method for the HeatmapAnnotation class. draw() is used internally and here we just use it for illustration.

df = data.frame(type = c(rep("a", 5), rep("b", 5)))
ha = HeatmapAnnotation(df = df)
ha

## A HeatmapAnnotation object with 1 annotation.
## 
## An annotation with discrete color mapping
## name: type 
## position: column 
## show legend: TRUE

draw(ha, 1:10)

plot of chunk heatmap_annotation

ha = HeatmapAnnotation(df = df, col = list(type = c("a" =  "red", "b" = "blue")))
ha

## A HeatmapAnnotation object with 1 annotation.
## 
## An annotation with discrete color mapping
## name: type 
## position: column 
## show legend: TRUE

draw(ha, 1:10)

plot of chunk heatmap_annotation_col

ha = HeatmapAnnotation(df = data.frame(age = sample(1:20, 10)),
    col = list(age = colorRamp2(c(0, 20), c("white", "red"))))
ha

## A HeatmapAnnotation object with 1 annotation.
## 
## An annotation with continuous color mapping
## name: age 
## position: column 
## show legend: TRUE

draw(ha, 1:10)

plot of chunk heatmap_annotation_colfun

Put more than one annotations together.

ha = HeatmapAnnotation(df = data.frame(type = c(rep("a", 5), rep("b", 5)),
                                       age = sample(1:20, 10)),
    col = list(type = c("a" = "red", "b" = "blue"),
               age = colorRamp2(c(0, 20), c("white", "red")))
)
ha

## A HeatmapAnnotation object with 2 annotations.
## 
## An annotation with discrete color mapping
## name: type 
## position: column 
## show legend: TRUE 
## 
## An annotation with continuous color mapping
## name: age 
## position: column 
## show legend: TRUE

draw(ha, 1:10)

plot of chunk heatmap_annotation_mixed

Complex annotations

Besides simple annotations, these are complex annotations. The complex annotations are always represented as self-defined functions. Actually, for each column annotation, there will be a viewport created waiting for graphics. The annotation function here defines how to put the graphics to this viewport. The only argument of the function is an index of column which may be adjusted by column clustering.

In following example, an annotation of points is created. Please note how we define xscale so that positions of points correspond to middle points of the columns.

value = rnorm(10)
column_anno = function(index) {
    n = length(index)
    pushViewport(viewport(xscale = c(0.5, n + 0.5), yscale = range(value)))
    grid.points(index, value, pch = 16)
    upViewport() # this is very important in order not to mess up the layout
}
ha = HeatmapAnnotation(points = column_anno)
ha

## A HeatmapAnnotation object with 1 annotation.
## 
## An annotation with self-defined function
## name: points 
## position: column

draw(ha, 1:10)

plot of chunk heatmap_annotation_complex

For simple annotation graphics, you can create such annotation function by anno_points() or anno_barplot(). anno_points() will return an annotation function which just satisfy our needs.

ha = HeatmapAnnotation(points = anno_points(value))
draw(ha, 1:10)

plot of chunk heatmap_annotation_points

ha = HeatmapAnnotation(points = anno_barplot(value))
draw(ha, 1:10)

plot of chunk heatmap_annotation_barplot

There is also an anno_boxplot() which generates boxplot for each column in the matrix.

ha = HeatmapAnnotation(boxplot = anno_boxplot(mat))
draw(ha, 1:10)

plot of chunk heatmap_annotation_boxplot

You can combine more than one annotations into the object.

ha = HeatmapAnnotation(df = df, points = anno_points(value))
ha

## A HeatmapAnnotation object with 2 annotations.
## 
## An annotation with discrete color mapping
## name: type 
## position: column 
## show legend: TRUE 
## 
## An annotation with self-defined function
## name: points 
## position: column

draw(ha, 1:10)

plot of chunk heatmap_annotation_mixed_with_complex

If there are more than one annotations, you can control height of each annotation by annotation_height. The value of annotation_height can either be numeric values or unit objects. But when you specify the height to the unit objects, you should make sure the sum of heights does not exceed the height of the annotations shown in the heatmap.

ha = HeatmapAnnotation(df = df, points = anno_points(value), boxplot = anno_boxplot(mat),
    annotation_height = c(1, 2, 3))
draw(ha, 1:10)

plot of chunk unnamed-chunk-1

ha = HeatmapAnnotation(df = df, points = anno_points(value), boxplot = anno_boxplot(mat),
    annotation_height = unit.c(unit(1, "null"), unit(3, "cm"), unit(3, "cm")))
draw(ha, 1:10)

plot of chunk unnamed-chunk-2

With the annotation, you can assign in to the heatmap either by top_annotation or bottom_annotation. Also you can control the size of total column annotations by top_annotation_height and bottom_annotation_height.

ha = HeatmapAnnotation(df = df, points = anno_points(value))
ha_boxplot = HeatmapAnnotation(boxplot = anno_boxplot(mat))
Heatmap(mat, name = "foo", top_annotation = ha, bottom_annotation = ha_boxplot, 
    bottom_annotation_height = unit(2, "cm"))

plot of chunk add_annotation

You can suppress some of the annotation legend by specifying show_legend to FALSE when creating the HeatmapAnnotation object.

ha = HeatmapAnnotation(df = df, show_legend = FALSE)
Heatmap(mat, name = "foo", top_annotation = ha)

plot of chunk annotation_show

More types of annotations which show data distribution in corresponding rows and columns can be supported by anno_histogram() and anno_density().

ha_mix_top = HeatmapAnnotation(histogram = anno_histogram(mat, gp = gpar(fill = rep(2:3, each = 5))),
    density_line = anno_density(mat, type = "line", gp = gpar(col = rep(2:3, each = 5))),
    violin = anno_density(mat, type = "violin", gp = gpar(fill = rep(2:3, each = 5))),
    heatmap = anno_density(mat, type = "heatmap"))
ha_mix_right = HeatmapAnnotation(histogram = anno_histogram(mat, gp = gpar(fill = rep(2:3, each = 5)), which = "row"),
    density_line = anno_density(mat, type = "line", gp = gpar(col = rep(2:3, each = 5)), which = "row"),
    violin = anno_density(mat, type = "violin", gp = gpar(fill = rep(2:3, each = 5)), which = "row"),
    heatmap = anno_density(mat, type = "heatmap", which = "row"), 
    which = "row", width = unit(8, "cm"))
Heatmap(mat, name = "foo", top_annotation = ha_mix_top, top_annotation_height = unit(8, "cm")) + ha_mix_right

plot of chunk annotation_more

The HeatmapAnnotation is also used for construction of row annotations which we will introduce in later sections.

A list of heatmaps

You can arrange more than one heatmaps which are placed columns by columns. Actually, one single heatmap is just a special case of the heatmap list of length one.

Heatmap() is actually a class constructor function for a single heatmap. If more than one heatmaps are to be combined, users can append one heatmap to the other by + operator.

mat = matrix(rnorm(80, 2), 8, 10)
mat = rbind(mat, matrix(rnorm(40, -2), 4, 10))
rownames(mat) = paste0("R", 1:12)
colnames(mat) = paste0("C", 1:10)

ht1 = Heatmap(mat, name = "ht1")
ht2 = Heatmap(mat, name = "ht2")
class(ht1)

## [1] "Heatmap"
## attr(,"package")
## [1] "ComplexHeatmap"

class(ht2)

## [1] "Heatmap"
## attr(,"package")
## [1] "ComplexHeatmap"

ht1 + ht2

plot of chunk heatmap_list_default

Under default mode, dendrograms from the second heatmap will be removed and row orders will be same as the first one.

The returned value of addition of two heatmaps is a HeatmapList object. Directly call ht_list object will call draw() method with default settings. With explicitly calling draw() method, you can have more control on the legend and titles.

ht_list = ht1 + ht2
class(ht_list)

## [1] "HeatmapList"
## attr(,"package")
## [1] "ComplexHeatmap"

You can also append any number of heatmaps to the heatmap list. Also you can append a heatmap list to a heatmap list.

ht1 + ht1 + ht1
ht1 + ht_list
ht_list + ht1
ht_list + ht_list

Titles

A heatmap list also has titles which are independent to the heatmap titles.

ht1 = Heatmap(mat, name = "ht1", row_title = "Heatmap 1", column_title = "Heatmap 1")
ht2 = Heatmap(mat, name = "ht2", row_title = "Heatmap 2", column_title = "Heatmap 2")
ht_list = ht1 + ht2

draw(ht_list, row_title = "Two heatmaps, row title", row_title_gp = gpar(col = "red"),
    column_title = "Two heatmaps, column title", column_title_side = "bottom")

plot of chunk heatmap_list_title

Legends

Legends for all heatmaps and all annotations will be drawn together. The legends for heatmaps and legends for annotations are put in independent viewports.

df = data.frame(type = c(rep("a", 5), rep("b", 5)))
ha = HeatmapAnnotation(df = df, col = list(type = c("a" =  "red", "b" = "blue")))

ht1 = Heatmap(mat, name = "ht1", column_title = "Heatmap 1", top_annotation = ha)
ht2 = Heatmap(mat, name = "ht2", column_title = "Heatmap 2")
ht_list = ht1 + ht2

draw(ht_list)

plot of chunk legend

draw(ht_list, heatmap_legend_side = "left", annotation_legend_side = "bottom")

plot of chunk legend

draw(ht_list, show_heatmap_legend = FALSE, show_annotation_legend = FALSE)

plot of chunk legend

You can choose to only add some of the heatmap legends by setting show_heatmap_legend to a logical value.

ht1 = Heatmap(mat, name = "ht1", column_title = "Heatmap 1", top_annotation = ha)
ht2 = Heatmap(mat, name = "ht2", column_title = "Heatmap 2", show_heatmap_legend = FALSE)
ht1 + ht2

plot of chunk legend_show

ComplexHeatmap only generate legends for heatmaps and simple annotations. Self-defined legends can be passed by annotation_legend_list as a list of grob objects.

ha = HeatmapAnnotation(points = anno_points(rnorm(10)))
ht2 = Heatmap(mat, name = "ht2", column_title = "Heatmap 2", top_annotation = ha, show_heatmap_legend = FALSE)
lgd = legendGrob(c("dots"), pch = 16)
draw(ht1 + ht2, annotation_legend_list = list(lgd))

plot of chunk self_defined_legend

Graphic parameters for legends for simple annotations can be set. Actually these arguments are passed to color_mapping_legend() on ColorMapping class.

draw(ht1 + ht2, legend_grid_width = unit(6, "mm"), 
    legend_title_gp = gpar(fontsize = 14, fontface = "bold"))

plot of chunk heatmap_list_advanced

Gaps between heatmaps

The gaps between heatmaps can be set by gap argument by a unit object.

draw(ht_list, gap = unit(1, "cm"))

plot of chunk heatmap_list_gap

draw(ht_list + ht_list, gap = unit(c(3, 6, 9, 0), "mm"))

plot of chunk heatmap_list_gap

Size of heatmaps

The width for some (not all) heatmaps can be set to a fixed width.

ht1 = Heatmap(mat, name = "ht1", column_title = "Heatmap 1")
ht2 = Heatmap(mat, name = "ht2", column_title = "Heatmap 2", width = unit(5, "cm"))
ht1 + ht2

plot of chunk heatmap_list_size

Auto adjustment

There are some automatic adjustment if more than one heatmaps are plotted. There should be a main heatmap which by default is the first one. Some settings for the remaining heatmaps will be modified to the settings in the main heatmap. The adjustment are:

row clusters are removed.
row titles are removed.
if the main heatmap is split by rows, all remaining heatmaps will also be split by same levels as the main one.

The main heatmap can be specified by main_heatmap argument. The value can be a numeric index or the name of the heatmap (of course, you need to set the heatmap name when you create the Heatmap object).

ht1 = Heatmap(mat, name = "ht1", column_title = "Heatmap 1", km = 2)
ht2 = Heatmap(mat, name = "ht2", column_title = "Heatmap 2")
ht1 + ht2

plot of chunk heatmap_list_auto_adjust

draw(ht2 + ht1)

plot of chunk heatmap_list_auto_adjust

draw(ht2 + ht1, main_heatmap = "ht1")

plot of chunk heatmap_list_auto_adjust

If there is no row clustering in the main heatmap, all other heatmaps have no row clustering neither.

ht1 = Heatmap(mat, name = "ht1", column_title = "Heatmap 1", cluster_rows = FALSE)
ht2 = Heatmap(mat, name = "ht2", column_title = "Heatmap 2")
ht1 + ht2

plot of chunk heatmap_list_auto_adjust_no_row_cluster

Heatmap list with row annotations

Row annotations

Row annotation is also defined by the HeatmapAnnotation class, but with specifying which to row.

df = data.frame(type = c(rep("a", 6), rep("b", 6)))
ha = HeatmapAnnotation(df = df, which = "row", width = unit(1, "cm"))
draw(ha, 1:12)

plot of chunk row_annotation

Mix heatmaps and row annotations

Essentially, row annotations and column annotations are identical graphics, but in applications, there is some difference. In ComplexHeatmap package, row annotations have the same place as the heatmap while column annotations are just like accessory components of heatmaps. For row annotations, similar as heatmaps, you can append the row annotations to heatmap or heatmap list or even row annotation object itself.

ht1 = Heatmap(mat, name = "ht1")
ht1 + ha + ht1

plot of chunk heatmap_list_with_row_annotation

Also complex row annotations can be added. Pleast note in following code, the dendrogram and row title are put on the two sides of the heatmap list by specifying row_hclust_side and row_sub_title_side.

ht1 = Heatmap(mat, name = "ht1", km = 2)
ha_boxplot = HeatmapAnnotation(boxplot = anno_boxplot(mat, which = "row"), 
    which = "row", width = unit(2, "cm"))
draw(ha_boxplot + ht1, row_hclust_side = "left", row_sub_title_side = "right")

plot of chunk heatmap_list_with_row_annotation_complex

For row annotations, more complex annotation graphics can help to visualize the data:

ht1 = Heatmap(mat, name = "ht1", km = 2)
random_data = lapply(1:12, function(x) runif(20))
random_data[1:6] = lapply(random_data[1:6], function(x) x*0.5)
ha = HeatmapAnnotation(distribution = function(index) {
        random_data = random_data[index]
        n = length(index)

        for(i in seq_len(n)) {
            y = random_data[[i]]
            pushViewport(viewport(x= 0, y = (i-0.5)/n, width = unit(1, "npc") - unit(1, "cm"), 
                height = 1/n*0.8, just = "left", xscale = c(0, 20), yscale = c(0, 1)))
            if(index[i] %in% 1:6) fill = "blue" else fill = "red"
            grid.polygon(c(1:20, 20, 1), c(y, 0, 0), default.units = "native", 
                gp = gpar(fill = fill, col = NA))
            grid.yaxis(main = FALSE, gp = gpar(fontsize = 8))
            upViewport()
        }
    }, which = "row", width = unit(10, "cm"))
draw(ht1 + ha, row_hclust_side = "left", row_sub_title_side = "right")

plot of chunk complex_heatmap

Access components

Each components of the heatmap/heatmap list has a name. You can go to any viewport by seekViewport(). Following figure almost contains all types of components.

ha_column1 = HeatmapAnnotation(points = anno_points(rnorm(10)))
ht1 = Heatmap(mat, name = "ht1", km = 2, row_title = "Heatmap 1", column_title = "Heatmap 1", 
    top_annotation = ha_column1)

ha_column2 = HeatmapAnnotation(df = data.frame(type = c(rep("a", 5), rep("b", 5))))
ht2 = Heatmap(mat, name = "ht2", row_title = "Heatmap 2", column_title = "Heatmap 2",
    bottom_annotation = ha_column2)

ht_list = ht1 + ht2
draw(ht_list, row_title = "Heatmap list", column_title = "Heatmap list", 
    heatmap_legend_side = "right", annotation_legend_side = "left")

plot of chunk access_components

The components that have names are:

global: the viewport which contains the whole figure.
global_column_title: the viewport which contains column title for the heatmap list.
global_row_title: the viewport which contains row title for the heatmap list.
main_heatmap_list: the viewport which contains a list of heatmaps and row annotations.
heatmap_@{heatmap_name}: the viewport which contains a single heatmap
annotation_@{annotation_name}: the viewport which contains an annotation either on columns or rows.
@{heatmap_name}_heatmap_body_@{i}: the heatmap body.
@{heatmap_name}_column_title: column title for a single heatmap.
@{heatmap_name}_row_title_@{i}: since a heatmap body may be splitted into several parts. @{i} is the index of the row slice.
@{heatmap_name}_hclust_row_@{i}: dendrogram for ith row slice.
@{heatmap_name}_hclust_column: dendrogram on columns
@{heatmap_name}_row_names_@{i}: the viewport which contains row names.
@{heatmap_name}_column_names: the viewport which contains column names.
heatmap_legend: the viewport which contains all heatmap legends.
legend_@{heatmap_name}: the viewport which contains a single heatmap legend.
annotation_legend: the viewport which contains all annotation legends.
legend_@{annotation_name}: the viewport which contains a single annotation legend.

ht_list

seekViewport("annotation_points")
grid.text("points", unit(0, "npc") - unit(2, "mm"), 0.5, default.units = "npc", just = "right")

seekViewport("ht1_heatmap_body_2")
grid.text("outlier", 1.5/10, 2.5/4, default.units = "npc")

seekViewport("annotation_type")
grid.text("type", unit(1, "npc") + unit(2, "mm"), 0.5, default.units = "npc", just = "left")

plot of chunk unnamed-chunk-5

Real applications

Data frame is an ideal data structure to store different subjects of information in which each row in the data frame focuses on the same element. Then, each subject can be visualized as a single heatmap and putting all subjects together can give you a full view of your data.

In gene expression matrix, rows correspond to genes. More information about genes can be attached after the expression heatmap such as gene length and type of genes.

expr = readRDS(paste0(system.file(package = "ComplexHeatmap"), "/extdata/gene_expression.rds"))
mat = as.matrix(expr[, grep("cell", colnames(expr))])

type = gsub("s\\d+_", "", colnames(mat))
ha = HeatmapAnnotation(df = data.frame(type = type))

Heatmap(mat, name = "expression", km = 5, top_annotation = ha, 
    top_annotation_height = unit(4, "mm"), show_row_names = FALSE, 
    show_column_names = FALSE) +
Heatmap(expr$length, name = "length", col = colorRamp2(c(0, 100000), c("white", "orange")),
    width = unit(5, "mm")) +
Heatmap(expr$type, name = "type", width = unit(5, "mm")) +
Heatmap(expr$chr, name = "chr", col = rand_color(length(unique(expr$chr))), 
    width = unit(5, "mm"))

plot of chunk expression_example

Following examples is OncoPrint. The basic idea is to self define the heatmap body. Besides the default style which is provided by cBioPortal, there are additional barplots at both sides of the heatmap which show numbers of different alterations for each sample and for each gene. Source code is available here.

plot of chunk unnamed-chunk-6

Following examples visualizes correlation between methylation and expression, as well as other annotation information (data are randomly generated). In the heatmap, each row corresponds to a differentially methylated regions (DMRs). From left to right, heatmaps are:

methylation for each DMR (by rows) in samples.
direction of the methylation (one column heatmap), i.e. is methylation hyper in tumor or hypo?
expression for the genes that are associated with corresponding DMRs (e.g. closest gene).
significance for the correlation between methylation and expression (-log10(p-value)).
type of genes, i.e. is the gene a protein coding gene or a lincRNA?
annotation to gene models, i.e. is the DMR located in the intragenic region of the corresponding gene or the DMR is intergenic?
distance from the DMR to the TSS of the corresponding gene.
overlapping between DMRs and enhancers (Color shows how much the DMR is covered by the enhancers).

Source code is available here.

plot of chunk unnamed-chunk-7

Session info

sessionInfo()

## R version 3.2.0 (2015-04-16)
## Platform: x86_64-unknown-linux-gnu (64-bit)
## Running under: Ubuntu 14.04.2 LTS
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8       
##  [4] LC_COLLATE=C               LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
## [10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] grid      stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] dendextend_0.18.3    cluster_2.0.1        GetoptLong_0.1.0     colorspace_1.2-6    
## [5] circlize_0.2.4       ComplexHeatmap_1.0.0 knitr_1.9            markdown_0.7.4      
## 
## loaded via a namespace (and not attached):
##  [1] formatR_1.1         magrittr_1.5        evaluate_0.6        GlobalOptions_0.0.6
##  [5] whisker_0.3-2       RColorBrewer_1.1-2  rjson_0.2.15        tools_3.2.0        
##  [9] stringr_0.6.2       shape_1.4.2