To view a phylogenetic tree, we first need to parse the tree file into R. The ggtree (Yu et al. 2017) package supports many file formats via the treeio package, including output files of commonly used software packages in evolutionary biology. For more details, plase refer to the treeio vignette.

library("treeio")
library("ggtree")

nwk <- system.file("extdata", "sample.nwk", package="treeio")
tree <- read.tree(nwk)

Viewing a phylogenetic tree with ggtree

The ggtree package extends ggplot2 (Wickham 2009) package to support viewing phylogenetic tree. It implements geom_tree layer for displaying phylogenetic tree, as shown below:

The function, ggtree, was implemented as a short cut to visualize a tree, and it works exactly the same as shown above.

ggtree takes all the advantages of ggplot2. For example, we can change the color, size and type of the lines as we do with ggplot2.

By default, the tree is viewed in ladderize form, user can set the parameter ladderize = FALSE to disable it.

The branch.length is used to scale the edge, user can set the parameter branch.length = “none” to only view the tree topology (cladogram) or other numerical variable to scale the tree (e.g. dN/dS, see also in Tree Annotation vignette).

Layout

Currently, ggtree supports several layouts, including:

for phylogram (by default) and cladogram if user explicitly setting branch.length=‘none’. Unrooted (equal angle and daylight methods), time-scaled and 2-dimensional layouts are also supported.

Time-scaled tree

A phylogenetic tree can be scaled by time (time-scaled tree) by specifying the parameter, mrsd (most recent sampling date).

Two dimensional tree

ggtree implemented two dimensional tree. It accepts parameter yscale to scale the y-axis based on the selected tree attribute. The attribute should be numerical variable. If it is character/category variable, user should provides a name vector of mapping the variable to numeric by passing it to parameter yscale_mapping.

In this example, the figure demonstrates the quantity of y increase along the trunk. User can highlight the trunk with different line size or color using the functions described in Tree Manipulation vignette.

Displaying tree scale (evolution distance)

To show tree scale, user can use geom_treescale() layer.

geom_treescale() supports the following parameters:

We can also use theme_tree2() to display the tree scale by adding x axis.

Tree scale is not restricted to evolution distance, ggtree can re-scale the tree with other numerical variable. More details can be found in the Tree Annotation vignette.

Displaying nodes/tips

Showing all the internal nodes and tips in the tree can be done by adding a layer of points using geom_nodepoint, geom_tippoint or geom_point.

Displaying labels

Users can use geom_text or geom_label to display the node (if available) and tip labels simultaneously or geom_tiplab to only display tip labels:

geom_tiplab not only supports using text or label geom to display labels, it also supports image geom to label tip with image files. A corresponding geom, geom_nodelab is also provided for displaying node labels. For details of label nodes with images, please refer to the vignette, Annotating phylogenetic tree with images.

For circular and unrooted layout, ggtree supports rotating node labels according to the angles of the branches.

To make it more readable for human eye, ggtree provides a geom_tiplab2 for circular layout (see post 1 and 2).

By default, the positions are based on the node positions, we can change them to based on the middle of the branch/edge.

Based on the middle of branch is very useful when annotating transition from parent node to child node.

Update tree view with a new tree

In previous example, we have a p object that stored the tree viewing of 13 tips and internal nodes highlighted with specific colored big dots. If users want to apply this pattern (we can imaging a more complex one) to a new tree, you don’t need to build the tree step by step. ggtree provides an operator, %<%, for applying the visualization pattern to a new tree.

For example, the pattern in the p object will be applied to a new tree with 50 tips as shown below:

Theme

theme_tree() defined a totally blank canvas, while theme_tree2() adds phylogenetic distance (via x-axis). These two themes all accept a parameter of bgcolor that defined the background color. Users can pass any theme components to the theme_tree() function to modify them.

Visualize a list of trees

ggtree supports multiPhylo object and a list of trees can be viewed simultaneously.

One hundred bootstrap trees can also be view simultaneously.

Another way to view the bootstrap trees is to merge them together to form a density tree. We can add a layer of the best tree on the top of the density tree.

Rescale tree

Most of the phylogenetic trees are scaled by evolutionary distance (substitution/site). In ggtree, users can re-scale a phylogenetic tree by any numerical variable inferred by evolutionary analysis (e.g. dN/dS).

## 'treedata' S4 object that stored information of
##  '/tmp/Rtmpg627dd/Rinst132e6c545719/ggtree/examples/MCC_FluA_H3.tree'.
## 
## ...@ phylo: 
## Phylogenetic tree with 76 tips and 75 internal nodes.
## 
## Tip labels:
##  A/Hokkaido/30-1-a/2013, A/New_York/334/2004, A/New_York/463/2005, A/New_York/452/1999, A/New_York/238/2005, A/New_York/523/1998, ...
## 
## Rooted; includes branch lengths.
## 
## with the following features available:
##  'height',   'height_0.95_HPD',  'height_median',    'height_range', 'length',
##  'length_0.95_HPD',  'length_median',    'length_range', 'posterior',    'rate',
##  'rate_0.95_HPD',    'rate_median',  'rate_range'.

In addition to specify branch.length in tree visualization, users can change branch length stored in tree object by using rescale_tree function.

Zoom on a portion of tree

ggtree provides gzoom function that similar to zoom function provided in ape. This function plots simultaneously a whole phylogenetic tree and a portion of it. It aims at exploring very large trees.

Zoom in selected clade of a tree that was already annotated with ggtree is also supported.

Color tree

In ggtree, coloring phylogenetic tree is easy, by using aes(color=VAR) to map the color of tree based on a specific variable (numeric and category are both supported).

User can use any feature (if available), including clade posterior and dN/dS etc., to scale the color of the tree.

References

Wickham, Hadley. 2009. Ggplot2: Elegant Graphics for Data Analysis. 1st ed. Springer.

Yu, Guangchuang, David K. Smith, Huachen Zhu, Yi Guan, and Tommy Tsan-Yuk Lam. 2017. “Ggtree: An R Package for Visualization and Annotation of Phylogenetic Trees with Their Covariates and Other Associated Data.” Methods in Ecology and Evolution 8 (1):28–36. https://doi.org/10.1111/2041-210X.12628.