To view a phylogenetic tree, we first need to parse the tree file into R. The ggtree package supports many file formats, including the output files of software packages commonly used in evolutionary biology. For more details, please refer to the Tree Data Import vignette.
library("ggtree")
nwk <- system.file("extdata", "sample.nwk", package="treeio")
tree <- read.tree(nwk)
Viewing a phylogenetic tree with ggtree
The ggtree package extends the ggplot2 package to support viewing phylogenetic trees. It implements a geom_tree layer for displaying a phylogenetic tree.
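As a minimal sketch of the layer-based approach, the block below builds the plot from ggplot() and geom_tree() directly. It assumes the ggtree, ggplot2, treeio and ape packages are installed; calling ape::read.tree explicitly and finishing with theme_tree() are choices made for this illustration.

```r
library(ggplot2)
library(ggtree)

# Sample Newick file shipped with treeio (the same file used above)
nwk <- system.file("extdata", "sample.nwk", package = "treeio")
tree <- ape::read.tree(nwk)

# ggtree provides a fortify method for phylo objects, so the tree can be
# passed to ggplot(); geom_tree() then draws the branches
p <- ggplot(tree, aes(x, y)) + geom_tree() + theme_tree()
p
```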
The ggtree function was implemented as a shortcut to visualize a tree; it produces the same plot as building one from the geom_tree layer directly.
ggtree takes full advantage of ggplot2. For example, we can change the color, size and type of the lines just as we do with ggplot2.
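The following sketch shows one way to style the branches. The specific values ("firebrick", 1, "dotted") are arbitrary choices for illustration, and depending on the installed ggplot2 version the size argument may trigger a deprecation message for line widths.

```r
library(ggtree)

tree <- ape::read.tree(system.file("extdata", "sample.nwk", package = "treeio"))

# Extra arguments are passed on to the tree layer as line properties
p <- ggtree(tree, color = "firebrick", size = 1, linetype = "dotted")
p
```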
By default, the tree is displayed in ladderized form; users can set the parameter ladderize = FALSE to disable this.
The branch.length parameter is used to scale the edges. Users can set branch.length = "none" to view only the tree topology (cladogram), or pass another numerical variable to scale the tree (e.g. dN/dS; see also the Tree Annotation vignette).
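A short sketch of both parameters, assuming the sample tree shipped with treeio:

```r
library(ggtree)

tree <- ape::read.tree(system.file("extdata", "sample.nwk", package = "treeio"))

# Keep the node order of the input tree instead of ladderizing it
p_plain <- ggtree(tree, ladderize = FALSE)

# Ignore branch lengths and show only the topology (cladogram)
p_clado <- ggtree(tree, branch.length = "none")

p_plain
p_clado
```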
Layout
Currently, ggtree supports several layouts, including rectangular (the default), slanted, circular and fan, for both Phylogram (the default) and Cladogram (when the user explicitly sets branch.length='none'). ggtree also supports the unrooted layout.
Phylogram: rectangular, slanted, circular, fan
Cladogram: rectangular, slanted, circular, fan
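The layout combinations listed above could be generated along these lines. This is a sketch only: the list names phylograms and cladograms are introduced here for illustration.

```r
library(ggtree)

tree <- ape::read.tree(system.file("extdata", "sample.nwk", package = "treeio"))

layouts <- c("rectangular", "slanted", "circular", "fan")

# Phylograms: edges are scaled by branch length (the default)
phylograms <- lapply(layouts, function(l) ggtree(tree, layout = l))

# Cladograms: topology only
cladograms <- lapply(layouts, function(l) {
    ggtree(tree, layout = l, branch.length = "none")
})

names(phylograms) <- names(cladograms) <- layouts

phylograms$circular
cladograms$slanted
```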
Unrooted
The unrooted layout is implemented with the equal-angle algorithm described in Inferring Phylogenies [1].
## Average angle change [ 1 ] 0.2206985
## Average angle change [ 2 ] 0.0397772
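A hedged sketch of drawing an unrooted tree. It assumes a recent ggtree release, in which the unrooted algorithms are selected by name ("equal_angle" or "daylight"); older releases used layout = "unrooted" instead. The daylight method reports its progress on the console while it iterates.

```r
library(ggtree)

tree <- ape::read.tree(system.file("extdata", "sample.nwk", package = "treeio"))

# Equal-angle algorithm
p_ea <- ggtree(tree, layout = "equal_angle")

# Daylight algorithm, an iterative refinement (assumes a recent ggtree)
p_dl <- ggtree(tree, layout = "daylight")

p_ea
```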
Time-scaled tree
A phylogenetic tree can be scaled by time (a time-scaled tree) by specifying the parameter mrsd (most recent sampling date).
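To illustrate mrsd, the sketch below uses a simulated tree as a stand-in for real dated data, so its branch lengths are simply interpreted as years. The seed, the tree size and the choice of theme_tree2() to expose the time axis are assumptions of this example.

```r
library(ggtree)

set.seed(42)
tree <- ape::rtree(10)  # simulated stand-in tree, not real data

# The most recent tip is anchored at the given sampling date;
# theme_tree2() shows the resulting time axis
p <- ggtree(tree, mrsd = "2014-05-01") + theme_tree2()
p
```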
Two dimensional tree
ggtree implements a two-dimensional tree. It accepts the parameter yscale to scale the y-axis based on a selected tree attribute. The attribute should be a numerical variable. If it is a character/categorical variable, the user should provide a named vector that maps the variable to numeric values by passing it to the parameter yscale_mapping.
# tree2d is assumed to be a tree object that carries an "NGS" attribute
ggtree(tree2d, mrsd = "2014-05-01",
       yscale = "NGS", yscale_mapping = c(N2=2, N3=3, N4=4, N5=5, N6=6, N7=7)) +
    theme_classic() + theme(axis.line.x = element_line(), axis.line.y = element_line()) +
    theme(panel.grid.major.x = element_line(color="grey20", linetype="dotted", size=.3),
          panel.grid.major.y = element_blank()) +
    scale_y_continuous(labels = paste0("N", 2:7))
In this example, the figure shows the value of y increasing along the trunk. Users can highlight the trunk with a different line size or color using the functions described in the Tree Manipulation vignette.
Displaying tree scale (evolution distance)
To show the tree scale, users can add a geom_treescale() layer.
geom_treescale() supports the following parameters:
x and y for tree scale position
width for the length of the tree scale
fontsize for the size of the text
linesize for the size of the line
offset for relative position of the line and the text
color for color of the tree scale
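The parameters above can be combined as in the following sketch. The numeric values are arbitrary and were picked only to suit the sample tree.

```r
library(ggtree)

tree <- ape::read.tree(system.file("extdata", "sample.nwk", package = "treeio"))

# Default tree scale
p_default <- ggtree(tree) + geom_treescale()

# Explicit position, length and color
p_placed <- ggtree(tree) + geom_treescale(x = 0, y = 12, width = 6, color = "red")

# Larger text and line, with the text shifted relative to the line
p_styled <- ggtree(tree) + geom_treescale(fontsize = 8, linesize = 2, offset = -1)

p_placed
```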
We can also use theme_tree2() to display the tree scale by adding an x-axis.
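A minimal sketch of the axis-based alternative:

```r
library(ggtree)

tree <- ape::read.tree(system.file("extdata", "sample.nwk", package = "treeio"))

# theme_tree2() keeps the x-axis, which then serves as the tree scale
p <- ggtree(tree) + theme_tree2()
p
```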
The tree scale is not restricted to evolutionary distance; ggtree can re-scale the tree with other numerical variables. More details can be found in the Tree Annotation vignette.
Displaying nodes/tips
All the internal nodes and tips in the tree can be shown by adding a layer of points using geom_nodepoint, geom_tippoint or geom_point.
p <- ggtree(tree) + geom_nodepoint(color="#b5e521", alpha=1/4, size=10)
p + geom_tippoint(color="#FDAC4F", shape=8, size=3)
Displaying labels
Users can use geom_text to display the node labels (if available) and tip labels simultaneously, or geom_tiplab to display only the tip labels.
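Both options might look like the sketch below. The size, color and vjust values are illustrative; nodes without a label are simply skipped by geom_text, possibly with a warning about removed rows.

```r
library(ggplot2)
library(ggtree)

tree <- ape::read.tree(system.file("extdata", "sample.nwk", package = "treeio"))
p <- ggtree(tree)

# Labels for every node that has one (internal nodes and tips)
p_all <- p + geom_text(aes(label = label), size = 3, color = "purple", vjust = -0.3)

# Tip labels only
p_tips <- p + geom_tiplab(size = 3, color = "purple")

p_tips
```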
For circular and unrooted layouts, ggtree supports rotating node labels according to the angles of the branches.
To make the labels easier to read, ggtree provides geom_tiplab2 for the circular layout (see posts 1 and 2).
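A sketch of label rotation on a circular tree. It relies on the angle column that ggtree adds to the plot data, which is taken here as an assumption about the installed version; the colors and sizes are arbitrary.

```r
library(ggplot2)
library(ggtree)

tree <- ape::read.tree(system.file("extdata", "sample.nwk", package = "treeio"))
p <- ggtree(tree, layout = "circular")

# Rotate each label by the per-node angle computed by ggtree
p_rot <- p + geom_text(aes(label = label, angle = angle), color = "blue", size = 3)

# geom_tiplab2() also flips the labels on the left half of the circle
# so that they are not drawn upside down
p_tip2 <- p + geom_tiplab2(color = "blue")

p_tip2
```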
By default, label positions are based on the node positions; we can change them to be based on the middle of the branch/edge.
Positioning labels at the middle of a branch is very useful when annotating the transition from a parent node to a child node.
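One way to place labels at branch midpoints is to map x to the branch column of the plot data, as sketched below. Using plain geom_text for this, rather than a dedicated ggtree layer, is a choice made for this example.

```r
library(ggplot2)
library(ggtree)

tree <- ape::read.tree(system.file("extdata", "sample.nwk", package = "treeio"))
p <- ggtree(tree)

# branch holds the x position halfway between a node and its parent
p_mid <- p + geom_text(aes(x = branch, label = label),
                       size = 3, color = "purple", vjust = -0.3)
p_mid
```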
Update tree view with a new tree
In the previous example, we have a p object that stores the tree view of 13 tips, with the internal nodes highlighted by large colored dots. If users want to apply this pattern (we can imagine a more complex one) to a new tree, they don't need to rebuild the plot step by step. ggtree provides an operator, %<%, for applying the visualization pattern to a new tree.
For example, the pattern in the p object can be applied to a new tree with 50 tips.
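A sketch of that replacement, using a simulated 50-tip tree; the seed and the name p50 are assumptions of this example.

```r
library(ggtree)

tree <- ape::read.tree(system.file("extdata", "sample.nwk", package = "treeio"))

# The original view: 13 tips, internal nodes highlighted
p <- ggtree(tree) + geom_nodepoint(color = "#b5e521", alpha = 1/4, size = 10)

# Swap in a new tree while keeping all the layers of p
set.seed(1)
p50 <- p %<% ape::rtree(50)
p50
```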
Another example can be found in the Tree Data Import vignette.
Theme
theme_tree() defines a totally blank canvas, while theme_tree2() adds phylogenetic distance (via the x-axis). Both themes accept a bgcolor parameter that defines the background color.
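The two themes can be compared with a sketch like this one, which draws a simulated tree in white on a colored background. The tree size, seed and color are arbitrary.

```r
library(ggtree)

set.seed(7)
tree <- ape::rtree(25)  # simulated tree for illustration

# Blank canvas with a custom background
p_blank <- ggtree(tree, color = "white") + theme_tree(bgcolor = "steelblue")

# Same background, plus an x-axis showing distance
p_axis <- ggtree(tree, color = "white") + theme_tree2(bgcolor = "steelblue")

p_blank
p_axis
```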
Visualize a list of trees
ggtree supports multiPhylo objects, so a list of trees can be viewed simultaneously.
trees <- lapply(c(10, 20, 40), rtree)
class(trees) <- "multiPhylo"
ggtree(trees) + facet_wrap(~.id, scales="free") + geom_tiplab()
One hundred bootstrap trees can also be viewed simultaneously.
btrees <- read.tree(system.file("extdata/RAxML", "RAxML_bootstrap.H3", package="treeio"))
ggtree(btrees) + facet_wrap(~.id, ncol=10)
Another way to view the bootstrap trees is to merge them together to form a density tree. We can add a layer of the best tree on top of the density tree.
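A rough sketch of the overlay idea. Because the best-tree file is not named in this section, the first bootstrap tree is used as a hypothetical stand-in for it; the colors and transparency are arbitrary, and no attempt is made here to align tip order across trees.

```r
library(ggplot2)
library(ggtree)

btrees <- ape::read.tree(system.file("extdata/RAxML", "RAxML_bootstrap.H3",
                                     package = "treeio"))

# Draw all bootstrap trees on one panel, semi-transparent
p <- ggtree(btrees, layout = "rectangular", color = "lightblue", alpha = .3)

# Stand-in for the best tree (assumption: replace with the real best tree)
best_tree <- btrees[[1]]
df <- fortify(best_tree)

# Overlay the chosen tree on top of the density tree
p_density <- p + geom_tree(data = df, color = "firebrick")
p_density
```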
References
1. Felsenstein, J. Inferring phylogenies. (Sinauer Associates, 2003).