1. Anatomy of an MPSE
MicrobiotaProcess introduces the MPSE S4 class, which inherits from the SummarizedExperiment (Morgan et al. 2021) class. The assays slot stores the rectangular abundance matrices of features from a microbiome experiment. The colData slot stores the sample metadata, together with sample-level results produced by downstream analysis. The rowData slot stores the feature metadata, together with feature-level results produced by downstream analysis. Compared with a SummarizedExperiment object, MPSE introduces the following additional slots:
- taxatree: a treedata (Wang et al. 2020; Yu 2021) object, which combines a phylo object (the hierarchical structure) with a tibble (the associated data), used to store the taxonomy information. The tip labels of the taxonomy tree correspond to the rows of the assays, and the internal node labels are the taxa at the different taxonomic levels to which those rows belong. The tibble holds the taxonomic classification of the node labels.
- otutree: also a treedata object, used to store the phylogenetic tree (based on the reference sequences) and its associated data. Its tip labels also correspond to the rows of the assays.
- refseq: an XStringSet (Pagès et al. 2021) object containing the reference sequences, whose names are identical to the rows of the assays.
The structure of the MPSE class.
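The class layout described above can be checked on a real object. The following is a minimal sketch, assuming MicrobiotaProcess and SummarizedExperiment are installed and that the example dataset mouse.time.mpse shipped with MicrobiotaProcess is available; it only inspects the class definition and does not assume that every optional slot is populated in that dataset.

```r
library(MicrobiotaProcess)
library(SummarizedExperiment)

# Example MPSE object distributed with MicrobiotaProcess (assumed available)
data(mouse.time.mpse)

# MPSE extends SummarizedExperiment, so the inherited accessors apply
inherits_se <- is(mouse.time.mpse, "SummarizedExperiment")

# Slots defined on the object; the MPSE-specific ones are
# taxatree, otutree and refseq
mpse_slots <- slotNames(mouse.time.mpse)
extra_slots <- intersect(c("taxatree", "otutree", "refseq"), mpse_slots)

print(inherits_se)
print(extra_slots)

# Names of the abundance matrices stored in the assays slot
print(assayNames(mouse.time.mpse))
```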
2. Overview of the design of the MicrobiotaProcess package
With this data structure, MicrobiotaProcess is more interoperable with the existing computing ecosystem. For example, the slots inherited from SummarizedExperiment can be extracted with the methods provided by SummarizedExperiment. The taxatree and otutree can be extracted with mp_extract_tree, and they are compatible with the ggtree (Yu et al. 2017), ggtreeExtra (Xu et al. 2021), treeio (Wang et al. 2020) and tidytree (Yu 2021) ecosystem, because both are treedata objects, a data structure that these packages use directly.
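The extraction routes mentioned above can be sketched as follows. This is an illustrative example only, assuming the example dataset mouse.time.mpse from MicrobiotaProcess; the variable names (abund, samples, features, taxa_tree) are chosen here for illustration.

```r
library(MicrobiotaProcess)
library(SummarizedExperiment)

data(mouse.time.mpse)

# Inherited slots, read with the SummarizedExperiment accessors
abund    <- assay(mouse.time.mpse, 1)   # first abundance matrix
samples  <- colData(mouse.time.mpse)    # sample metadata
features <- rowData(mouse.time.mpse)    # feature metadata

# One row per feature in both the assay and the feature metadata,
# one column of the assay per sample
print(dim(abund))
print(nrow(features))
print(nrow(samples))

# MPSE-specific slot, read with mp_extract_tree; the result is a
# treedata object that tree packages such as ggtree can consume
taxa_tree <- mp_extract_tree(mouse.time.mpse, type = "taxatree")
print(class(taxa_tree))
```

Because taxa_tree is a treedata object, it could be passed on to ggtree or tidytree functions without conversion; that step is left out here to keep the sketch free of plotting dependencies.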
Moreover, the results of upstream microbiome analysis produced by tools such as qiime2 (Bolyen et al. 2019), dada2 (Callahan et al. 2016) and MetaPhlAn (Beghini et al. 2021) can be loaded into the MPSE class, and other classes used to store microbiome results (SummarizedExperiment (Morgan et al. 2021), phyloseq (McMurdie and Holmes 2013) and TreeSummarizedExperiment (Huang et al. 2021)) can be converted to it.
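As one illustration of such a conversion, the sketch below turns a phyloseq object into an MPSE object with as.MPSE. It assumes that the phyloseq package is installed and uses its bundled GlobalPatterns dataset purely as a stand-in for real study data; results imported from qiime2, dada2 or MetaPhlAn files are not shown, because they depend on file paths that are specific to each project.

```r
library(MicrobiotaProcess)
library(phyloseq)

# Example phyloseq object bundled with the phyloseq package
# (a stand-in for a user's own data)
data(GlobalPatterns)

# Convert the phyloseq object to the MPSE class
mpse <- as.MPSE(GlobalPatterns)

print(class(mpse))
print(dim(mpse))
```

For output files from upstream tools, the package exports importers such as mp_import_qiime2, mp_import_dada2 and mp_import_metaphlan; their arguments are best taken from the package documentation rather than from this sketch.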
In addition, MicrobiotaProcess introduces a tidy microbiome data structure paradigm and analysis grammar. It provides a wide variety of microbiome analysis procedures under a unified, tidy-like framework. We believe MicrobiotaProcess can improve the efficiency of related research, and it also bridges microbiome data analysis with the tidyverse (Wickham et al. 2019).
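To give a flavour of this tidy-like grammar, the following sketch chains two analysis steps on the example dataset and then pulls the sample-level results out as a tibble. It assumes R 4.1 or later (for the native pipe) and the mouse.time.mpse dataset; the choice of steps (rarefaction followed by alpha diversity) is only an example, not a recommended workflow.

```r
library(MicrobiotaProcess)

data(mouse.time.mpse)

# Rarefaction subsamples reads at random, so fix the seed
set.seed(1024)

# Each mp_* step takes an MPSE object and returns an MPSE object,
# so steps can be chained with a pipe
mpse_alpha <- mouse.time.mpse |>
  mp_rrarefy() |>                            # adds a rarefied abundance assay
  mp_cal_alpha(.abundance = RareAbundance)   # adds alpha-diversity indices to the sample data

# Sample metadata plus the newly computed indices, as a tibble
alpha_tbl <- mp_extract_sample(mpse_alpha)
print(colnames(alpha_tbl))
```

Since alpha_tbl is an ordinary tibble, it can be filtered, summarised or plotted with the usual tidyverse tools.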