Key Features of `MultiAssayExperiment`

Component slots

colData - biological units

A DataFrame describing the characteristics of the biological units. In The Cancer Genome Atlas data, for example, the biological units are patients.

Key points:

One row per patient
Zero or more observations in each experiment

pheno <- DataFrame(id = 1:4, type = c("a", "a", "b", "b"),
                   sex = c("M", "F", "M", "F"),
                   row.names = c("Bob", "Sandy", "Jake", "Lauren"))

ExperimentList - experiment data

A base list or ExperimentList object containing the experimental datasets for the set of samples collected. A base list is converted to class ExperimentList during construction.

Key points:

Included data classes must support: [, dimnames, dim
Genomic range-based or ID-based data
Supports an open-ended set of data classes

dataset1 <- matrix(rnorm(20, 5, 1), ncol = 5,
                  dimnames = list(paste0("GENE", 4:1),
                                  paste0("sample", LETTERS[1:5])))
dataset2 <- matrix(rnorm(12, 3, 2), ncol = 3,
                   dimnames = list(paste0("ENST0000", 1:4),
                                   paste0("samp", letters[1:3])))

expList <- list(exp1 = dataset1, exp2 = dataset2)
expList

## $exp1
##        sampleA  sampleB  sampleC  sampleD  sampleE
## GENE4 3.720379 4.265752 4.162123 4.733160 4.088411
## GENE3 4.611418 5.330292 3.975553 3.765205 3.353615
## GENE2 5.756995 4.737932 5.380220 5.267235 4.974819
## GENE1 4.578920 4.355404 2.792270 3.555878 4.809017
## 
## $exp2
##              sampa        sampb       sampc
## ENST00001 6.374843  3.665216509  5.03255396
## ENST00002 2.078270  7.911929690  0.01464498
## ENST00003 3.923976  5.670601129 -2.35608685
## ENST00004 3.141602 -0.009088183  2.64769758
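
The requirements listed above can be checked before construction. The sketch below is only an illustration: `hasRequiredMethods` is a hypothetical helper, not part of the package, and the seed is arbitrary. It tests each dataset for dimensions, dimnames, and single bracket subsetting, then converts the base list explicitly with the `ExperimentList` constructor.

```r
library(MultiAssayExperiment)

set.seed(42)  # arbitrary seed so the random matrices are reproducible
expList <- list(
  exp1 = matrix(rnorm(20, 5, 1), ncol = 5,
                dimnames = list(paste0("GENE", 4:1),
                                paste0("sample", LETTERS[1:5]))),
  exp2 = matrix(rnorm(12, 3, 2), ncol = 3,
                dimnames = list(paste0("ENST0000", 1:4),
                                paste0("samp", letters[1:3]))))

# Hypothetical helper (not a package function): TRUE when an object has
# dimensions, names on both margins, and supports two-dimensional `[`.
hasRequiredMethods <- function(x) {
  !is.null(dim(x)) &&
    !is.null(rownames(x)) && !is.null(colnames(x)) &&
    identical(dim(x[1, 1, drop = FALSE]), c(1L, 1L))
}
supported <- vapply(expList, hasRequiredMethods, logical(1))
supported

# Explicit conversion; the constructor performs this step for a base list
el <- ExperimentList(expList)
names(el)
```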

sampleMap - relationship graph

A DataFrame that represents, as a graph, the relationship between the experiments (assay), the biological units (primary), and the samples (colname). Helper functions are available for creating a map from a list; see ?listToMap.

Key points:

Relates experimental observations (colnames) to colData
Permits experiment-specific sample naming, missing, and replicate observations

map1 <- DataFrame(primary = c("Bob", "Jake", "Sandy", "Sandy", "Lauren"),
                  colname = paste0("sample", LETTERS[1:5]))
map2 <- DataFrame(primary = c("Jake", "Sandy", "Lauren"),
                  colname = paste0("samp", letters[1:3]))
sampMap <- listToMap(list(exp1 = map1, exp2 = map2))
sampMap

## DataFrame with 8 rows and 3 columns
##      assay     primary     colname
##   <factor> <character> <character>
## 1     exp1         Bob     sampleA
## 2     exp1        Jake     sampleB
## 3     exp1       Sandy     sampleC
## 4     exp1       Sandy     sampleD
## 5     exp1      Lauren     sampleE
## 6     exp2        Jake       sampa
## 7     exp2       Sandy       sampb
## 8     exp2      Lauren       sampc
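
To see how the map encodes missing and replicate observations, the rows can be cross-tabulated by biological unit and assay. This is a minimal sketch using base R's `table` on the map built above; the name `obsCounts` is introduced here only for illustration.

```r
library(MultiAssayExperiment)  # also attaches S4Vectors, which provides DataFrame

map1 <- DataFrame(primary = c("Bob", "Jake", "Sandy", "Sandy", "Lauren"),
                  colname = paste0("sample", LETTERS[1:5]))
map2 <- DataFrame(primary = c("Jake", "Sandy", "Lauren"),
                  colname = paste0("samp", letters[1:3]))
sampMap <- listToMap(list(exp1 = map1, exp2 = map2))

# Number of observations for each biological unit in each assay:
# 0 marks a missing observation, values above 1 mark replicates.
obsCounts <- table(primary = sampMap$primary, assay = sampMap$assay)
obsCounts
```

In this map, Sandy has two observations in exp1 (replicates) and Bob has none in exp2 (missing).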

MultiAssayExperiment - class constructor function

The MultiAssayExperiment constructor function can take three arguments:

experiments - An ExperimentList or list of data
colData - A DataFrame describing the biological units
sampleMap - A DataFrame of assay, primary, and colname identifiers

(mae <- MultiAssayExperiment(expList, pheno, sampMap))

## A MultiAssayExperiment object of 2 listed
##  experiments with user-defined names and respective classes. 
##  Containing an ExperimentList class object of length 2: 
##  [1] exp1: matrix with 4 rows and 5 columns 
##  [2] exp2: matrix with 4 rows and 3 columns 
## Features: 
##  experiments() - obtain the ExperimentList instance 
##  colData() - the primary/phenotype DataFrame 
##  sampleMap() - the sample availability DataFrame 
##  `$`, `[`, `[[` - extract colData columns, subset, or experiment 
##  *Format() - convert ExperimentList into a long or wide DataFrame 
##  assays() - convert ExperimentList to a SimpleList of matrices

Subsetting

Single bracket `[`

In the pseudocode below, the subsetting operations work on the following indices:

1. i: experimental data rows
2. j: the primary names or the column names (entered as a list or List)
3. k: assay

multiassayexperiment[i = rownames, j = primary or colnames, k = assay]

Examples:

mae[c("GENE4", "ENST00002"), , ]

## A MultiAssayExperiment object of 2 listed
##  experiments with user-defined names and respective classes. 
##  Containing an ExperimentList class object of length 2: 
##  [1] exp1: matrix with 1 rows and 5 columns 
##  [2] exp2: matrix with 1 rows and 3 columns 
## Features: 
##  experiments() - obtain the ExperimentList instance 
##  colData() - the primary/phenotype DataFrame 
##  sampleMap() - the sample availability DataFrame 
##  `$`, `[`, `[[` - extract colData columns, subset, or experiment 
##  *Format() - convert ExperimentList into a long or wide DataFrame 
##  assays() - convert ExperimentList to a SimpleList of matrices

mae[, c("Bob", "Jake", "Sandy"), ]

## harmonizing input:
##   removing 2 sampleMap rows with 'colname' not in colnames of experiments
##   removing 1 colData rownames not in sampleMap 'primary'

## A MultiAssayExperiment object of 2 listed
##  experiments with user-defined names and respective classes. 
##  Containing an ExperimentList class object of length 2: 
##  [1] exp1: matrix with 4 rows and 4 columns 
##  [2] exp2: matrix with 4 rows and 2 columns 
## Features: 
##  experiments() - obtain the ExperimentList instance 
##  colData() - the primary/phenotype DataFrame 
##  sampleMap() - the sample availability DataFrame 
##  `$`, `[`, `[[` - extract colData columns, subset, or experiment 
##  *Format() - convert ExperimentList into a long or wide DataFrame 
##  assays() - convert ExperimentList to a SimpleList of matrices

mae[, , "exp1"]

## A MultiAssayExperiment object of 1 listed
##  experiment with a user-defined name and respective class. 
##  Containing an ExperimentList class object of length 1: 
##  [1] exp1: matrix with 4 rows and 5 columns 
## Features: 
##  experiments() - obtain the ExperimentList instance 
##  colData() - the primary/phenotype DataFrame 
##  sampleMap() - the sample availability DataFrame 
##  `$`, `[`, `[[` - extract colData columns, subset, or experiment 
##  *Format() - convert ExperimentList into a long or wide DataFrame 
##  assays() - convert ExperimentList to a SimpleList of matrices
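
The indices can also be combined in a single call. The following sketch rebuilds the example object (with an arbitrary seed) and selects one biological unit in one assay; it assumes that `j` is given as a character vector of primary names, as in the example above.

```r
library(MultiAssayExperiment)

set.seed(42)  # arbitrary seed so the random matrices are reproducible
pheno <- DataFrame(id = 1:4, type = c("a", "a", "b", "b"),
                   sex = c("M", "F", "M", "F"),
                   row.names = c("Bob", "Sandy", "Jake", "Lauren"))
expList <- list(
  exp1 = matrix(rnorm(20, 5, 1), ncol = 5,
                dimnames = list(paste0("GENE", 4:1),
                                paste0("sample", LETTERS[1:5]))),
  exp2 = matrix(rnorm(12, 3, 2), ncol = 3,
                dimnames = list(paste0("ENST0000", 1:4),
                                paste0("samp", letters[1:3]))))
sampMap <- listToMap(list(
  exp1 = DataFrame(primary = c("Bob", "Jake", "Sandy", "Sandy", "Lauren"),
                   colname = paste0("sample", LETTERS[1:5])),
  exp2 = DataFrame(primary = c("Jake", "Sandy", "Lauren"),
                   colname = paste0("samp", letters[1:3]))))
mae <- MultiAssayExperiment(expList, pheno, sampMap)

# j selects biological units (colData rownames); k selects assays.
# Sandy has two observations in exp1, so both of her columns are kept.
sandyExp1 <- mae[, "Sandy", "exp1"]
colnames(sandyExp1[["exp1"]])
rownames(colData(sandyExp1))
```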

Double bracket `[[`

The “double bracket” method ([[) is a convenience function for extracting a single element of the MultiAssayExperiment ExperimentList. It avoids the use of experiments(mae)[[1L]]. For example:

mae[[1L]]

##        sampleA  sampleB  sampleC  sampleD  sampleE
## GENE4 3.720379 4.265752 4.162123 4.733160 4.088411
## GENE3 4.611418 5.330292 3.975553 3.765205 3.353615
## GENE2 5.756995 4.737932 5.380220 5.267235 4.974819
## GENE1 4.578920 4.355404 2.792270 3.555878 4.809017

This extracts the first experiment in the ExperimentList, in the class in which it was stored.

Extraction

assay and assays

The assay and assays methods follow SummarizedExperiment convention. The assay (singular) method will extract the first element of the ExperimentList and will return a matrix.

assay(mae)

##        sampleA  sampleB  sampleC  sampleD  sampleE
## GENE4 3.720379 4.265752 4.162123 4.733160 4.088411
## GENE3 4.611418 5.330292 3.975553 3.765205 3.353615
## GENE2 5.756995 4.737932 5.380220 5.267235 4.974819
## GENE1 4.578920 4.355404 2.792270 3.555878 4.809017

The assays (plural) method will return a SimpleList of the data with each element being a matrix.

assays(mae)

## List of length 2
## names(2): exp1 exp2

Slot accession

Each slot in the MultiAssayExperiment has a convenient accessor function. See the table below.

Slot	Accessor
`ExperimentList`	`experiments`
`colData`	`colData` / `$` *
`sampleMap`	`sampleMap`
`metadata`	`metadata`

* The $ operator on a MultiAssayExperiment will return a single column of colData. For example:

mae$sex

## [1] "M" "F" "M" "F"
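
The accessors in the table can be tried side by side. This sketch rebuilds the example object from the earlier sections (the seed is arbitrary) and calls each accessor once.

```r
library(MultiAssayExperiment)

set.seed(42)  # arbitrary seed so the random matrices are reproducible
pheno <- DataFrame(id = 1:4, type = c("a", "a", "b", "b"),
                   sex = c("M", "F", "M", "F"),
                   row.names = c("Bob", "Sandy", "Jake", "Lauren"))
expList <- list(
  exp1 = matrix(rnorm(20, 5, 1), ncol = 5,
                dimnames = list(paste0("GENE", 4:1),
                                paste0("sample", LETTERS[1:5]))),
  exp2 = matrix(rnorm(12, 3, 2), ncol = 3,
                dimnames = list(paste0("ENST0000", 1:4),
                                paste0("samp", letters[1:3]))))
sampMap <- listToMap(list(
  exp1 = DataFrame(primary = c("Bob", "Jake", "Sandy", "Sandy", "Lauren"),
                   colname = paste0("sample", LETTERS[1:5])),
  exp2 = DataFrame(primary = c("Jake", "Sandy", "Lauren"),
                   colname = paste0("samp", letters[1:3]))))
mae <- MultiAssayExperiment(expList, pheno, sampMap)

names(experiments(mae))  # names of the stored experiments
dim(colData(mae))        # one row per biological unit
nrow(sampleMap(mae))     # one row per mapped observation
metadata(mae)            # metadata supplied at construction, if any

# `$` is a shortcut for a colData column
identical(mae$sex, colData(mae)$sex)
```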

Transformations

`longFormat` & `wideFormat`

The longFormat and wideFormat functions “reshape” and combine your data into one DataFrame. The long format has one row per measured value:

longFormat(mae)

## DataFrame with 32 rows and 5 columns
##     assay     rowname colname        value primary
##     <Rle> <character>   <Rle>    <numeric>   <Rle>
## 1    exp1       GENE4 sampleA     3.720379     Bob
## 2    exp1       GENE3 sampleA     4.611418     Bob
## 3    exp1       GENE2 sampleA     5.756995     Bob
## 4    exp1       GENE1 sampleA     4.578920     Bob
## 5    exp1       GENE4 sampleB     4.265752    Jake
## ...   ...         ...     ...          ...     ...
## 28   exp2   ENST00004   sampb -0.009088183   Sandy
## 29   exp2   ENST00001   sampc  5.032553957  Lauren
## 30   exp2   ENST00002   sampc  0.014644979  Lauren
## 31   exp2   ENST00003   sampc -2.356086852  Lauren
## 32   exp2   ENST00004   sampc  2.647697584  Lauren

For a wide dataset, use the wideFormat function.

wideFormat(mae)[, 1:4]

## DataFrame with 4 rows and 4 columns
##    primary exp1_GENE1_sampleA exp1_GENE1_sampleB exp1_GENE1_sampleC
##   <factor>          <numeric>          <numeric>          <numeric>
## 1      Bob            4.57892                 NA                 NA
## 2     Jake                 NA           4.355404                 NA
## 3   Lauren                 NA                 NA                 NA
## 4    Sandy                 NA                 NA            2.79227
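
To make the long reshaping concrete, the sketch below reproduces the idea in base R only. It is not the package's implementation: `toLong` is a hypothetical helper, the map is a plain data.frame, and the seed is arbitrary. Each matrix is melted into (rowname, colname, value) triples, and the biological unit is then looked up in the map.

```r
set.seed(42)  # arbitrary seed so the random matrices are reproducible
expList <- list(
  exp1 = matrix(rnorm(20, 5, 1), ncol = 5,
                dimnames = list(paste0("GENE", 4:1),
                                paste0("sample", LETTERS[1:5]))),
  exp2 = matrix(rnorm(12, 3, 2), ncol = 3,
                dimnames = list(paste0("ENST0000", 1:4),
                                paste0("samp", letters[1:3]))))
map <- data.frame(
  assay = rep(c("exp1", "exp2"), times = c(5, 3)),
  primary = c("Bob", "Jake", "Sandy", "Sandy", "Lauren",
              "Jake", "Sandy", "Lauren"),
  colname = c(paste0("sample", LETTERS[1:5]), paste0("samp", letters[1:3])),
  stringsAsFactors = FALSE)

# Hypothetical helper: melt one matrix into one row per value.
# as.vector() reads the matrix column by column, so row names cycle
# fastest and each column name is repeated once per row.
toLong <- function(mat, assayName) {
  data.frame(assay = assayName,
             rowname = rep(rownames(mat), times = ncol(mat)),
             colname = rep(colnames(mat), each = nrow(mat)),
             value = as.vector(mat),
             stringsAsFactors = FALSE)
}
long <- do.call(rbind, Map(toLong, expList, names(expList)))

# Attach the biological unit by matching each (assay, colname) pair
long$primary <- map$primary[match(paste(long$assay, long$colname),
                                  paste(map$assay, map$colname))]
nrow(long)  # 4 * 5 + 4 * 3 = 32 rows
head(long, 5)
```

The wide format instead keeps one row per biological unit, which is why unobserved combinations appear as NA.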

`c` - combine

The c function allows the user to insert an additional experiment into an already created MultiAssayExperiment.

A sampleMap can be provided in order to map colData rows to experiment column names. In the following example, the “exp3” experiment contains repeated measurements for Bob.

(maec1 <- c(x = mae,
  exp3 = matrix(rnorm(10), ncol = 5,
                dimnames = list(paste0("GENE", c("A", "B")),
                                paste0("sample", LETTERS[1:5]))),
  sampleMap = DataFrame(assay = "exp3",
                        primary = c("Bob", "Bob", "Sandy", "Jake", "Lauren"),
                        colname = paste0("sample", LETTERS[1:5])
                        )
  ))

## A MultiAssayExperiment object of 3 listed
##  experiments with user-defined names and respective classes. 
##  Containing an ExperimentList class object of length 3: 
##  [1] exp1: matrix with 4 rows and 5 columns 
##  [2] exp2: matrix with 4 rows and 3 columns 
##  [3] exp3: matrix with 2 rows and 5 columns 
## Features: 
##  experiments() - obtain the ExperimentList instance 
##  colData() - the primary/phenotype DataFrame 
##  sampleMap() - the sample availability DataFrame 
##  `$`, `[`, `[[` - extract colData columns, subset, or experiment 
##  *Format() - convert ExperimentList into a long or wide DataFrame 
##  assays() - convert ExperimentList to a SimpleList of matrices

sampleMap(maec1)

## DataFrame with 13 rows and 3 columns
##        assay     primary     colname
##     <factor> <character> <character>
## 1       exp1         Bob     sampleA
## 2       exp1        Jake     sampleB
## 3       exp1       Sandy     sampleC
## 4       exp1       Sandy     sampleD
## 5       exp1      Lauren     sampleE
## ...      ...         ...         ...
## 9       exp3         Bob     sampleA
## 10      exp3         Bob     sampleB
## 11      exp3       Sandy     sampleC
## 12      exp3        Jake     sampleD
## 13      exp3      Lauren     sampleE

For convenience, the mapFrom argument allows the user to map from a particular experiment, provided that the order of the colnames is the same. A warning will be issued to make the user aware of this assumption.

(maec2 <- c(x = mae,
  exp3 = matrix(rnorm(10), ncol = 5,
                dimnames = list(paste0("GENE", c("A", "B")),
                                paste0("sample", LETTERS[1:5]))),
  mapFrom = 1L))

## Warning in .local(x, ...): Assuming column order in the data provided 
##  matches the order in 'mapFrom' experiment(s) colnames

## A MultiAssayExperiment object of 3 listed
##  experiments with user-defined names and respective classes. 
##  Containing an ExperimentList class object of length 3: 
##  [1] exp1: matrix with 4 rows and 5 columns 
##  [2] exp2: matrix with 4 rows and 3 columns 
##  [3] exp3: matrix with 2 rows and 5 columns 
## Features: 
##  experiments() - obtain the ExperimentList instance 
##  colData() - the primary/phenotype DataFrame 
##  sampleMap() - the sample availability DataFrame 
##  `$`, `[`, `[[` - extract colData columns, subset, or experiment 
##  *Format() - convert ExperimentList into a long or wide DataFrame 
##  assays() - convert ExperimentList to a SimpleList of matrices
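
Because mapFrom relies on column order, it may be worth confirming that order before combining. The sketch below rebuilds the example object with an arbitrary seed, checks the new matrix against the first experiment, and then inspects the map rows created for exp3. The warning is silenced here only because the order has just been checked.

```r
library(MultiAssayExperiment)

set.seed(42)  # arbitrary seed so the random matrices are reproducible
pheno <- DataFrame(id = 1:4, type = c("a", "a", "b", "b"),
                   sex = c("M", "F", "M", "F"),
                   row.names = c("Bob", "Sandy", "Jake", "Lauren"))
expList <- list(
  exp1 = matrix(rnorm(20, 5, 1), ncol = 5,
                dimnames = list(paste0("GENE", 4:1),
                                paste0("sample", LETTERS[1:5]))),
  exp2 = matrix(rnorm(12, 3, 2), ncol = 3,
                dimnames = list(paste0("ENST0000", 1:4),
                                paste0("samp", letters[1:3]))))
sampMap <- listToMap(list(
  exp1 = DataFrame(primary = c("Bob", "Jake", "Sandy", "Sandy", "Lauren"),
                   colname = paste0("sample", LETTERS[1:5])),
  exp2 = DataFrame(primary = c("Jake", "Sandy", "Lauren"),
                   colname = paste0("samp", letters[1:3]))))
mae <- MultiAssayExperiment(expList, pheno, sampMap)

exp3 <- matrix(rnorm(10), ncol = 5,
               dimnames = list(paste0("GENE", c("A", "B")),
                               paste0("sample", LETTERS[1:5])))

# mapFrom reuses the mapping of an existing experiment by position,
# so confirm that the columns line up before relying on it.
stopifnot(identical(colnames(exp3), colnames(mae[["exp1"]])))

maec <- suppressWarnings(c(x = mae, exp3 = exp3, mapFrom = 1L))
sm <- sampleMap(maec)
sm[sm$assay == "exp3", ]
```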

`prepMultiAssay` - Constructor function helper

The prepMultiAssay function allows the user to diagnose typical problems when creating a MultiAssayExperiment object. See ?prepMultiAssay for more details.
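
A minimal sketch of the helper in use, assuming the inputs defined in the earlier sections (rebuilt here with an arbitrary seed). The returned list is passed to the constructor with `do.call`. These inputs are already consistent, so nothing should need to be dropped; with inconsistent names, the helper is expected to report what it found.

```r
library(MultiAssayExperiment)

set.seed(42)  # arbitrary seed so the random matrices are reproducible
pheno <- DataFrame(id = 1:4, type = c("a", "a", "b", "b"),
                   sex = c("M", "F", "M", "F"),
                   row.names = c("Bob", "Sandy", "Jake", "Lauren"))
expList <- list(
  exp1 = matrix(rnorm(20, 5, 1), ncol = 5,
                dimnames = list(paste0("GENE", 4:1),
                                paste0("sample", LETTERS[1:5]))),
  exp2 = matrix(rnorm(12, 3, 2), ncol = 3,
                dimnames = list(paste0("ENST0000", 1:4),
                                paste0("samp", letters[1:3]))))
sampMap <- listToMap(list(
  exp1 = DataFrame(primary = c("Bob", "Jake", "Sandy", "Sandy", "Lauren"),
                   colname = paste0("sample", LETTERS[1:5])),
  exp2 = DataFrame(primary = c("Jake", "Sandy", "Lauren"),
                   colname = paste0("samp", letters[1:3]))))

# Check the three components together before construction
prepped <- prepMultiAssay(ExperimentList(expList), pheno, sampMap)
names(prepped)

# The list elements are named after the constructor arguments
mae2 <- do.call(MultiAssayExperiment, prepped)
mae2
```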

MultiAssayExperiment: Quick Start Guide

Marcel Ramos

May 18, 2017

