getPrevalence {mia}R Documentation

Calculation prevalence information for features across samples

Description

These functions calculate the population prevalence for taxonomic ranks in a SummarizedExperiment-class object.

Usage

getPrevalence(x, ...)

## S4 method for signature 'ANY'
getPrevalence(x, detection = 0, include_lowest = FALSE, sort = FALSE, ...)

## S4 method for signature 'SummarizedExperiment'
getPrevalence(x, abund_values = "counts", as_relative = TRUE, rank = NULL, ...)

getPrevalentTaxa(x, ...)

## S4 method for signature 'ANY'
getPrevalentTaxa(x, prevalence = 50/100, include_lowest = FALSE, ...)

## S4 method for signature 'SummarizedExperiment'
getPrevalentTaxa(
  x,
  rank = NULL,
  prevalence = 50/100,
  include_lowest = FALSE,
  ...
)

getRareTaxa(x, ...)

## S4 method for signature 'ANY'
getRareTaxa(x, prevalence = 50/100, include_lowest = FALSE, ...)

## S4 method for signature 'SummarizedExperiment'
getRareTaxa(x, rank = NULL, prevalence = 50/100, include_lowest = FALSE, ...)

subsetByPrevalentTaxa(x, ...)

## S4 method for signature 'SummarizedExperiment'
subsetByPrevalentTaxa(x, rank = NULL, ...)

subsetByRareTaxa(x, ...)

## S4 method for signature 'SummarizedExperiment'
subsetByRareTaxa(x, rank = NULL, ...)

getPrevalentAbundance(x, abund_values = "relabundance", ...)

## S4 method for signature 'ANY'
getPrevalentAbundance(x, abund_values = "relabundance", ...)

## S4 method for signature 'SummarizedExperiment'
getPrevalentAbundance(x, abund_values = "counts", ...)

agglomerateByPrevalence(x, ...)

## S4 method for signature 'SummarizedExperiment'
agglomerateByPrevalence(
  x,
  rank = taxonomyRanks(x)[1L],
  other_label = "Other",
  ...
)

Arguments

x

a SummarizedExperiment object

detection

Detection threshold for absence/presence. Either an absolute value compared directly to the values of x or a relative value between 0 and 1, if as_relative = TRUE.

include_lowest

logical scalar: Should the lower boundary of the detection and prevalence cutoffs be included? (default: FALSE)

sort

logical scalar: Should the result be sorted by prevalence? (default: FALSE)

abund_values

A single character value for selecting the assay to use for prevalence calculation.

as_relative

logical scalar: Should the detection threshold be applied on compositional (relative) abundances? (default: TRUE)

rank, ...

additional arguments

  • If !is.null(rank) arguments are passed on to agglomerateByRank. See ?agglomerateByRank for more details.

  • for getPrevalentTaxa, getRareTaxa, subsetByPrevalentTaxa and subsetByRareTaxa additional parameters passed to getPrevalence

  • for getPrevalentAbundance additional parameters passed to getPrevalentTaxa

prevalence

Prevalence threshold (in 0 to 1). The required prevalence is strictly greater by default. To include the limit, set include_lowest to TRUE.

other_label

A single character valued used as the label for the summary of non-prevalent taxa. (default: other_label = "Other")

Details

getPrevalence calculates the relative frequency of samples that exceed the detection threshold. For SummarizedExperiment objects, the prevalence is calculated for the selected taxonomic rank, otherwise for the rows. The absolute population prevalence can be obtained by multiplying the prevalence by the number of samples (ncol(x)). If as_relative = TRUE the relative frequency (between 0 and 1) is used to check against the detection threshold.

The core abundance index from getPrevalentAbundance gives the relative proportion of the core species (in between 0 and 1). The core taxa are defined as those that exceed the given population prevalence threshold at the given detection level as set for getPrevalentTaxa.

subsetPrevalentTaxa and subsetRareTaxa return a subset of x. The subset includes the most prevalent or rare taxa that are calculated with getPrevalentTaxa or getRareTaxa respectively.

getPrevalentTaxa returns taxa that are more prevalent with the given detection threshold for the selected taxonomic rank.

getRareTaxa returns complement of getPrevalentTaxa.

Value

subsetPrevalentTaxa and subsetRareTaxa return subset of x.

All other functions return a named vectors:

Author(s)

Leo Lahti For getPrevalentAbundance: Leo Lahti and Tuomas Borman. Contact: microbiome.github.io

References

A Salonen et al. The adult intestinal core microbiota is determined by analysis depth and health status. Clinical Microbiology and Infection 18(S4):16 20, 2012. To cite the R package, see citation('mia')

See Also

agglomerateByRank, getTopTaxa

Examples

data(GlobalPatterns)
tse <- GlobalPatterns
# Get prevalence estimates for individual ASV/OTU
prevalence.frequency <- getPrevalence(tse,
                                      detection = 0,
                                      sort = TRUE,
                                      as_relative = TRUE)
head(prevalence.frequency)

# Get prevalence estimates for phylums
# - the getPrevalence function itself always returns population frequencies
prevalence.frequency <- getPrevalence(tse,
                                      rank = "Phylum",
                                      detection = 0,
                                      sort = TRUE,
                                      as_relative = TRUE)
head(prevalence.frequency)

# - to obtain population counts, multiply frequencies with the sample size,
# which answers the question "In how many samples is this phylum detectable"
prevalence.count <- prevalence.frequency * ncol(tse)
head(prevalence.count)

# Detection threshold 1 (strictly greater by default);
# Note that the data (GlobalPatterns) is here in absolute counts
# (and not compositional, relative abundances)
# Prevalence threshold 50 percent (strictly greater by default)
prevalent <- getPrevalentTaxa(tse,
                              rank = "Phylum",
                              detection = 10,
                              prevalence = 50/100,
                              as_relative = FALSE)
head(prevalent)

# Gets a subset of object that includes prevalent taxa
altExp(tse, "prevalent") <- subsetByPrevalentTaxa(tse,
                                       rank = "Family",
                                       detection = 0.001,
                                       prevalence = 0.55,
                                       as_relative = TRUE)
altExp(tse, "prevalent")                                 

# getRareTaxa returns the inverse
rare <- getRareTaxa(tse,
                    rank = "Phylum",
                    detection = 1/100,
                    prevalence = 50/100,
                    as_relative = TRUE)
head(rare)

# Gets a subset of object that includes rare taxa
altExp(tse, "rare") <- subsetByRareTaxa(tse,
                             rank = "Class",
                             detection = 0.001,
                             prevalence = 0.001,
                             as_relative = TRUE)
altExp(tse, "rare")      

# Names of both experiments, prevalent and rare, can be found from slot altExpNames
tse
                         
data(esophagus)
getPrevalentAbundance(esophagus, abund_values = "counts")

# data can be aggregated based on prevalent taxonomic results
agglomerateByPrevalence(tse,
                        rank = "Phylum",
                        detection = 1/100,
                        prevalence = 50/100,
                        as_relative = TRUE)

[Package mia version 1.2.2 Index]