simplifyGOFromMultipleLists {simplifyEnrichment}    R Documentation

Perform simplifyGO analysis with multiple lists of GO IDs

Description

Perform simplifyGO analysis with multiple lists of GO IDs

Usage

simplifyGOFromMultipleLists(lt, go_id_column = NULL, padj_column = NULL, padj_cutoff = 1e-2,
    filter = function(x) any(x < padj_cutoff), default = 1,
    ont = NULL, db = 'org.Hs.eg.db', measure = "Rel",
    heatmap_param = list(NULL),
    method = "binary_cut", control = list(partial = TRUE),
    min_term = NULL, verbose = TRUE, column_title = NULL, ...)

Arguments

lt

A list of data frames, a list of numeric vectors (e.g. adjusted p-values) where each numeric vector has GO IDs as names, or a list of vectors of GO IDs.

go_id_column

Column index of the GO IDs if lt is a list of data frames.

padj_column

Column index of the adjusted p-values if lt is a list of data frames.

padj_cutoff

Cutoff for adjusted p-values.

filter

A self-defined function for filtering GO IDs. By default, a GO ID is kept if it is significant in at least one list.

default

The default value for the adjusted p-values. See Details.

ont

GO ontology. The value should be one of "BP", "CC" or "MF". If it is not specified, the function identifies it automatically by randomly sampling 10 of the GO IDs (see guess_ont).

db

Annotation database. It should be from https://bioconductor.org/packages/3.10/BiocViews.html#___OrgDb

measure

Semantic measure for the GO similarity, passed to termSim.

heatmap_param

Parameters for controlling the heatmap, see Details.

method

Passed to simplifyGO.

control

Passed to simplifyGO.

min_term

Passed to simplifyGO.

verbose

Passed to simplifyGO.

column_title

Passed to simplifyGO.

...

Passed to simplifyGO.

Details

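The input lt can be in one of three formats: a list of data frames (the GO ID and adjusted p-value columns are given by go_id_column and padj_column), a list of numeric vectors (e.g. adjusted p-values) with GO IDs as names, or a list of vectors of GO IDs.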
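As a rough sketch of how these formats relate, a data frame of enrichment results can be reduced to a named numeric vector, which in turn can be reduced to a vector of GO IDs. The toy data frames, their p-values, and the column names ID and p.adjust are assumptions for illustration (the column names match those used in the Examples); this is not the package's internal code.

```r
# Toy enrichment tables; the p-values are made up for illustration.
lt_df <- list(
    group1 = data.frame(ID = c("GO:0000001", "GO:0000002"),
                        p.adjust = c(0.001, 0.2),
                        stringsAsFactors = FALSE),
    group2 = data.frame(ID = c("GO:0000002", "GO:0000003"),
                        p.adjust = c(0.03, 0.0005),
                        stringsAsFactors = FALSE)
)

# Reduce each data frame to a numeric vector named by GO ID.
lt_num <- lapply(lt_df, function(df) structure(df$p.adjust, names = df$ID))

# Keep only the GO IDs below a cutoff to obtain a list of GO ID vectors.
lt_chr <- lapply(lt_num, function(x) names(x)[x < 0.01])
```

Each of lt_df, lt_num and lt_chr then has the shape of one of the three input formats.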

Assume there are n GO lists. We first construct a global matrix where columns correspond to the n GO lists and rows correspond to the union of all GO IDs in the lists. The value for the ith GO ID in the jth list is taken from the corresponding numeric vector in lt. If the jth vector in lt does not contain the ith GO ID, the value of the default argument is used instead (e.g. in most cases the numeric values are adjusted p-values and default is set to 1). We call this matrix M0.
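The construction of M0 can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's actual implementation: the variable names lt_num, all_go_id and m0 are hypothetical and the p-values are made up.

```r
# Toy input: two lists of adjusted p-values named by GO ID (values are made up).
lt_num <- list(
    group1 = c("GO:0000001" = 0.001, "GO:0000002" = 0.2),
    group2 = c("GO:0000002" = 0.03,  "GO:0000003" = 0.0005)
)
default <- 1

# Rows are the union of all GO IDs, columns are the lists.
all_go_id <- unique(unlist(lapply(lt_num, names), use.names = FALSE))
m0 <- matrix(default, nrow = length(all_go_id), ncol = length(lt_num),
             dimnames = list(all_go_id, names(lt_num)))

# Fill in the values that are present; missing entries keep `default`.
for (j in seq_along(lt_num)) {
    m0[names(lt_num[[j]]), j] <- lt_num[[j]]
}
```

Here GO:0000001 does not occur in group2, so that cell of m0 keeps the default value of 1.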

The next step is to filter M0 so that only a subset of GO IDs of interest is kept. A function supplied via the filter argument removes GO IDs that are not important for the analysis. The filter function is applied to every row of M0 and must return a single logical value: TRUE keeps the current GO ID and FALSE removes it. For example, if the values in lt are adjusted p-values, the filter function can be set to function(x) any(x < padj_cutoff) so that a GO ID is kept as long as it is significant in at least one list. We call the filtered matrix M1.
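The row-wise filtering can be sketched as below. It is an illustration only, assuming a small made-up matrix in place of M0; the names m0, m1, keep and filter_fun are hypothetical, and the function's internal code may differ.

```r
# Toy M0: rows are GO IDs, columns are lists; 1 marks a GO ID absent from a list.
m0 <- matrix(c(0.001, 0.2, 1,
               1, 0.03, 0.0005),
             nrow = 3,
             dimnames = list(c("GO:0000001", "GO:0000002", "GO:0000003"),
                             c("group1", "group2")))

padj_cutoff <- 0.01
filter_fun <- function(x) any(x < padj_cutoff)

# Apply the filter to every row; TRUE means the GO ID is kept.
keep <- apply(m0, 1, filter_fun)
m1 <- m0[keep, , drop = FALSE]
```

With this cutoff, GO:0000002 is dropped because neither of its values (0.2 and 0.03) is below 0.01, while the other two GO IDs are kept.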

GO IDs in M1 (the row names of M1) are used for clustering. A heatmap of M1 is attached to the left of the GO similarity heatmap so that group-specific (or list-specific) patterns can be easily observed and related to GO functions.

Argument heatmap_param is a list that controls several parameters of the heatmap of M1.

Examples


# perform functional enrichment on the signature genes from cola analysis
require(cola)
data(golub_cola)
res = golub_cola["ATC:skmeans"]
require(hu6800.db)
x = hu6800ENTREZID
mapped_probes = mappedkeys(x)
id_mapping = unlist(as.list(x[mapped_probes]))
lt = functional_enrichment(res, k = 3, id_mapping = id_mapping) # you can check the value of `lt`

# a list of data frames
simplifyGOFromMultipleLists(lt, padj_cutoff = 0.001)

# a list of numeric values
lt2 = lapply(lt, function(x) structure(x$p.adjust, names = x$ID))
simplifyGOFromMultipleLists(lt2, padj_cutoff = 0.001)

# a list of GO IDs
lt3 = lapply(lt, function(x) x$ID[x$p.adjust < 0.001])
simplifyGOFromMultipleLists(lt3)


[Package simplifyEnrichment version 1.3.0 Index]