bootstrap_enrichment_test {EWCE} | R Documentation |
bootstrap_enrichment_test
takes a gene list and a single-cell-type
transcriptome dataset and determines the probability of enrichment and the
fold change for each cell type.
bootstrap_enrichment_test(
  sct_data = NA,
  hits = NA,
  bg = NA,
  genelistSpecies = "mouse",
  sctSpecies = "mouse",
  reps = 100,
  annotLevel = 1,
  geneSizeControl = FALSE,
  controlledCT = NULL
)
sct_data |
List generated using |
hits |
Array of MGI gene symbols containing the target gene list. Must be HGNC symbols if geneSizeControl=TRUE |
bg |
Array of MGI gene symbols containing the background gene list. Must be HGNC symbols if geneSizeControl=TRUE |
genelistSpecies |
Either 'mouse' or 'human' depending on whether MGI or HGNC symbols are used for gene lists |
sctSpecies |
Either 'mouse' or 'human' depending on whether MGI or HGNC symbols are used for the single cell dataset |
reps |
Number of random gene lists to generate (default=100 but should be over 10000 for publication quality results) |
annotLevel |
An integer indicating which level of the annotation to analyse. Default is 1. |
geneSizeControl |
A logical indicating whether to control for GC content and transcript length. Recommended if the gene list originates from genetic studies. Default is FALSE. If set to TRUE, human gene lists should be used rather than mouse. |
controlledCT |
(optional) If set to the name of a cell type rather than NULL, the bootstrapping controls for expression within that cell type |
A list containing three elements:
results: data frame in which each row gives the statistics (p-value, fold change and number of standard deviations from the mean) associated with the enrichment of the stated cell type in the gene list
hit.cells: vector containing the summed proportion of expression in each cell type for the target list
bootstrap_data: matrix in which each row represents the summed proportion of expression in each cell type for one of the random lists
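The relationship between the three returned elements can be illustrated with a minimal base R sketch. This is not the EWCE implementation: the toy specificity matrix, the gene and cell type names, the column names of the data frame, and the exact p-value, fold change and standard deviation formulas are all assumptions made for illustration, and no control for gene size or for another cell type is applied.

```r
set.seed(1)

# Toy data (hypothetical): rows are genes, columns are cell types.
genes <- paste0("gene", 1:200)
cell_types <- c("ctA", "ctB", "ctC")
raw <- matrix(runif(length(genes) * length(cell_types)),
              nrow = length(genes),
              dimnames = list(genes, cell_types))
# Make the first 20 genes strongly specific to ctA.
raw[1:20, "ctA"] <- raw[1:20, "ctA"] + 5
# Each row now holds the proportion of a gene's expression per cell type.
specificity <- raw / rowSums(raw)

hits <- genes[1:20]  # target gene list
bg <- genes          # background gene list (contains the hits)
reps <- 1000         # number of random gene lists

# Summed proportion of expression in each cell type for the target list.
hit_cells <- colSums(specificity[hits, , drop = FALSE])

# One row per random list of the same size, drawn from the background.
bootstrap_data <- t(replicate(reps, {
  random_list <- sample(bg, length(hits))
  colSums(specificity[random_list, , drop = FALSE])
}))

boot_mean <- colMeans(bootstrap_data)
boot_sd <- apply(bootstrap_data, 2, sd)

# Assumed statistics: the p-value is the fraction of random lists scoring
# at least as high as the target list in that cell type.
results <- data.frame(
  CellType = cell_types,
  p = colMeans(sweep(bootstrap_data, 2, hit_cells, ">=")),
  fold_change = hit_cells / boot_mean,
  sd_from_mean = (hit_cells - boot_mean) / boot_sd
)
print(results)
```

Under this assumed definition, the smallest non-zero p-value that can be observed is 1 / reps, which is one way to see why a large number of random lists is needed for precise estimates.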
library(ewceData)

# Load the single cell data
ctd <- ctd()

# Set the parameters for the analysis
# Use 3 bootstrap lists for speed, for publishable analysis use >10000
reps <- 3

# Load the gene list and get human orthologs
example_genelist <- example_genelist()
mouse_to_human_homologs <- mouse_to_human_homologs()
m2h <- unique(mouse_to_human_homologs[, c("HGNC.symbol", "MGI.symbol")])
mouse.hits <- unique(m2h[m2h$HGNC.symbol %in% example_genelist, "MGI.symbol"])

# Subset mouse.bg for speed but ensure it still contains the hits
mouse.bg <- unique(c(m2h$MGI.symbol[1:100], mouse.hits))

# Bootstrap significance test, no control for transcript length or GC content
full_results <- bootstrap_enrichment_test(
  sct_data = ctd,
  hits = mouse.hits,
  bg = mouse.bg,
  reps = reps,
  annotLevel = 2,
  sctSpecies = "mouse",
  genelistSpecies = "mouse"
)