getStatistics {BLMA} | R Documentation |
Calculate genes summary statistic across multiple datasets
getStatistics(allGenes, dataList, groupList, ncores = 1, method = addCLT)
allGenes |
Vector of all genes names for the analysis. |
dataList |
A list of expression matrices, in which rows are genes and columns are samples. |
groupList |
A list of vectors indicating sample group corresponding with expression matrices in dataList. |
ncores |
Number of core to use in prallel processing. |
method |
Function for combining p-values. It must accept one input which is a vector of p-values and return a combined p-value. Three methods are embeded in this package are addCLT, fisherMethod, and stoufferMethod. |
To estimate the effect sizes of genes across all studies, first standardized mean difference for each gene in individual studies is compute. Next, the overall efect size and standard error are estimated using the random-efects model. This overall efect size represents the gene's expression change under the efect of the condition. The, z-scores and p-values of observing such efect sizes are computed. The p-values is obtained from classical hypothesis testing. By default, linear model and empirical Bayesian testing \(limma\) are used to compute the p-values for diferential expression. The two-tailed p-values are converted to one-tailed p-values (lef- and right-tailed). For each gene, the one-tailed p-values across all datasets are then combined using the addCLT, stouffer or fisher method. These p-values represent how likely the diferential expression is observed by chance.
A data.frame of gene statistics with following columns:
Two-tailed p-values
Two-tailed p-values with false discovery rate correction
left-tailed p-values
left-tailed p-values with false discovery rate correction
right-tailed p-values with false discovery rate correction
right-tailed p-values
Effect size
Two-tailed p-values for effect size
Two-tailed p-values for effect size with false discovery rate correction
Left-tailed p-values for effect size
Left-tailed p-values for effect size with false discovery rate correction
Right-tailed p-values for effect size
Right-tailed p-values for effect size with false discovery rate correction
Tin Nguyen, Hung Nguyen, and Sorin Draghici
Nguyen, T., Shafi, A., Nguyen, T. M., Schissler, A. G., & Draghici, S. (2020). NBIA: a network-based integrative analysis framework-applied to pathway analysis. Scientific reports, 10(1), 1-11. Nguyen, T., Tagett, R., Donato, M., Mitrea, C., & Draghici, S. (2016). A novel bi-level meta-analysis approach: applied to biological pathway analysis. Bioinformatics, 32(3), 409-416. Smyth, G. K. (2005). Limma: linear models for microarray data. In Bioinformatics and computational biology solutions using R and Bioconductor (pp. 397-420). Springer, New York, NY.
datasets <- c("GSE17054", "GSE57194", "GSE33223", "GSE42140") data(list = datasets, package = "BLMA") dataList <- lapply(datasets, function(dataset) { get(paste0("data_", dataset)) }) groupList <- lapply(datasets, function(dataset) { get(paste0("group_", dataset)) }) names(dataList) <- datasets names(groupList) <- datasets allGenes <- Reduce(intersect, lapply(dataList, rownames)) geneStat <- getStatistics(allGenes, dataList, groupList) head(geneStat) # perform pathway analysis library(ROntoTools) # get gene network kpg <- loadKEGGPathways()$kpg # get gene network name kpn <- loadKEGGPathways()$kpn # get geneset gslist <- lapply(kpg,function(y) nodes(y)) # get differential expressed genes DEGenes.Left <- rownames(geneStat)[geneStat$pLeft < 0.05 & geneStat$ES.pLeft < 0.05] DEGenes.Right <- rownames(geneStat)[geneStat$pRight < 0.05 & geneStat$ES.pRight < 0.05] DEGenes <- union(DEGenes.Left, DEGenes.Right) # perform pathway analysis with ORA oraRes <- lapply(gslist, function(gs){ pORACalc(geneSet = gs, DEGenes = DEGenes, measuredGenes = rownames(geneStat)) }) oraRes <- data.frame(p.value = unlist(oraRes), pathway = names(oraRes)) rownames(oraRes) <- kpn[rownames(oraRes)] # print results print(head(oraRes)) # perfrom pathway analysis with Pathway-Express from ROntoTools ES <- geneStat[DEGenes, "ES"] names(ES) <- DEGenes peRes = pe(x = ES, graphs = kpg, ref = allGenes, nboot = 1000, seed=1) peRes.Summary <- Summary(peRes, comb.pv.func = fisherMethod) peRes.Summary[, ncol(peRes.Summary) + 1] <- rownames(peRes.Summary) rownames(peRes.Summary) <- kpn[rownames(peRes.Summary)] colnames(peRes.Summary)[ncol(peRes.Summary)] = "pathway" # print results print(head(peRes.Summary))