barcode_ggheatmap_stat {barcodetrackR} | R Documentation |
Creates a heatmap from the columns of data in the SummarizedExperiment object, with the option to label cells based on statistical analysis. Uses ggplot2.
barcode_ggheatmap_stat( your_SE, sample_size, stat_test = "chi-squared", stat_option = "subsequent", reference_sample = NULL, stat_display = "top", show_all_significant = FALSE, p_threshold = 0.05, p_adjust = "none", bc_threshold = 0, plot_labels = NULL, n_clones = 10, cellnote_assay = "stars", your_title = NULL, grid = TRUE, label_size = 12, dendro = FALSE, cellnote_size = 4, distance_method = "Euclidean", minkowski_power = 2, hclust_linkage = "complete", row_order = "hierarchical", clusters = 0, percent_scale = c(0, 2.5e-05, 0.001, 0.01, 0.1, 1), color_scale = c("#4575B4", "#4575B4", "lightblue", "#fefeb9", "#D73027", "red4"), return_table = FALSE )
your_SE |
A SummarizedExperiment object. |
sample_size |
A numeric vector providing the sample size of each column of the SummarizedExperiment passed to the function. This sample size describes the samples that the barcoding data is meant to approximate. |
stat_test |
The statistical test to use on the constructed contingency table for each barcode. Options are "chi-squared" and "fisher". |
stat_option |
For "subsequent", statistical testing is performed on each column of data compared to the column before it. For "reference", all other columns of data are compared to a reference column. |
reference_sample |
Provide the column name of the reference column if stat_option is set to "reference". Defaults to the first column in the SummarizedExperiment. |
stat_display |
Choose which clones to display on the heatmap. If set to "top", the top n_clones ranked by abundance for each sample will be displayed. If set to "change", the top n_clones with the lowest p-value from statistical testing will be shown for each sample. If set to "increase", the top n_clones (ranked by lowest p-value) which increase in abundance for each sample will be shown. If set to "decrease", the top n_clones (ranked by lowest p-value) which decrease in abundance will be shown. |
show_all_significant |
Logical. If set to TRUE when stat_display is "change", "increase", or "decrease", the n_clones argument will be overridden and all clones with a statistically significant change, increase, or decrease in proportion will be shown. |
p_threshold |
The p-value threshold to use for statistical testing. |
p_adjust |
Character, default = "none". To correct p-values for multiple comparisons, set to any of the p-value adjustment methods in the p.adjust function in R stats, which include "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", and "fdr". |
bc_threshold |
Clones must be above this proportion in at least one sample to be included in statistical testing. |
plot_labels |
Vector of x axis labels. Defaults to colnames(your_SE). |
n_clones |
The top 'n' clones to plot. |
cellnote_assay |
Character. One of "stars", "reads", "proportions" or "p_val" |
your_title |
The title for the plot. |
grid |
Logical. Include a grid or not in the heatmap. |
label_size |
The size of the column labels. |
dendro |
Logical. Whether or not to show row dendrogram when hierarchical clustering. |
cellnote_size |
The numerical size of the cell note labels. |
distance_method |
Character. Use summary(proxy::pr_DB) to see all possible options for distance metrics in clustering. |
minkowski_power |
The power of the Minkowski distance (if minkowski is the distance method used). |
hclust_linkage |
Character. One of "ward.D", "ward.D2", "single", "complete", "average" (= UPGMA), "mcquitty" (= WPGMA), "median" (= WPGMC) or "centroid" (= UPGMC). |
row_order |
Character; "hierarchical" to perform hierarchical clustering on the output and order in that manner, "emergence" to organize rows by order of presence in data (from left to right), or a character vector of rows within the summarized experiment to plot. |
clusters |
How many clusters to cut hierarchical tree into for display when row_order is "hierarchical". |
percent_scale |
A numeric vector through which to spread the color scale (values inclusive from 0 to 1). Must be same length as color_scale. |
color_scale |
A character vector which indicates the colors of the color scale. Must be same length as percent_scale. |
return_table |
Logical. Whether or not to return a table of barcode sequences with their log abundance in the 'value' column and cellnote (for example, * indicating a statistically significant change) for each sample instead of displaying a plot. Note: for more in-depth statistical analysis, use the 'barcode_stat_test' function. |
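The stat_test, sample_size, and p_adjust arguments can be illustrated with a small base R sketch. It assumes, hypothetically, that the contingency table for one barcode is built by scaling its proportion in two samples by the corresponding sample_size values; the table the package actually constructs may differ. All numbers and variable names below are made up for illustration.

```r
# Hypothetical values for a single barcode compared across two samples.
sample_size <- c(5000, 5000)    # cells each sample is meant to approximate
proportion  <- c(0.010, 0.016)  # barcode proportion in each sample

# Assumed construction: approximate counts of cells with and without the barcode.
with_bc    <- round(proportion * sample_size)
without_bc <- sample_size - with_bc
tab <- rbind(with_bc, without_bc)  # 2 x 2 contingency table

# The two tests named by stat_test, as provided by the stats package.
chi_p    <- chisq.test(tab)$p.value
fisher_p <- fisher.test(tab)$p.value

# Multiple-comparison correction over several barcodes with stats::p.adjust.
# The extra p-values are placeholders standing in for other barcodes.
raw_p <- c(chi_p, 0.04, 0.20, 0.001)
adj_p <- p.adjust(raw_p, method = "BH")
```

Any method string accepted by p.adjust can be substituted for "BH"; adjusted p-values are never smaller than the raw ones, so fewer barcodes pass p_threshold after correction.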
Displays a heatmap in the current plot window. If return_table is set to TRUE, instead returns a data frame of the barcode sequences, log abundances, and cellnote for each sample.
data(wu_subset)
barcode_ggheatmap_stat(
  your_SE = wu_subset[, 1:4],
  sample_size = rep(5000, 4),
  stat_test = "chi-squared",
  stat_option = "subsequent",
  p_threshold = 0.05,
  n_clones = 10,
  cellnote_assay = "stars",
  bc_threshold = 0.005
)
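A further sketch, using only arguments listed in the usage above, compares every sample against a reference column and returns the table rather than drawing the plot. The choice of the first column as the reference, the "BH" adjustment, and the variable name stat_table are illustrative assumptions, not recommendations from the package authors.

```r
library(barcodetrackR)
data(wu_subset)

se <- wu_subset[, 1:4]

# Compare each column to the first one and return a table instead of a plot.
stat_table <- barcode_ggheatmap_stat(
  your_SE = se,
  sample_size = rep(5000, 4),
  stat_test = "fisher",
  stat_option = "reference",
  reference_sample = colnames(se)[1],  # illustrative choice of reference
  stat_display = "change",
  p_adjust = "BH",
  bc_threshold = 0.005,
  return_table = TRUE
)

head(stat_table)
```

With return_table = FALSE (the default), the same call displays the heatmap in the current plot window.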