1 DEBrowser: Interactive Differential Expression Analysis Tool

2 Introduction

Differential gene expression analysis has become an increasingly popular way to identify and view genes that are up- or down-regulated between two sets of samples. Its goal is to find genes or transcripts whose difference in expression, after accounting for the variance within each condition, is larger than expected by chance. DESeq2 (https://bioconductor.org/packages/release/bioc/html/DESeq2.html) is an R package available via Bioconductor that normalizes count data from high-throughput sequencing assays such as RNA-Seq and tests for differential expression (Love et al. 2014). With multiple parameters to adjust, such as adjusted p-value (padj) cutoffs, log fold change cutoffs, and plot styles, regenerating plots of your DE data can be tedious and time-consuming. The Differential Expression Browser (DEBrowser) couples DESeq2 with Shiny so that plots update in real time as you change your queries, and lets you browse your DESeq2 results interactively. In addition to the DESeq2 analysis, DEBrowser offers a variety of other plots and analysis tools to help you visualize your data further.

3 Quick start

Before you start:

First, install R and, optionally, RStudio. On Fedora/Red Hat/CentOS, the following packages must also be installed: openssl-devel, libxml2-devel, libcurl-devel, libpng-devel. Running these simple commands will then launch DEBrowser on your local machine:

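# Installation instructions:
# 1. Install DEBrowser and its dependencies by running the lines below
#    in R or RStudio. (Bioconductor 3.8 and later install packages with
#    BiocManager; older releases used biocLite() for the same purpose.)

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("debrowser")

# 2. Load the library (the package name is lowercase)

library(debrowser)

# 3. Start DEBrowser

startDEBrowser()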

4 Browsing your Data

Once DEBrowser is running, a page will load asking you to choose an input file or to load the demo data. To run DESeq2, DEBrowser needs gene quantifications in a tab-separated values (TSV) file. The file must contain a gene column, a transcript column, and one column of count values for each sample you wish to load into DEBrowser.

For example:

# TSV:

gene      transcript  exper_rep1  exper_rep2  control_rep1  control_rep2
DQ714826  uc007tfl.1  0.00        0.00        0.00          0.00
DQ551521  uc008bml.1  0.00        0.00        0.00          0.00
AK028549  uc011wpi.1  2.00        1.29        0.00          0.00
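If you want to check a quantification file before loading it, the layout above can be read with base R. The following is a minimal sketch, not part of DEBrowser itself: it reads the three example rows from an in-memory string (in practice you would pass a file path to `read.delim()`), and it assumes that every column other than `gene` and `transcript` holds sample counts.

```r
# Hypothetical in-memory copy of the example rows shown above.
tsv_text <- paste(
  "gene\ttranscript\texper_rep1\texper_rep2\tcontrol_rep1\tcontrol_rep2",
  "DQ714826\tuc007tfl.1\t0.00\t0.00\t0.00\t0.00",
  "DQ551521\tuc008bml.1\t0.00\t0.00\t0.00\t0.00",
  "AK028549\tuc011wpi.1\t2.00\t1.29\t0.00\t0.00",
  sep = "\n"
)

# read.delim() uses tab as the field separator and treats row 1 as the header.
quant <- read.delim(text = tsv_text, stringsAsFactors = FALSE)

# Confirm that the two annotation columns are present.
stopifnot(all(c("gene", "transcript") %in% names(quant)))

# Assumption: all remaining columns are per-sample count values.
sample_cols <- setdiff(names(quant), c("gene", "transcript"))
counts <- as.matrix(quant[, sample_cols])
rownames(counts) <- quant$gene

# Every sample column should have been parsed as numeric.
stopifnot(is.numeric(counts))
print(dim(counts))
```

A non-numeric sample column (for example, one containing stray text) would make `as.matrix()` return a character matrix, so the final `stopifnot()` is a cheap way to catch a malformed file early.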

You can also explore DEBrowser with the demo data by clicking the ‘Load Demo!’ text. For the case study demo data, feel free to download our case study demo file (http://galaxyweb.umassmed.edu/pub/DC/advanced_demo.tsv). After obtaining and loading the gene quantifications file, you can either view QC information for your quantifications or continue on to running DESeq2.

Figure 1: The initial options selection.

If you select QC information, you will be shown an all-to-all plot of your samples. This sample-by-sample comparison helps you spot possible discrepancies between replicate samples, in case you want to omit any of them from further analysis. To the left of this plot are various plot-shaping options you can adjust to make the all-to-all plot easier to read.
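The idea behind a sample-by-sample comparison can be reproduced outside the browser with base R. The sketch below is an illustration only and does not claim to match how DEBrowser draws its all-to-all plot: the counts are simulated, the log2(count + 1) transform is an assumption, and Pearson correlation is used as one possible measure of replicate agreement.

```r
set.seed(1)

# Simulated counts for illustration only: 500 genes x 4 samples that share
# a common per-gene expression level plus sampling noise.
base_expr <- rnbinom(500, mu = 100, size = 1)
counts <- sapply(1:4, function(i) rpois(500, lambda = base_expr + 1))
colnames(counts) <- c("exper_rep1", "exper_rep2",
                      "control_rep1", "control_rep2")

# Assumption: compare samples on a log2(count + 1) scale.
log_counts <- log2(counts + 1)

# Pairwise Pearson correlation between all samples.
sample_cor <- cor(log_counts)
print(round(sample_cor, 3))

# Scatter plot matrix of every sample against every other sample.
# Only drawn in an interactive session so scripted runs create no files.
if (interactive()) {
  pairs(log_counts, pch = 16, cex = 0.4)
}
```

A replicate whose correlations with its partner samples are clearly lower than the rest is a candidate for closer inspection before it is included in a comparison.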

Two additional QC plots are also available: Heatmap and PCA. The heatmap displays the genes for each sample in your dataset, and the PCA plot displays a principal component analysis of your dataset. You can select the type of plot you wish to view in the left panel. When the Heatmap option is selected, you can view an interactive heatmap by ticking the ‘Interactive’ checkbox in the left side panel.
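For readers who want to see what a PCA of samples involves, here is a minimal base R sketch. It is not DEBrowser's implementation: the counts are simulated, and the log2(count + 1) transform and per-gene scaling are assumptions made for the example.

```r
set.seed(2)

# Simulated counts for illustration only: 300 genes x 4 samples.
counts <- matrix(
  rnbinom(300 * 4, mu = 50, size = 2), nrow = 300,
  dimnames = list(paste0("gene", 1:300),
                  c("exper_rep1", "exper_rep2",
                    "control_rep1", "control_rep2"))
)
log_counts <- log2(counts + 1)

# Drop genes with zero variance so that per-gene scaling is defined.
keep <- apply(log_counts, 1, var) > 0

# prcomp() expects observations in rows, so samples go in rows here.
pca <- prcomp(t(log_counts[keep, ]), center = TRUE, scale. = TRUE)

# Fraction of total variance captured by each principal component.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_explained, 3))

if (interactive()) {
  # PC1 vs PC2, one point per sample.
  plot(pca$x[, 1], pca$x[, 2], xlab = "PC1", ylab = "PC2", pch = 16)
  text(pca$x[, 1], pca$x[, 2], labels = rownames(pca$x), pos = 3)
  # Base R heatmap of the first 50 retained genes.
  heatmap(head(log_counts[keep, ], 50))
}
```

With real data, replicates of the same condition would be expected to sit close together on the first components; with the random counts used here no such structure exists.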

Figure 2: Display of the all-to-all plot in the initial QC plots page.

You can also view the genes in your quantification file in several ways. The ‘Tables’ tab brings you to a table based on the dataset you have selected in the left options panel. The ‘All detected’ option lists all of the genes present in your file. The ‘Selected’ option lets you browse the genes you selected in the interactive heatmap. The last option, ‘Most Varied’, displays your top N most varied genes. You can alter the value of N by selecting ‘most-varied’ from the dropdown menu on the left.
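One simple way to think about a "top N most varied genes" list is to rank genes by the variance of their values across samples. The sketch below takes that approach; the function name `top_varied` is hypothetical, and ranking by plain variance is an assumption, since the exact measure DEBrowser uses is not stated here.

```r
# Hypothetical helper: return the n rows (genes) with the highest variance
# across columns (samples), ordered from most to least variable.
top_varied <- function(mat, n = 500) {
  gene_var <- apply(mat, 1, var)
  ord <- order(gene_var, decreasing = TRUE)
  mat[head(ord, n), , drop = FALSE]
}

# Toy matrix for illustration only.
counts <- rbind(
  geneA = c(10, 12, 11, 9),
  geneB = c(0, 50, 100, 5),
  geneC = c(7, 7, 7, 7),
  geneD = c(1, 30, 2, 40)
)
colnames(counts) <- c("exper_rep1", "exper_rep2",
                      "control_rep1", "control_rep2")

print(rownames(top_varied(counts, n = 2)))
```

In this toy matrix `geneB` and `geneD` change the most across samples, while `geneC` is constant and would always rank last.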

Figure 3: Display of the heatmap in the initial QC plots page.

Figure 4: Display of the PCA plot in the initial QC plots page.

Figure 5: Display of most varied genes.

If you choose to run DESeq2, you can then select which samples belong to the first condition and which to the second condition for differential expression analysis. By clicking the ‘Add New Comparison’ button, you can add as many comparisons as you want. Sample names are taken from the column headers in your data file. Once you have defined your comparisons, click the ‘Submit!’ button to run DESeq2 and calculate differential expression.
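Conceptually, each comparison is an assignment of sample columns to one of two conditions. The following sketch shows one way to represent that in R; `make_comparison` and the condition labels `cond1` and `cond2` are hypothetical names chosen for the example, not DEBrowser identifiers.

```r
# Hypothetical helper: build a sample table for one two-condition comparison.
make_comparison <- function(sample_names, cond1, cond2) {
  # Every selected sample must exist, and no sample may be in both groups.
  stopifnot(all(c(cond1, cond2) %in% sample_names),
            length(intersect(cond1, cond2)) == 0)
  data.frame(
    sample = c(cond1, cond2),
    condition = factor(rep(c("cond1", "cond2"),
                           times = c(length(cond1), length(cond2))),
                       levels = c("cond1", "cond2")),
    stringsAsFactors = FALSE
  )
}

# Sample names as they would appear in the file's column headers.
sample_names <- c("exper_rep1", "exper_rep2",
                  "control_rep1", "control_rep2")

comparison <- make_comparison(
  sample_names,
  cond1 = c("exper_rep1", "exper_rep2"),
  cond2 = c("control_rep1", "control_rep2")
)
print(comparison)
```

Adding another comparison would simply mean calling the helper again with a different pair of sample groups.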

Figure 6: Menus after loading in a sample.

5 Analyzing the Results

After you click the ‘Submit!’ button, DESeq2 will analyze your comparisons and store the results in separate data tables. Shiny then lets you access this data, with multiple interactive features, at the click of a button. It is important to note that the data produced by DESeq2 is normalized. When the DESeq2 analysis finishes, a tab-based menu will appear with multiple options.
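For orientation, a standalone DESeq2 run on a count matrix looks roughly like the sketch below. This is the standard DESeq2 workflow under stated assumptions, not a copy of what DEBrowser does internally: the counts are simulated, `run_deseq_sketch` is a hypothetical helper name, counts are rounded because DESeq2 requires integers, and the example only runs if DESeq2 is installed.

```r
# Hypothetical helper wrapping a basic two-condition DESeq2 analysis.
run_deseq_sketch <- function(counts, condition) {
  counts <- round(counts)               # DESeq2 requires integer counts
  storage.mode(counts) <- "integer"
  col_data <- data.frame(condition = factor(condition),
                         row.names = colnames(counts))
  dds <- DESeq2::DESeqDataSetFromMatrix(countData = counts,
                                        colData = col_data,
                                        design = ~ condition)
  dds <- DESeq2::DESeq(dds, quiet = TRUE)
  list(
    # By default, results() compares the last factor level to the first.
    results = as.data.frame(DESeq2::results(dds)),
    # Counts divided by the estimated per-sample size factors.
    normalized = DESeq2::counts(dds, normalized = TRUE)
  )
}

set.seed(3)
# Simulated data for illustration only: 500 genes x 4 samples, with the
# first 25 genes expressed 4x higher in the two "exper" samples.
base_mu <- 2^runif(500, min = 2, max = 10)
mu_mat <- matrix(rep(base_mu, 4), nrow = 500)
mu_mat[1:25, 1:2] <- mu_mat[1:25, 1:2] * 4
counts <- matrix(rnbinom(length(mu_mat), mu = mu_mat, size = 5), nrow = 500,
                 dimnames = list(paste0("gene", 1:500),
                                 c("exper_rep1", "exper_rep2",
                                   "control_rep1", "control_rep2")))
condition <- c("exper", "exper", "control", "control")

if (requireNamespace("DESeq2", quietly = TRUE)) {
  out <- run_deseq_sketch(counts, condition)
  print(head(out$results[order(out$results$padj), ]))
} else {
  message("DESeq2 is not installed; skipping the example run.")
}
```

The `results` table carries the `log2FoldChange` and `padj` columns that the cutoff options in the browser refer to, and the `normalized` matrix corresponds to the normalized values mentioned above.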

Figure 7: List of the tabbed menus in DEBrowser.

The first tab, ‘Main Plots’, is where you view the interactive results plots. The options on the left-hand side of the screen let you set the padj and fold change cutoff values, choose which dataset to use (such as up- or down-regulated genes), choose which comparison to plot, and choose the type of plot in which to view your results. Plot choices include the scatter plot, volcano plot, and MA plot.

Figure 8: Main scatter plot and the zoomed-in main scatter plot.

Figure 9: Main volcano plot and the zoomed-in main volcano plot.

Figure 10: Main MA plot and the zoomed-in main MA plot.

Once you have selected your values, you can hit the ‘Submit!’ button to create your interactive plots!
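The effect of the padj and fold change cutoffs can be illustrated with a few lines of R applied to a DESeq2-style results table. This is a sketch under assumptions: `classify_genes` is a hypothetical helper, the default cutoffs (padj below 0.01, at least a 2-fold change) are example values rather than DEBrowser's defaults, and the input rows are made up.

```r
# Hypothetical helper: label each gene as "Up", "Down", or "NS"
# (not significant) given a padj cutoff and a linear fold change cutoff.
classify_genes <- function(res, padj_cutoff = 0.01, fold_cutoff = 2) {
  lfc_cutoff <- log2(fold_cutoff)
  sig <- !is.na(res$padj) & res$padj < padj_cutoff
  status <- rep("NS", nrow(res))
  # which() drops NA comparisons, so genes with missing values stay "NS".
  status[which(sig & res$log2FoldChange >= lfc_cutoff)] <- "Up"
  status[which(sig & res$log2FoldChange <= -lfc_cutoff)] <- "Down"
  res$status <- status
  # y-axis value commonly used for volcano plots.
  res$neg_log10_padj <- -log10(res$padj)
  res
}

# Made-up results rows using DESeq2's column names.
res <- data.frame(
  gene = c("g1", "g2", "g3", "g4"),
  baseMean = c(100, 250, 30, 80),
  log2FoldChange = c(2.5, -1.8, 0.2, 3.0),
  padj = c(0.001, 0.004, 0.5, NA),
  stringsAsFactors = FALSE
)

classified <- classify_genes(res)
print(classified[, c("gene", "status")])
```

Here `g1` is labeled up-regulated and `g2` down-regulated, `g3` fails both cutoffs, and `g4` stays unclassified because its padj is missing. Tightening either cutoff shrinks the up and down sets, which is why the plots need to be regenerated after the values change.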

The top left plot is whichever plot you have selected to analyze your results. Up-regulated genes are displayed in green and down-regulated genes in red. Hovering over a gene on this plot displays the bottom two plots: the gene's normalized variation, colored by condition, in the left graph, and the normalized variation between conditions in the right graph. Hovering over a gene also displays information about that gene for both of the conditions you have selected. Clicking and dragging to make a selection on the main graph creates the top right plot, a zoomed-in version of your selection. If you change any of the parameters on the left, make sure to click the ‘Submit!’ button again to update the graphs. You can also change which dataset is used in the main plots by selecting from the dataset dropdown box. Additionally, you can filter these datasets further by typing in the names of genes of interest, or a regular expression matching specific genes, to search for them within the dataset. The plots are also resizable and downloadable.

Figure 11: The main plots page within DEBrowser.
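The gene-name filter described above can be pictured as a pattern match against the gene column. The sketch below is one possible reading, not DEBrowser's code: `filter_genes` is a hypothetical helper, comma-separated patterns and case-insensitive matching are assumptions, and the gene rows are made up.

```r
# Hypothetical helper: keep rows whose gene name matches any of the
# comma-separated regular expressions in `patterns`.
filter_genes <- function(res, patterns) {
  pats <- trimws(unlist(strsplit(patterns, ",")))
  pats <- pats[nzchar(pats)]
  if (length(pats) == 0) return(res)   # empty query: no filtering
  keep <- Reduce(`|`, lapply(pats, function(p) {
    grepl(p, res$gene, ignore.case = TRUE)
  }))
  res[keep, , drop = FALSE]
}

# Made-up rows for illustration only.
res <- data.frame(
  gene = c("Fgf21", "Fgf3", "Dusp1", "Actb"),
  log2FoldChange = c(2.1, -1.4, 0.3, 0.0),
  stringsAsFactors = FALSE
)

# "^Fgf" matches every gene name starting with Fgf; "Actb" matches by name.
print(filter_genes(res, "^Fgf, Actb")$gene)
```

Anchoring a pattern with `^` or `$` avoids accidental partial matches, for example `^Fgf3$` selects only `Fgf3` and not other names that merely contain it.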