Tracks:
Features:
Files must be formatted according to UCSC guidelines. All widely used chromosome naming conventions are accepted, e.g. for human files either ‘chr1’ or ‘1’ can be used; however, these conventions should not be mixed within a single file.
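Before uploading, it can help to check that a file does not mix the two naming conventions. The sketch below is a minimal illustration and not part of SeqPlots; the function names `naming_styles` and `has_mixed_naming` are hypothetical, and the check assumes the only difference between conventions is the `chr` prefix.

```python
def naming_styles(chromosome_names):
    """Return the naming styles ('prefixed' or 'bare') seen in the names."""
    styles = set()
    for name in chromosome_names:
        # Assumption: 'chr1'-style names start with the literal prefix 'chr'.
        styles.add("prefixed" if name.startswith("chr") else "bare")
    return styles


def has_mixed_naming(chromosome_names):
    """True when both 'chr1'-style and '1'-style names occur together."""
    return len(naming_styles(chromosome_names)) > 1


print(has_mixed_naming(["chr1", "chr2", "chrX"]))  # consistent naming
print(has_mixed_naming(["chr1", "2", "chrX"]))     # mixed naming
```

The chromosome names would typically be taken from the first column of a BED or BedGraph file.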
Press the Add files button to bring up the file upload panel.
You can drag and drop files here or press the Add files... button to open a file selection menu. Before starting the upload the following mandatory information must be provided about each file:
Comments are optional.
The contents of the text field can be copied to all files by clicking the icon at the left of the field. The default values can be set using the Set defaults... button. Default values are stored in browser cookies, so the settings will be remembered across sessions as long as the same web browser is used. Files with unsupported extensions will raise an error.
Individual files can be uploaded by pressing ‘start’ next to the file name, or all files can be uploaded at once by pressing the Start upload button at the top of the file upload panel.
During the upload process a progress bar is displayed. After the upload, SeqPlots either reports that the upload was successful or gives an error message. Common errors are misformatted files or chromosome names that do not match the reference genome. For more information please refer to the Error explained chapter.
To dismiss the upload window, click on X or outside the window.
Clicking the New plot set button brings up the file collection window. The primary function of this window is to choose signal tracks and feature files to use for calculating the plots. However, it also provides basic file management capabilities. Information on files can be reviewed and files can be downloaded or deleted. Fields can be searched, filtered and sorted by any column. The red x button on the right side of the file table removes a single file from the collection, while the Remove selected files button will erase all selected files.
Pressing the New plot set button brings up the file collection window, from which you can choose signal tracks and feature files to calculate average plots and heat maps. If you wish to upload more files please refer to the adding new files documentation. The file collection window has three tabs:
Tracks
- signal files, i.e., Wiggle, BigWiggle and BedGraph files.
Features
- genomic feature files, i.e., BED, GFF and GTF files.
Sequence features
- input any motif of interest that you want to plot.
The Tracks and Features tabs display information about the files and allow you to filter and sort by any column. The “Search:” dialog allows you to find any keyword in any field, while dropdowns below the file grid allow for more advanced filtering on specific columns.
Select files by clicking on the file name or any other part of the row apart from the Show comment, Download or Remove buttons. Chosen files are highlighted in light blue. Clicking the file name again will cancel the selection. At least one signal track or motif and one feature file must be selected before starting the calculation.
Options controlling the plot settings are found below the file selection window:
Bin track @ [bp]:
- this numeric input determines the resolution of data acquisition; the default value of 10 means that 10 bp intervals within the plotting range will be summarized by calculating the mean. Higher values increase the speed of calculation, but decrease resolution. See the explanations.
Choose the plot type
- there are four options:
Point Features
- anchor the plot on the start of a feature. By default, the plot will be directional if strand information is present (i.e., use the start position and plot on the positive strand for + strand features, and use the end position and plot on the negative strand for - strand features). If strand information is not present in the feature file (or if the “ignore strand” option is chosen), the plot will use the start position of the feature and be plotted on the positive strand (see explanations). The user chooses the length of upstream and downstream sequence to plot.
Midpoint Features
- similar to point features, but the plot is centered on the midpoint of the feature.
Endpoint Features
- similar to point features, but the plot is centered on the end of the feature. Strand information is used by default to determine the end side.
Anchored Features
- features are anchored at start and stop positions and given a pseudo-length chosen by the user. Additionally, the user chooses the length of sequence upstream of the start and downstream of the end to plot.
Ignore strand
- the directionality (strand) will be ignored and all features will be plotted on the positive strand.
Ignore zeros
- signal values of 0 in the track will be excluded from calculations.
Calculate heatmap
- selecting this generates and saves a heat map matrix. Select it if you wish to generate a heatmap; uncheck it if you only wish to generate average plots, as this will speed up calculations.
Plotting distances in [bp]
- the distances to be plotted:
Upstream
- the plotting distance in base pairs upstream of the feature
Anchored
- the pseudo-length to which the features will be extended or shrunk using linear approximation (only for anchored plots)
Downstream
- the plotting distance in base pairs downstream of the feature
The Sequence features tab allows you to calculate and plot the density of any user-defined motif around the chosen genomic feature using the reference sequence package. Motif plots can be mixed with track files’ signal plots. The following options can be set:
DNA motif
- the DNA motif
Sliding window size in base pairs [bp]
- the size of the sliding window for motif calculation. The value (number of matching motifs within the window) is reported in the middle of the window, e.g. if the window is set to 200 bp, the DNA motif is “CG” and there are 8 CpGs in the first 200 bp of the chromosome, the value 8 will be reported at the 100th bp.
Display name
- The name of the motif that will be shown in the key and heatmap labels. Leave blank to use the DNA motif value.
Plot heatmap or error estimates
- this checkbox determines if the heatmap matrix and error estimates should be calculated. If unchecked, a much faster algorithm will be used for motif density calculation, but only the average plot without error estimates will be available.
Match reverse complement as well
- select if matches to the reverse complement of the motif should be reported as well. For example, with this option selected the GATA motif will report both GATA and TATC.
Clicking the Add button adds the motif to the plot set, while Reset All clears the motif selection. The right side of the motif settings panel gives a summary list of the included motifs.
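The sliding window and reverse complement options described above can be illustrated with a short sketch. This is not the SeqPlots implementation: the function names are hypothetical, and it assumes that a match is counted when its start position falls inside the window and that overlapping matches are allowed.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(motif):
    """Complement each base, then reverse the result."""
    return motif.translate(COMPLEMENT)[::-1]


def motif_starts(sequence, motif, match_revcomp=False):
    """0-based start positions of motif matches (overlaps allowed)."""
    patterns = {motif}
    if match_revcomp:
        patterns.add(reverse_complement(motif))
    width = len(motif)
    return [i for i in range(len(sequence) - width + 1)
            if sequence[i:i + width] in patterns]


def window_count(sequence, motif, window_start, window_size, match_revcomp=False):
    """Return (window midpoint, number of matches starting in the window)."""
    window_end = window_start + window_size
    starts = motif_starts(sequence, motif, match_revcomp)
    count = sum(window_start <= s < window_end for s in starts)
    return window_start + window_size // 2, count


# 8 CpGs followed by 184 adenines: a 200 bp toy sequence.
sequence = "CG" * 8 + "A" * 184
print(window_count(sequence, "CG", 0, 200))  # (100, 8)
print(reverse_complement("GATA"))            # TATC
```

Palindromic motifs such as CG or TATA are their own reverse complement, so the reverse complement option does not change their counts.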
The options are executed by pressing the Run calculation button. This dismisses the file collection window and brings up the calculation dialog, which shows the progress. On Linux and Mac OS X (systems supporting fork-based parallelization) the calculation can be stopped using the Cancel button; this will bring back all settings in the file collection window.
After successful execution the plot array and plot preview panel will appear. In case of an error, an informative pop-up will explain the problem. Please refer to the error section for further information.
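To make the Bin track, Ignore zeros and strand-aware anchoring options described earlier more concrete, here is a minimal sketch. It is an illustration under stated assumptions, not the SeqPlots code: the names `bin_means` and `anchor_position` are hypothetical, the signal is a plain list of per-base values, and a bin left empty after dropping zeros is reported as `None`.

```python
def bin_means(signal, bin_size=10, ignore_zeros=False):
    """Summarize per-base signal into consecutive bins by their mean."""
    binned = []
    for start in range(0, len(signal), bin_size):
        values = signal[start:start + bin_size]
        if ignore_zeros:
            # Mirror the "Ignore zeros" option: drop exact zeros first.
            values = [v for v in values if v != 0]
        binned.append(sum(values) / len(values) if values else None)
    return binned


def anchor_position(start, end, strand, ignore_strand=False):
    """Anchor of a point feature: start, or end for minus-strand features."""
    if strand == "-" and not ignore_strand:
        return end
    return start


print(bin_means([1, 3, 5, 7], bin_size=2))
print(bin_means([0, 4, 0, 0], bin_size=2, ignore_zeros=True))
print(anchor_position(100, 200, "-"))
```

A larger `bin_size` produces fewer values per feature, which is why higher settings calculate faster at the cost of resolution.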
This section focuses on average (line) plots and options common between these and heatmaps. For heatmap options please refer to heatmap documentation.
After calculating or loading a plot set, a plot array of checkboxes is displayed to select the desired pairs of features and tracks/motifs. Clicking on a column name (tracks/motifs) or row name (features) selects or deselects the whole column or row. Clicking on the top-left cell of the plot array toggles the selection of the whole array.
If at least one pair on the plot array is selected, pressing the Line plot button produces an average plot preview and the Heatmap button produces a heatmap preview. Alternatively, pressing the [RETURN] key will also redraw the preview using the new selection and options. These operations are done automatically in reactive mode (see the Advanced options chapter). Plots can be downloaded as PDF files using the Line plot and Heatmap buttons next to Download (at the top of the panel).
Below the plotting buttons are options for labeling plots and setting axes. On application start, the first panel is active; it opens the file upload, file management and plot set calculation dialogs. The remaining three panels hold common plot settings.
This panel groups settings influencing the plot main title, axis labels, various font sizes plus vertical and horizontal plot limits.
Title
- The main title of the plot, shown in the top-center part of the figure; default empty
X-axis label
- Label shown below the horizontal axis; default empty
Y-axis label
- Label shown next to the vertical axis; default empty
Title font size
- Font size of the title in points (1 point = ~1/72 of an inch for standard A4 output); default 20 points
Labels font size
- Font size of axis labels in points; default 16 points
Axis font size
- Controls the font size of axis ticks, that is the size of the numbers indicating position in base pairs on the X-axis and mean signal value on the Y-axis; default 14 points
Set X-axis limits
- Set hard plotting limits for the X-axis; default values are the whole range chosen during plot set calculation
Set Y-axis limits
- Set hard plotting limits for the Y-axis; default values are the range between the lowest and highest mean signal, extended by the error estimate
Controls in this panel control the display of guide lines and error estimates, and allow you to log scale the signal prior to plotting.
Transform signal
- if set to Log2 transform, performs a log2 transformation of the signal prior to plotting; the default setting is Do not transform
Show vertical guide line
- show a vertical line at point 0 (the beginning of the feature or its midpoint) and at the end of pseudo-length scaled features (only for anchored plots); turned on by default
Show horizontal guide line
- show a horizontal line at a user-determined height; turned off by default
Show error estimates
- show the standard error and 95% confidence interval as fields; if turned off, only the line representing the mean signal is shown; turned on by default
This panel groups two types of controls. Colors, Label and Priority/Order are checkboxes revealing further controls on the plot set grid, specific to a feature-track pair or sub-heatmap. Show plot key, Show error estimate key and Legend font size are global controls specific to average plots. Inputs on the plot set grid do not have specific labels, but a tooltip explaining their meaning is shown when the mouse cursor hovers over them.
Colors
- checkboxes revealing a color picker on the plot set grid. This input allows you to control the colors of specific feature-track pair average plots or sub-heatmaps. In browsers supporting the color picker (e.g. Chrome) the system dialog will show up. In other browsers (e.g. Firefox) the JavaScript color picker will be initialized.
Label
- checkboxes revealing a label text input on the plot set grid. This controls the names shown in the key of average plots or the heatmap top labels.
Priority/Order
- checkboxes revealing a numeric input on the plot set grid. These numbers determine the order of average plots and heatmaps. The feature-track pair with the highest priority will be listed at the top of the key for average plots and left-most for heatmaps.
Show plot key
- shows the key giving the color to feature-track pair label mapping. If turned on, an additional drop-down allows you to choose its position on the plot, top-right by default
Show error estimate key
- shows the key explaining the meaning of error fields. If turned on, an additional drop-down allows you to choose its position on the plot, top-left by default
Legend font size
- set the size of the font used to plot the keys; 12 by default

Heatmaps
========

Heatmaps are often more informative than average plots. If there is variability in signal along individual instances of a given genomic feature (e.g., because there are different biological classes), an average plot might not represent the behavior of any individual feature and could even give a misleading picture. SeqPlots plots track-feature pairs as sub-heatmaps horizontally aligned on a single figure. All sub-heatmaps must have the same number of data rows, hence in single plot mode simultaneous plotting is possible only on single features or feature files containing exactly the same number of rows. The heatmaps can be sorted and clustered by k-means, hierarchical clustering or super self-organising maps (SuperSOM).
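The requirement that all sub-heatmaps share the same number of rows follows from the way ordering works: one row order is computed and then applied to every sub-heatmap. The sketch below illustrates this for sorting by mean signal. It is a simplified illustration rather than the SeqPlots implementation; the function names are hypothetical, each sub-heatmap is a list of rows, and `include` marks which sub-heatmaps contribute to the ordering.

```python
def row_means(sub_heatmaps, include):
    """Mean of each row across the columns of all included sub-heatmaps."""
    selected = [m for m, keep in zip(sub_heatmaps, include) if keep]
    means = []
    for i in range(len(selected[0])):
        values = [v for matrix in selected for v in matrix[i]]
        means.append(sum(values) / len(values))
    return means


def sorting_order(sub_heatmaps, include=None, decreasing=True):
    """Row indices ordered by mean signal; shared by all sub-heatmaps."""
    if include is None:
        include = [True] * len(sub_heatmaps)
    means = row_means(sub_heatmaps, include)
    return sorted(range(len(means)), key=lambda i: means[i], reverse=decreasing)


def apply_order(matrix, order):
    """Reorder the rows of one sub-heatmap with the shared order."""
    return [matrix[i] for i in order]


a = [[1, 1], [5, 5], [3, 3]]
b = [[9, 9], [0, 0], [0, 0]]
order = sorting_order([a, b], include=[True, False])  # sort on `a` only
print(order)
print(apply_order(b, order))  # `b` follows the order derived from `a`
```

Excluding a sub-heatmap from the ordering changes which values drive the sort, but the excluded matrix is still drawn with the resulting row order.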
This tab has heatmap-specific options for data processing and display.
Sort heatmap rows by mean signal
- sorts the heatmap rows based on the mean value of each row across the included sub-heatmaps. Can be set to increasing or decreasing order. Turned off by default.
Clustering algorithm
- choose the clustering algorithm (k-means, hierarchical or SuperSOM). If clustering is not desired, choose do not cluster, which keeps the features in the order of the uploaded feature file. K-means by default.
Make cluster calculation repeatable
- ensures that clustering with non-deterministic algorithms, like k-means or SuperSOM, will generate the same results as the most recently plotted heatmap. This is achieved by re-using the R random number generator seed.
Plot selected cluster
- this option is available only if Make cluster calculation repeatable is turned on. It allows you to select one of the clusters and zoom it to the whole plot height. All clusters are plotted by default.
Choose individual heatmaps for sorting/clustering
- this checkbox brings up a new control panel on the plot set grid to determine if a given sub-heatmap should be included in plotting and/or clustering. The excluded sub-plots will be plotted in the order of the other sub-heatmaps, but their values will not influence the clustering/sorting. By default all sub-heatmaps are included.
Heatmaps have individual color keys
- by default all sub-heatmaps have their own color keys. This option determines whether each sub-heatmap should have a separate color key (plotted below the heatmap) or a single, common key should be calculated for all sub-plots (plotted rightmost). The example below shows the difference between separate (left) and common (right) color keys:
Set default color key limits
- this option determines the limits used when mapping numerical values to colors. The range of colors generated depends on these options. Values lower or higher than the given limits will be plotted in the limit value color. If this checkbox is selected, limits are auto-generated using the Color key scaling parameter. If this option is turned off, two numeric fields, min and max, are shown to set the limits manually.
Color key scaling
- this slider influences how color key limits are generated. For example, 0.01 (default value) calculates limits using data ranging from the 1st to the 99th percentile of available data points, while 0.1 uses data ranging from the 10th to the 90th percentile. The general formula for the limits is: [quantile(data, Color key scaling); quantile(data, 1 - Color key scaling)]
min and max numeric inputs
- enter values to manually specify color key limits as numeric values.
Set individual color key limits
- this option is similar to the manual setup of color key limits, but allows one to specify different values for individual sub-heatmaps. When this checkbox is selected, min and max numeric inputs are shown on the plot set grid.
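The limit formula above can be worked through in a few lines. The sketch below is only an illustration: the function names are hypothetical, and it assumes linear-interpolation quantiles (the default method of R's `quantile()`), which may differ in detail from what SeqPlots computes internally. The `clamp` helper shows how values outside the limits take the limit value, and therefore the limit color.

```python
def quantile(sorted_values, p):
    """Quantile by linear interpolation between neighbouring order statistics."""
    position = (len(sorted_values) - 1) * p
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


def color_key_limits(data, scaling=0.01):
    """[quantile(data, scaling); quantile(data, 1 - scaling)] as a tuple."""
    ordered = sorted(data)
    return quantile(ordered, scaling), quantile(ordered, 1 - scaling)


def clamp(value, limits):
    """Values beyond the limits are drawn with the limit value's color."""
    low, high = limits
    return max(low, min(high, value))


data = list(range(101))            # toy signal values 0..100
limits = color_key_limits(data, scaling=0.1)
print(limits)                      # roughly the 10th and 90th percentiles
print(clamp(250, limits), clamp(-5, limits))
```

A larger scaling value narrows the limits, so more of the extreme values saturate at the ends of the color key.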
Set colorspace
- This input allows you to use color palettes from the RColorBrewer package, which replace the default color palette for heatmaps. Selecting the Reverse checkbox uses the reversed color palette. Here are the available color palettes (click here to see an example heatmap plotted with different color palettes):
When the Custom option is selected, three color pickers are shown to set up custom color mappings for heatmaps. The following example shows standard jet colors (left), default blue color mapping after selecting the checkbox (middle) and custom color selection (right):
The heatmap output shares many display options with other tabs. Here we provide a list of these inputs; please refer to “Viewing and manipulating plots” for further reference.
X-axis label
- Label shown below the horizontal axis, drawn separately for each sub-heatmap; default empty
Y-axis label
- Label shown next to the vertical axis, drawn separately for each sub-heatmap; default empty
Labels font size
- Font size for axis labels and main labels of sub-heatmaps; default 16 points
Axis font size
- Controls the font size of axis ticks; default 14 points
Set X-axis limits
- Set hard plotting limits for the X-axis; default values are the whole range chosen during plot set calculation
Transform signal
- if set to Log2 transform, performs a log2 transformation of the signal prior to plotting; the default setting is Do not transform
Show vertical guide line
- show a vertical line at point 0 (the beginning of the feature or its midpoint) and at the end of pseudo-length scaled features (only for anchored plots); turned on by default
Colors
- for heatmaps this input allows you to control the color mapping of specific sub-heatmaps. The map starts with white (for the low color key limit) and finishes with the selected color (for the high color key limit).
Label
- allows you to set up custom sub-heatmap top labels
Priority/Order
- Use this to place heatmaps in your desired order. The feature-track pairs with the highest priority will be plotted as the left-most sub-heatmaps.
Legend font size
- controls the font size of the common color key; inactive if heatmaps have individual color keys; 12 by default
Plots can be downloaded as PDFs by clicking the Line plot or Heatmap buttons in the “Download:” section of the tool panel (above the plot preview).
The small buttons next to Line plot and Heatmap produce additional output files:
i button next to Line plot downloads the PDF containing average plot keys
cluster diagram button next to Heatmap downloads a cluster report giving cluster assignments and sorting order for each feature as a comma-separated value (CSV) spreadsheet.
The cluster report contains the following columns:
chromosome
- the name of the chromosome, contig or scaffold
start
- start of the feature (1-based chromosomal coordinate)
end
- end of the feature (1-based chromosomal coordinate)
width
- width of the feature in base pairs
strand
- strand of the feature
metadata_...
- annotation columns present in the original GFF/BED, e.g. gene name, score, group
originalOrder
- number of the feature (row) in the GFF/BED; can be used to restore the original order after sorting on cluster ID
ClusterID
- the numeric ID of the cluster. The topmost cluster on the heatmap is annotated with 1, and the bottom cluster with k, where k equals the number of clusters selected; exported only if clustering is enabled
SortingOrder
- the order imposed on the heatmap by sorting by mean row values; exported only if sorting is enabled
FinalOrder
- the final order of the heatmap’s rows, which can be influenced by sorting and clustering; 1 indicates the topmost row
Sample report:
chromosome start end width strand metadata_group originalOrder ClusterID SortingOrder FinalOrder
chrI 9065087 9070286 5200 + g1 1 1 3 3
chrI 5171285 5175522 4238 - g1 2 3 50 43
chrI 9616508 9618109 1602 - g1 3 3 13 43
chrI 3608395 3611844 3450 + g1 4 3 11 12
Table view:
chromosome | start | end | width | strand | metadata_group | originalOrder | ClusterID | SortingOrder | FinalOrder |
---|---|---|---|---|---|---|---|---|---|
chrI | 9065087 | 9070286 | 5200 | + | g1 | 1 | 1 | 3 | 3 |
chrI | 5171285 | 5175522 | 4238 | - | g1 | 2 | 3 | 50 | 43 |
chrI | 9616508 | 9618109 | 1602 | - | g1 | 3 | 3 | 13 | 43 |
chrI | 3608395 | 3611844 | 3450 | + | g1 | 4 | 3 | 11 | 12 |
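A downloaded cluster report can be processed with any CSV reader. The sketch below uses the four sample rows shown above; it assumes the file is comma-separated with the header row listed in the column description, and the helper names are hypothetical. It selects the features of one cluster and restores the original feature order via `originalOrder`.

```python
import csv
import io

# The sample report rows, written out as comma-separated text (assumed layout).
REPORT = """chromosome,start,end,width,strand,metadata_group,originalOrder,ClusterID,SortingOrder,FinalOrder
chrI,9065087,9070286,5200,+,g1,1,1,3,3
chrI,5171285,5175522,4238,-,g1,2,3,50,43
chrI,9616508,9618109,1602,-,g1,3,3,13,43
chrI,3608395,3611844,3450,+,g1,4,3,11,12
"""


def read_report(handle):
    """Read the report into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(handle))


def features_in_cluster(rows, cluster_id):
    """Rows assigned to the given numeric cluster ID."""
    return [row for row in rows if int(row["ClusterID"]) == cluster_id]


def restore_original_order(rows):
    """Sort rows back into the order of the uploaded feature file."""
    return sorted(rows, key=lambda row: int(row["originalOrder"]))


rows = read_report(io.StringIO(REPORT))
by_final = sorted(rows, key=lambda row: int(row["FinalOrder"]))
print([row["start"] for row in features_in_cluster(rows, 3)])
print([row["originalOrder"] for row in restore_original_order(by_final)])
```

For a real download, `io.StringIO(REPORT)` would be replaced by an open file handle.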
The last tab (Batch operation and setup) on the tool panel includes batch operations and various other settings, including the PDF output size. By default the output PDF will be A4 landscape. This can be changed using the drop-down list to the following settings:
user defined
- this option reveals two numeric inputs that allow you to set the output PDF width and height. The values must be given in inches.
Legal rotated
- US Legal landscape: 14" × 8.5"
A4
- A4 portrait: 8.27" × 11.69"
Letter
- US Letter portrait: 8.5" × 11"
Legal
- US Legal portrait: 8.5" × 14"
Executive
- a.k.a. Monarch paper: 7.25" × 10.5"
Controls to plot multiple plots at once are located on the Batch operation and setup tab, just below the PDF paper options. It is possible to output the plots to a multipage PDF, plot an array of plots on a single page (for average plots) or mix these options together.
The first drop-down controls the type of the plot, either average or heatmap. The second drop-down determines the strategy used to traverse the plot grid. The options include:
single
- every single feature-track pair will be plotted on a separate plot
rows
- the plot grid will be traversed by rows, which means one plot that contains all tracks will be prepared per feature
columns
- the plot grid will be traversed by columns, which means one plot that contains all features will be prepared per track
The multi plot grid option controls how many plots will be placed on each page of the PDF output, e.g. 1x1 means one plot per page, while 3x4 means 3 columns and 4 rows of plots. If the number of plots exceeds the number of slots on a page, a new page will be added to the PDF.
Filter names will apply a filter to plot titles, which are based on uploaded file names. For example, if you uploaded 100 files starting with a prefix of “my_experiment_”, you can remove this fragment from each plot title and/or heatmap caption by putting this string in Filter names.
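The batch options described above combine in a predictable way, which the following sketch illustrates. It is not the SeqPlots code: the function names are hypothetical, features are taken as the rows and tracks as the columns of the plot grid, and Filter names is assumed to remove a literal fragment from each title.

```python
from math import ceil


def batch_plots(features, tracks, strategy="single"):
    """Group feature-track pairs into plots for the given traversal strategy."""
    if strategy == "single":
        return [[(f, t)] for f in features for t in tracks]
    if strategy == "rows":      # one plot per feature, containing all tracks
        return [[(f, t) for t in tracks] for f in features]
    if strategy == "columns":   # one plot per track, containing all features
        return [[(f, t) for f in features] for t in tracks]
    raise ValueError("unknown strategy: " + strategy)


def pages_needed(n_plots, grid_columns, grid_rows):
    """Number of PDF pages for a multi plot grid of columns x rows slots."""
    return ceil(n_plots / (grid_columns * grid_rows))


def filter_name(title, fragment):
    """Remove a literal fragment from a plot title (assumed behaviour)."""
    return title.replace(fragment, "") if fragment else title


features = ["genes.bed", "enhancers.bed"]
tracks = ["my_experiment_A.bw", "my_experiment_B.bw", "my_experiment_C.bw"]
plots = batch_plots(features, tracks, strategy="single")
print(len(plots), pages_needed(len(plots), 3, 4))
print(filter_name(tracks[0], "my_experiment_"))
```

With two features and three tracks, the single strategy yields six plots, rows yields two and columns yields three; a 3x4 grid holds up to twelve plots per page.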
Finally, pressing Get PDF produces the final output file. Please see the example below: