After performing the differential network analysis, you
can explore several gene set properties on
the "Further analysis" tab.
Below, we explain the available tools for exploring the
gene sets.
Gene set selection
To analyze a gene set, select it from the list of gene sets
on the "Further analysis" tab.
The following options are available for filtering
the lists of gene sets:
- All gene sets with p-values less than the threshold: the gene sets whose
nominal p-value from the network comparison test is below the selected threshold.
This option is visible only after the differential network analysis finishes.
- All gene sets with q-values less than the threshold: the gene sets whose
differential network test p-value, adjusted for multiple testing by the
False Discovery Rate method (Benjamini and Hochberg, 1995), is below the selected threshold.
This option is visible only after the differential network analysis finishes.
- All tested gene sets: the gene sets that were tested in
the differential network analysis. This option
is visible only after the differential network analysis finishes.
- All filtered gene sets: the gene sets that were filtered
according to their size, as set on the left side panel. This option
is visible only if the differential network analysis has not
been performed yet.
- All loaded gene sets: all gene sets of the
uploaded gene set collection.
After selecting a gene set, choose a tab in the "Further analysis" section
to analyze the selected set.
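The q-value filter described above relies on the Benjamini-Hochberg adjustment. The following is a minimal Python sketch of that adjustment and of the two threshold filters; it is an illustration only, not CoGA's own code. The gene set names and p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    m = len(p_values)
    # Visit the p-values from largest to smallest.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # 1-based rank in ascending order
        # Scale by m / rank and enforce monotonicity with a running minimum.
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical nominal p-values for four gene sets.
p_values = {"setA": 0.01, "setB": 0.04, "setC": 0.03, "setD": 0.20}
names = list(p_values)
q_values = dict(zip(names, benjamini_hochberg([p_values[n] for n in names])))

threshold = 0.05
selected_by_p = [n for n in names if p_values[n] < threshold]
selected_by_q = [n for n in names if q_values[n] < threshold]
print(selected_by_p, selected_by_q)
```

With these made-up values, three gene sets pass the nominal p-value filter but only one passes the q-value filter, which is why the two options can give different lists.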
Network visualization plots
Given a gene set name, the "Network visualization plots" tab in the "Further analysis"
section provides tools to visually inspect the networks and the differences between
the two phenotypes.
Below, we describe the available tools.
Plot settings
- Color selection: select a color scheme for the plots.
- Plot format: select a file format to save the plots.
- Plot dimensions: set the plot dimensions in inches (PDF format)
or pixels (PNG and JPG formats).
Network visualization
The network visualization tool plots one association matrix for
each phenotype.
The association matrix contains the association
degree of each pair of genes from the selected gene set. The
association degrees are measured from the expression data
using the method selected on the left sidebar
("Method for network inference"). The degrees vary between 0 and 1 and
are represented by colors.
If the "absolute correlation" option is selected on the sidebar, you can visualize
either absolute or signed correlations by setting the "Show negative correlations"
option. To save the plots, click the "Save class 1 network plot"
and "Save class 2 network plot" buttons.
You can check the value of each association degree
in the "Association between two gene products"
section.
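As an illustration of what an association matrix holds, the sketch below computes pairwise Pearson correlations for one phenotype, either as absolute values (between 0 and 1) or with their sign. Pearson correlation is assumed here as one possible inference method; the gene names and expression values are hypothetical, and this is not CoGA's implementation.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def association_matrix(expression, absolute=True):
    """Map each gene pair to its (absolute or signed) correlation."""
    genes = list(expression)
    matrix = {}
    for g in genes:
        for h in genes:
            r = pearson(expression[g], expression[h])
            matrix[(g, h)] = abs(r) if absolute else r
    return matrix


# Hypothetical expression levels of three genes across four samples.
expression_class1 = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
absolute = association_matrix(expression_class1, absolute=True)
signed = association_matrix(expression_class1, absolute=False)
print(absolute[("geneA", "geneC")], signed[("geneA", "geneC")])
```

In this toy data, geneA and geneC are perfectly anti-correlated, so the absolute matrix shows a strong association while the signed matrix reveals its negative direction.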
Differences between the gene networks
To visualize the differences between the networks,
you can plot a matrix of the differences.
Each position of the matrix shows the difference
in the correlation (association degree) of two gene products between
the two phenotypes.
You can choose one of the following options:
- [class1] - [class2]: class1 association degree matrix
minus class2 association degree matrix.
- [class2] - [class1]: class2 association degree matrix
minus class1 association degree matrix.
- Absolute differences between the association degrees:
matrix of the absolute values of the differences
between the association degrees.
To plot the selected matrix of differences,
expand the "Matrix of differences" collapsible panel. You can save
the plot by clicking the "Save plot" button.
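The three difference options can be sketched as follows. This is a minimal Python illustration with hypothetical 2 x 2 association matrices; the function name and the mode labels are assumptions made for the example, not CoGA identifiers.

```python
def difference_matrix(class1, class2, mode="class1-class2"):
    """Element-wise difference between two association matrices."""
    if mode not in ("class1-class2", "class2-class1", "absolute"):
        raise ValueError("unknown mode: " + mode)
    result = []
    for row1, row2 in zip(class1, class2):
        row = []
        for a, b in zip(row1, row2):
            diff = a - b
            if mode == "class2-class1":
                diff = -diff
            elif mode == "absolute":
                diff = abs(diff)
            row.append(diff)
        result.append(row)
    return result


# Hypothetical association degrees for two genes in each phenotype.
class1 = [[1.0, 0.75], [0.75, 1.0]]
class2 = [[1.0, 0.25], [0.25, 1.0]]

forward = difference_matrix(class1, class2, "class1-class2")
backward = difference_matrix(class1, class2, "class2-class1")
magnitude = difference_matrix(class1, class2, "absolute")
print(forward, backward, magnitude)
```

The first two options differ only in sign; the absolute option discards the direction and keeps only the size of the change.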
To see the differences in association degree for
each pair of genes,
expand the "List of gene association degrees" collapsible panel.
You can save the list as a CSV or R data file:
- CSV: a comma-separated text file containing the
association degrees between the gene products.
It can be opened with spreadsheet software.
- R data: a file that contains an R data frame called
associationDegrees. It
can be loaded using the "load" command in the R console.
Gene set properties
To inspect the network features of each phenotype, select one
of the options described below.
Network features for unweighted graphs:
- Spectral entropy: measures the complexity of the
topological organization of the graph (Takahashi et al., 2012).
- Average degree centrality: the degree of a node is the number of edges
that connect to it. The average degree centrality is the sum of all
node degrees divided by the number of vertices.
- Average betweenness centrality: the betweenness centrality of a node is
the number of shortest paths going through it (Freeman, 1979). The average
betweenness centrality is the sum of all
node betweenness centralities divided by the number of vertices.
- Average closeness centrality: the closeness centrality of a node is the
inverse of the average length of the shortest paths between it and all
the other vertices in the graph (Freeman, 1979).
The average closeness centrality is the sum of all
node closeness centralities divided by the number of vertices.
- Average eigenvector centrality: the eigenvector centrality of a node
v_i is the i-th entry of the first (leading) eigenvector
of the graph adjacency matrix (Bonacich, 1987).
The average eigenvector centrality is the sum of all
node eigenvector centralities divided by the number of vertices.
- Average clustering coefficient: the local clustering coefficient of a node
is the number of edges between the
vertices within its neighborhood divided by the number of edges that could
exist among them (Watts and Strogatz, 1998).
The average clustering coefficient is the sum of all
node local clustering coefficients divided by the number of vertices.
- Average shortest path length:
the average of the shortest path
lengths over all pairs of nodes v_i and
v_j with i ≠ j.
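To make some of these definitions concrete, the sketch below computes the average degree centrality, the average clustering coefficient, and the average shortest path length of a small unweighted graph. It is an illustrative Python example on a hypothetical four-gene network, not CoGA's code; it assumes a connected graph and assigns a clustering coefficient of 0 to nodes with fewer than two neighbors.

```python
from collections import deque


def average_degree(adjacency):
    return sum(len(neighbors) for neighbors in adjacency.values()) / len(adjacency)


def local_clustering(adjacency, node):
    neighbors = list(adjacency[node])
    k = len(neighbors)
    if k < 2:
        return 0.0  # convention assumed for this sketch
    # Count the edges that exist among the neighbors of the node.
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if neighbors[j] in adjacency[neighbors[i]]
    )
    return links / (k * (k - 1) / 2)


def average_clustering(adjacency):
    return sum(local_clustering(adjacency, n) for n in adjacency) / len(adjacency)


def bfs_distances(adjacency, source):
    """Shortest path lengths from source to every reachable node."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


def average_shortest_path_length(adjacency):
    total = 0
    pairs = 0
    for source in adjacency:
        for target, distance in bfs_distances(adjacency, source).items():
            if target != source:
                total += distance
                pairs += 1
    return total / pairs


# Hypothetical network: a triangle (a, b, c) with gene d attached to c.
graph = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}

avg_degree = average_degree(graph)
avg_clustering = average_clustering(graph)
avg_path = average_shortest_path_length(graph)
print(avg_degree, avg_clustering, avg_path)
```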
Network features for weighted graphs:
- Spectral entropy: replaces the usual adjacency matrix by the
weighted adjacency matrix, and then computes the spectral
entropy as for unweighted networks.
- Average degree centrality: CoGA generalizes the degree of a node
to the sum of the weights of the edges that connect to it (Barrat, 2004).
It replaces the usual node degree by the weighted degree, and
then computes the average degree centrality as for unweighted networks.
- Average eigenvector centrality: replaces the usual adjacency matrix by the
weighted adjacency matrix, and then computes the average eigenvector centrality
as for unweighted networks (Newton, 2004).
- Average clustering coefficient: replaces the local clustering coefficient
of a node by the sum of the weights of the edges between the vertices within
its neighborhood divided by the number of edges that could exist among
them (Lopez-Fernandez et al., 2004). Then it computes the
average clustering coefficient as for unweighted networks.
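The weighted generalizations of the degree and of the clustering coefficient described above can be sketched in Python as follows. The edge weights are hypothetical association degrees, and the neighborhood of a node is assumed here to be every node connected to it by a nonzero weight; this is an illustration, not CoGA's implementation.

```python
def weighted_degree(weights, node):
    """Sum of the weights of the edges that connect to the node."""
    return sum(weights[node].values())


def weighted_local_clustering(weights, node):
    """Sum of edge weights among the neighbors over the number of possible edges."""
    neighbors = list(weights[node])
    k = len(neighbors)
    if k < 2:
        return 0.0  # convention assumed for this sketch
    total = sum(
        weights[neighbors[i]].get(neighbors[j], 0.0)
        for i in range(k)
        for j in range(i + 1, k)
    )
    return total / (k * (k - 1) / 2)


# Hypothetical weighted network of three genes (symmetric weights).
weights = {
    "a": {"b": 0.5, "c": 0.25},
    "b": {"a": 0.5, "c": 0.75},
    "c": {"a": 0.25, "b": 0.75},
}

avg_weighted_degree = sum(weighted_degree(weights, n) for n in weights) / len(weights)
avg_weighted_clustering = (
    sum(weighted_local_clustering(weights, n) for n in weights) / len(weights)
)
print(avg_weighted_degree, avg_weighted_clustering)
```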
Gene scores
To rank the genes according to their "importance" in the networks,
select one of the options below.
Gene scores for unweighted graphs:
- Degree centrality: the degree of a node is the number of edges
that connect to it.
- Betweenness centrality: the betweenness centrality of a node is
the number of shortest paths going through it (Freeman, 1979).
- Closeness centrality: the closeness centrality of a node is the
inverse of the average length of the shortest paths between it and all
the other vertices in the graph (Freeman, 1979).
- Eigenvector centrality: the eigenvector centrality of a node
v_i is the i-th entry of the first (leading) eigenvector
of the graph adjacency matrix (Bonacich, 1987).
- Local clustering coefficient: the local clustering coefficient of a node
is the number of edges between the
vertices within its neighborhood divided by the number of edges that could
exist among them (Watts and Strogatz, 1998).
Gene scores for weighted graphs:
- Degree centrality: CoGA generalizes the degree of a node
to the sum of the weights of the edges that connect to it (Barrat, 2004).
- Eigenvector centrality: replaces the usual adjacency matrix by the
weighted adjacency matrix, and then computes the eigenvector centrality
as for unweighted networks (Newton, 2004).
- Local clustering coefficient: generalizes the local clustering coefficient
of a node to the sum of the weights of the edges between the vertices within
its neighborhood divided by the number of edges that could exist among
them (Lopez-Fernandez et al., 2004).
You can save the gene scores as a CSV or R data file:
- CSV: a comma-separated text file containing the gene scores. It can be opened with spreadsheet software.
- R data: a file that contains an R data frame called
geneScores. It
can be loaded using the "load" command in the R console.
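As an example of how a gene score produces a ranking, the sketch below approximates eigenvector centrality by power iteration and sorts the genes by score. It is a Python illustration on a hypothetical network, not CoGA's code. Iterating with the adjacency matrix plus the identity is a choice made for this sketch: it leaves the eigenvectors unchanged and avoids oscillation on bipartite graphs.

```python
import math


def eigenvector_centrality(adjacency, iterations=200, tolerance=1e-12):
    """Approximate the leading eigenvector of the adjacency matrix."""
    nodes = list(adjacency)
    scores = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Multiply by (A + I): keep the node's own score and add its neighbors'.
        updated = {n: scores[n] + sum(scores[m] for m in adjacency[n]) for n in nodes}
        norm = math.sqrt(sum(v * v for v in updated.values()))
        updated = {n: v / norm for n, v in updated.items()}
        change = max(abs(updated[n] - scores[n]) for n in nodes)
        scores = updated
        if change < tolerance:
            break
    return scores


# Hypothetical network: a triangle (a, b, c) with gene d attached to c.
graph = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}

scores = eigenvector_centrality(graph)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this toy network, gene c is connected to every other gene and receives the highest score, while gene d, attached only to c, ranks last.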
Gene expression analysis
Below, we describe the available single gene differential analysis tools.
Gene expression heatmap
The gene expression heatmap represents the gene expression levels by
colors. Each column corresponds to a gene of the selected gene set,
and each row represents one sample.
You can select the colors of the heatmap and set its
clustering options. The rows or columns of the heatmap
will be clustered according to the Euclidean distance.
You can set the heatmap dimensions and save it as
a PDF, PNG or JPG file.
CoGA uses the pheatmap CRAN package to plot the heatmaps.
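The Euclidean distance used for clustering the heatmap rows or columns can be illustrated with a short Python sketch. The sample names and expression profiles are hypothetical, and the example covers only the distance computation, not the hierarchical clustering itself.

```python
import math


def distance_matrix(profiles):
    """Euclidean distance between every pair of expression profiles."""
    names = list(profiles)
    return {
        (a, b): math.dist(profiles[a], profiles[b])
        for a in names
        for b in names
    }


# Hypothetical expression profiles (two genes) for three samples.
samples = {"s1": [0.0, 0.0], "s2": [3.0, 4.0], "s3": [6.0, 8.0]}

distances = distance_matrix(samples)
print(distances[("s1", "s2")], distances[("s1", "s3")])
```

Samples separated by small distances are the ones a clustering procedure would tend to group first.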
Tests for differential expression
CoGA tests the difference in average or median expression levels
between the two phenotypes for each gene of the selected gene set. These analyses use the "t.test" and "wilcox.test" R functions.
The figure below shows the table of results:
These results can be saved as a CSV or R data file:
- CSV: a comma-separated text file containing the differential
analysis statistics and p-values. It can be opened with spreadsheet software.
- R data: a file that contains an R data frame called
"diffExpressionAnalysis". It
can be loaded using the "load" command in the R console.
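To show what the two tests measure for a single gene, the sketch below computes a Welch t statistic with its degrees of freedom and a rank-based W statistic in plain Python. It assumes the unequal-variance (Welch) form of the t-test, which is the default of the R "t.test" function; whether CoGA changes that default is not stated here. The expression values are hypothetical, and p-values are omitted because they require distribution functions outside the Python standard library.

```python
import math
from statistics import mean, variance


def welch_t(x, y):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom."""
    nx, ny = len(x), len(y)
    vx, vy = variance(x), variance(y)  # sample variances
    se2 = vx / nx + vy / ny
    t = (mean(x) - mean(y)) / math.sqrt(se2)
    df = se2 ** 2 / ((vx / nx) ** 2 / (nx - 1) + (vy / ny) ** 2 / (ny - 1))
    return t, df


def rank_sum_w(x, y):
    """Number of (x, y) pairs with x > y, counting ties as one half."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


# Hypothetical expression levels of one gene in each phenotype.
class1 = [1.0, 2.0, 3.0, 4.0, 5.0]
class2 = [2.0, 4.0, 6.0, 8.0, 10.0]

t_statistic, degrees_of_freedom = welch_t(class1, class2)
w_statistic = rank_sum_w(class1, class2)
print(t_statistic, degrees_of_freedom, w_statistic)
```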
Gene expression boxplot
To visualize the distribution of the gene expression levels
in each phenotype, just select a gene from the list on the
"Gene expression boxplot" section.
You can set the plot dimensions and save it as a PDF, PNG or
JPG file.
CoGA boxplots are built with the ggplot2 package.