TraRe can be currently installed from Bioconductor:
if (!requireNamespace("BiocManager")) install.packages("BiocManager") BiocManager::install("TraRe")
To fully understand how TraRe works, we will go through each tool it contains:
Along with every step description and explanation, we will be working with a subset of the Stand Up To Cancer (SU2C) clinical trial dataset containing 1.2k genes (\(\approx\) 200 transcription factors or driver genes) and 121 samples from metasatic castration resistant prostate cancer patients. Patients were divided into two groups depending on the treatment: Abiraterone (ABI) and ABI/ENZA (ARSI). For the phenotype dependent Rewiring, we will be focusing on only ARSI patients (35 samples). The pathway we will follow is:
Run LINKER using the available gene expression matrix (GEM), 1149 genes and 121 samples, to generate all possible modules (GRN).
Run Rewiring test selecting the sample’s (ARSI) phenotype (35 samples) to extract LINKER modules which expression can be separated very well according to the selected phenotype. Selected modules will be scored based on a hypergeometric test.
Run Visualization, generating GRN graphs of how the regulation process may change depending on the phenotype condition. (This step is included at the end of the previous one).
Run Results, that generates an excel containing information about the driver-target relationships from the previously generated GRN graphs, and a brief summary that may be useful for biological validation of GRN. (This step is also included at the end of Rewiring).
The dataset we are working with is located at TraRe’s package folder. The complete SU2C dataset contains about 13000 genes, but we will be using this subset in order to fulfill the space requirements of the package. We consider important to use a real dataset for a proper understanding of TraRe’s results.
TraRe mainly require three types of gene-related files:
Gene expression matrix: We will be working with a log-applied and MRM-normalized gene expression matrix, lognorm_est_counts.
Gene info dataframe: As information about transcription factors is required for these methods, we will use a dataframe containing a boolean variable with this information. ‘1’s for the driver genes, ’0’s for the non-driver genes. This variable will be called ’regulator’.
Phenotype dataframe: As our GRN are phenotype dependant, this file will indicate whether samples are Responders ‘R’ or NonResponders ‘NR’.
Rewiring step performs a permutation test over a certain condition to infer if that condition is producing any deregulation on our generated GRN. Bootstrapping plays an important role, as the non-convex nature of these biological events makes necessary to ensure that a certain behavior is repeated across bootstraps, and to confirm this event does not come from a particular realization. As bootstrapping has been performed in LINKER, this step will take advantage of it and will try to group highly scored modules, to infer GRNs with similar behavior across bootstraps. It will output a folder containing :
A report containing statistical information about the module’s GRN, driver genes, target genes, pvalues, etc.
In order to run Rewiring test, Trare provides two functions: the
preparerewiring() function and the
runrewiring() function. The first one requires:
paste()to the name folder.
There are also other parameters that will remain by default. Please take
a look at the
preparerewiring() function for more information. We now generate
Note that here we will include the previously generated LINKER’s output. These files can be found in TraRe’s package folder.
# In order to prepare the rewiring, we can provide individual file paths: lognorm_est_counts_p <- paste0(system.file("extdata",package="TraRe"), '/VignetteFiles/expression_rewiring_vignette.txt') gene_info_p <- paste0(system.file("extdata",package="TraRe"), '/geneinfo_rewiring_example.txt') linker_output_p <- paste0(system.file("extdata",package="TraRe"), '/VignetteFiles/linker_output_vignette.rds') phenotype_p <- paste0(system.file("extdata",package="TraRe"), '/phenotype_rewiring_example.txt') # Or we can provide a SummarizedExperiment containing expression matrix, # gene info (as rowData) and phenotype info (colData) plus the linker output. # We are going to reload geneinfo, this time having old geneinfo rownames # as a new column, as preparerewiring will place that column into rownames, # so we dont have to use the row.names=1 in the read.delim geneinfo <- utils::read.delim(gene_info_p) # We again make sure they have same features (genes) geneinfo <- geneinfo[geneinfo$uniq_isos%in%rownames(lognorm_est_counts),] # We also load phenotype phenotype<- utils::read.delim(phenotype_p) # select only samples from the phenotype file. lognorm_est_counts <- lognorm_est_counts[,phenotype$Sample.ID] PrepareSEObject <- SummarizedExperiment::SummarizedExperiment(assays=list(counts=lognorm_est_counts), rowData = geneinfo, colData = phenotype) # This file is stored in the VignettesFiles TraRe's package folder. PrepareSEObject_p <- paste0(system.file("extdata",package="TraRe"), '/VignetteFiles/SEObject.rds') # We can now generate the prepare rewiring file: # With individual paths #prepared <- preparerewiring(name="Vignette",linker_output_p = linker_output_p, # lognorm_est_counts_p = lognorm_est_counts_p,gene_info_p = gene_info_p, # phenotype_p=phenotype_p,final_signif_thresh=0.01) # or with a SummarizedExperiment object prepared <- preparerewiring(name="Vignette",linker_output_p = linker_output_p, SEObject_p = PrepareSEObject_p,final_signif_thresh=0.01) ## Expression Matrix Size: (1149,35) ## Gene Info Table Size: (1149,2) ## NumGenes Kept: 1149 ## NumRegs and NumTargs: [70,1079] ## Phenotype Table Size: (35,2) ## ## Sample Names: [MO_1084.Tumor|PRAD.01115550.Tumor.SM.A4KNI|SC_9050.Tumor|TP_2078_Tumor ## PROS12319B.SU2C.06115116.Tumor.SM.4W2NB|MO_1410.Tumor|SC_9175_T ## PROS01448.6115251.Tumor.SM.67ES6|SC_9057.Tumor|MO_1241.Tumor ## PROS01448.1115161.Tumor.SM.5SGU1|TP_2060.TM|SC_9183_T ## MO_1336.TM|TP_2081_T|PRAD.01115549.Tumor.SM.A4KNH ## MO_1176.Tumor|MO_1179.Tumor|MO_1510_T ## PRAD.01115468.Tumor.SM.6B2KE|SC_9129_Tumor|PM158.TM ## SC_9137_Tumor|PRAD.6115594.0.Tumor.SM.B2XRW|PROS01448.6115234.Tumor.SM.67ERX ## SC_9083.TM|PRAD.6115590.0.Tumor.SM.B2XRS|SC_9080.TM ## PROS01448.6115235.Tumor.SM.67ERY|PRAD.06115124.Tumor.SM.7LGU1|MO_1184.Tumor ## MO_1192.Tumor|PROS01448.6115247.Tumor.SM.67ES2|SC_9043.Tumor|PM14.TM] ## ## Number of samples: 35 ## Filtered exp matrix: (1149,35) ## Class Per Counts: (20,15) # NOTE: The reason why we are working with paths instead of files is that # rewiring is allowed to compare more than 1 dataset, so it is easier # to provide a vector of paths than a complex structure of files. # This is, we can provide , for instance, a vector of 2 SummarizedExperiments # that will be compared. SEObject_p <- c(SEObject1,SEObject2).. # linker_output_p <- c(linker_output_p_1,linker_output_p_2)
In order to run
runrewiring(), we just call
runrewiring(prepared). It will create a folder on the specified output path with an html report containing the hierarchical clustering of the rewired modules, these in the form of a heatmap and a report containing statistical information about the performed test.
As shown in the heatmap, there a is a clear Super Module (SM). For a SM to exist
there are some requirements that have to be fulfilled. First, these modules should contain genes whose expression can be separated very well using the ARSI phenotype.
Second, they have shown similar behavior across bootstraps, with a score
given by its hypergeometrical test. Therefore, there is a SM and its formed by
In the figure above the Hierarchical Clustering applied shows how the SM we are talking about has been selected. The current implementation of Rewiring method pulls away the first SM it finds and generates all the graph objects and figures we are showing in the next section.
This tool provides a graphical way of detecting condition-dependent GRN deregulation on the selected rewired modules. Once we have selected a cluster of modules which behave similar across bootstraps, we can constraint them to form a single GRN using
NET_Run() and plot these GRN, filtering by samples we want to evaluate according to the phenotype condition.
We provide two ways of building the layout for the plot, depending on the choice of a t.test to be evaluated over the generated GRN, to sort target genes prior to plot the established relationship between these and drivers. On the one hand,
return_layout() generates a regular layout in which there is no t.test, and target genes are sorted randomly in a line. On the other hand,
return_layout_phenotype() performs a target gene level t.test which has as null hypothesis if samples separated by the selected condition are not deferentially expressed. From this analysis, the z-score is used to sort target genes and plot them describing a curve when using
# Assume we have run the rewiring method and the `NET_run()` method to generate the # igraph object. We are going to generate and plot both layouts for the example. # We are going to generate all the files we need except for the igraph object, which # is included as an example file in this package. # We load the igraph object that we generated from the `NET_run()` example. # Note: the igraph object is inside the list `NET_run()` generates. graph <- readRDS(paste0(system.file("extdata",package="TraRe"), '/VignetteFiles/graph_netrun_vignette.rds')) graph <- graph$graphs$VBSR # We first generate the normal layout for the plot. # We need the drivers and target names. lognorm_est_counts <- utils::read.delim(paste0(system.file("extdata",package="TraRe"), '/VignetteFiles/expression_rewiring_vignette.txt')) drivers <- lognorm_est_counts[seq(70),] targets <- lognorm_est_counts[70+seq(1079),] # Note that the generated graph may not have the same drivers and targets we used # for generating it, so we will extract those genes and check in the gene_info file # if they are drivers or targets. geneinfo <- utils::read.delim(paste0(system.file("extdata",package="TraRe"), '/geneinfo_rewiring_example.txt')) R<-intersect(geneinfo[geneinfo$regulator==1,1],names(igraph::V(graph))) P<-intersect(geneinfo[geneinfo$regulator==0,1],names(igraph::V(graph))) drivers_n <- rownames(drivers[R,]) targets_n <- rownames(targets[P,]) # As for this example we are working at gene level (we do not have transcripts inside genes), # we wont need namehash parameter (see param `namehash`) normal_layout <- return_layout(drivers_n,targets_n) # We need to separate our expression matrix by a binary phenotype. # This is what the clinical file is used for. gnames <- c(drivers_n,targets_n) expmat <-rbind(drivers,targets) clinic <- utils::read.delim(paste0(system.file("extdata",package="TraRe"), '/phenotype_rewiring_example.txt')) expmat_R <- expmat[,clinic$Sample.ID[clinic$Class=='R']] expmat_NR <- expmat[,clinic$Sample.ID[clinic$Class=='NR']] # We now generate the phenotype layout and the `varfile` we need for this layout. # (I leave here a way to generate) varfile <- t(as.matrix(sapply(gnames, function(x) c(stats::t.test(expmat_R[x,],expmat_NR[x,])$statistic, if(x%in%drivers_n) 1 else 0)))) colnames(varfile)<-c("t-stat","is-regulator") phenotype_layout <- return_layout_phenotype(drivers_n,targets_n,varfile) plot_igraph(graph,mytitle="Normal Layout",titlecol="black",mylayout=normal_layout)