Contents

1 Overview

Recurrent patterns in biological networks have been termed network motifs (Bracken, Scott, and Goodall 2016), which may reflect critical roles in multiple biological processes; for example, regulatory loops between transcription factors and microRNAs (Zhang et al. 2015). RTNduals searches for targets shared between pairs of regulators, using regulatory networks generated by the RTN package (for details, please refer to the RTN documentation) (Castro et al. 2016). In such a network, each regulator has an associated set of gene targets (i.e. a regulon), and when we assess the shared targets in the regulons of a pair of regulators, we find that each shared target may be regulated in a positive or negative direction by a given regulator (i.e. pairs of regulators can either agree or disagree on the predicted downstream effects for a shared target gene). ‘Dual regulons’ represent regulon pairs whose common targets are likely to be affected by both regulators. The inference of ‘dual regulons’ requires three complementary statistics: (1) Targets are assigned to regulons based on MI between the regulator and the target. The significance of the MI statistics is assessed by permutation and bootstrap analysis. The association between pairs of regulators is also identified in this step, since regulators can target each other. (2) Shared targets between any two regulons are identified and the similarity in regulation (i.e. positive or negative direction) is assessed by correlation analysis. Single network motifs are identified in this step consisting of two regulators and one common target. (3) A test is carried out to determine if the correlation between the set of network motifs of any two regulons is higher than would be expected by chance. The schematics in Figure 1 show examples of ‘dual regulons’ with two network motifs each. In (a) the two regulators agree on the downstream effects (i.e. same directions), while in (b) they disagree (i.e. opposite directions). Our method can be applied to any regulatory relationship. For gene expression data sets typical regulators might include transcription factors, miRNAs, eRNAs and lncRNAs.

title Figure 1 - Examples of regulators and predicted associations. This figure illustrates two ‘dual regulons’ and the respective downstream effects. (a) Regulators R1 and R2 agree in both activation (open circle) and repression (filled circle). (b) Regulators R1 and R2 disagree.

2 Quick Start

The RTNduals workflow starts with a preprocessing step that generates an MBR-class (Motifs Between Regulons) object from an expression matrix and two lists of regulators. The expression matrix is typically obtained from multiple samples (e.g. transcriptomes from a cancer cohort), while the lists of regulators represent some prior biological information indicating which genes in the expression matrix should be regarded as regulators. The input data can also deal with two classes of regulators; for example, genes and microRNAs. In this case, the expression matrix should comprise mRNA and miRNA expression values. Alternatively, the MBR-class object can be obtained by combining two TNI-class objects pre-computed in the RTN package.

2.1 Load datasets

This example provides the data required to generate an MBR-class object. The dataset dt4rtn is available from the RTN package and consists of an R list with 6 objects, 3 of which will be used in the subsequent analysis: (1) gexp, a named gene expression matrix with 250 samples (genes in rows, samples in cols), (2) gexpIDs, a data.frame with Probe-to-ENTREZ annotation, and (3) tfs, a named vector listing transcription factors (in this case, 149 TFs). While these datasets were extracted, pre-processed and size-reduced from Fletcher et al. (2013) and Curtis et al. (2012), they should be regarded as examples for demonstration purposes only.

##--- load package and prepare a dataset for demonstration
library(RTNduals)
data("dt4rtn", package = "RTN")
gexp <- dt4rtn$gexp
annot <- dt4rtn$gexpIDs
tfs1 <- dt4rtn$tfs[c("IRF8","IRF1","PRDM1","AFF3","E2F3")]
tfs2 <- dt4rtn$tfs[c("HCLS1","STAT4","STAT1","LMO4","ZNF552")]

2.2 Preprocessing

The gexp data matrix and the corresponding annotation are evaluated by the mbrPreprocess function in order to check the consistency of the input data. After this step it is generated a pre-processed MBR-class object whose status is updated to ‘Preprocess [x]’.

##--- generate a pre-processed BR-class object
rmbr <- mbrPreprocess(gexp=gexp, regulatoryElements1=tfs1, regulatoryElements2=tfs2, gexpIDs=annot, verbose=FALSE)
rmbr
## an MBR (Motifs Between Regulons) object:
## --status:
##  Preprocess Permutation   Bootstrap  DPI.filter Association 
##         [x]         [ ]         [ ]         [ ]         [ ]

2.3 Run permutation analysis

The mbrPermutation method inherits the same algorithm implemented in the RTN package. This function takes the pre-processed MBR-class object and returns two regulatory networks that are inferred by mutual information analysis (with multiple hypothesis testing corrections). The results are included in the ‘TNI’ slots, which will be used in the subsequent steps of the pipeline.

##--- compute two regulatory networks
##--- (set nPermutations>=1000)
rmbr <- mbrPermutation(rmbr, nPermutations=100, verbose=FALSE)
rmbr
## an MBR (Motifs Between Regulons) object:
## --status:
##  Preprocess Permutation   Bootstrap  DPI.filter Association 
##         [x]         [x]         [ ]         [ ]         [ ]

2.4 Run bootstrap analysis

In additional to the permutation analysis, the stability of the regulatory networks is assessed by bootstrapping using the mbrBootstrap function, which also inherits the same algorithm from the RTN package. Each ‘TNI’ slot of the MBR-class object is updated with a consensus bootstrap network.

##--- check the stability of the two regulatory networks
##--- (set nBootstrap>=100)
rmbr <- mbrBootstrap(rmbr, nBootstrap=10, verbose=FALSE)
rmbr
## an MBR (Motifs Between Regulons) object:
## --status:
##  Preprocess Permutation   Bootstrap  DPI.filter Association 
##         [x]         [x]         [x]         [ ]         [ ]

2.5 Run DPI filter

In a given regulatory network each target can be linked to multiple regulators as a result of both direct and indirect interactions. The Data Processing Inequality (DPI) algorithm (P. Meyer, Lafitte, and Bontempi 2008) is used to remove the weakest interaction between two regulators and a common target. This step inherits the algorithm that is implemented in the RTN package, and it is optional for the analyses described in this workflow (the results of this step can be used to assess regulon activity in the RTN package).

##---apply DPI algorithm
rmbr <- mbrDpiFilter(rmbr, eps=0.05, verbose=FALSE)
rmbr
## an MBR (Motifs Between Regulons) object:
## --status:
##  Preprocess Permutation   Bootstrap  DPI.filter Association 
##         [x]         [x]         [x]         [x]         [ ]

2.6 Run association analysis between regulons

The mbrAssociation method takes the two transcriptional networks computed in the previous steps and enumerates all motifs between all regulons. The method retrieves the mutual information between regulators and assesses the agreement between the predicted downstream effects using correlation analysis. A hypergeometric test is used to evaluate whether the common targets are potentially affected by both regulators in a number greater than expected by chance. Note that this example warns the user that only a few regulon pairs are being tested. This warning is to recall that the search space should ideally represent all possible combinations of a given class of regulators (for example, all nuclear receptors annotated for a given species).

##--- run the main RTNduals methods
rmbr <- mbrAssociation(rmbr, prob=0.75, verbose=FALSE)
## Warning: Only 25 regulon pair(s) is(are) being tested!
## Ideally, the search space should represent all possible
## combinations of a given class of regulators! For example,
## all nuclear receptors annotated for a given species.

A summary of the results can be accessed from the ‘summary’ slot using the mbrGet function.

##--- check summary
mbrGet(rmbr, what="summary")
## $MBR
## $MBR$Duals
##       testedDuals inferredDuals
## duals          25             7
## 
## 
## $TNIs
## $TNIs$TNI1
##          RE Targets Edges
## tnet.ref  5    2394  5330
## tnet.dpi  5    2394  4687
## 
## $TNIs$TNI2
##          RE Targets Edges
## tnet.ref  5    2137  5225
## tnet.dpi  5    2137  4469

2.7 Rank ‘dual regulons’

The mbrDuals method ranks all candidates using the correlation values computed in the mbrAssociation step, and returns the list of ‘dual regulons’.

##--- run 'mbrDuals' and get results
rmbr <- mbrDuals(rmbr)
## -Sorting by the R value...
results <- mbrGet(rmbr, what="dualsInformation")

Also, when prior evidences are available this method can add a ‘supplementaryTable’ regarding the association between regulators. The ‘supplementaryTable’ is a ‘data.frame’ listing unique relationships between any two regulators used to compute the ‘dual regulons’ (please refer to the documentation for details on the input data format).

##--- here we build a 'toy' evidence table using the 'rnorm' function
supplementaryTable <- results[ ,c("Regulon1","Regulon2")]
supplementaryTable$ToyEvidence <- rnorm(nrow(results))
supplementaryTable
##             Regulon1 Regulon2 ToyEvidence
## IRF8~HCLS1      IRF8    HCLS1 -0.24545739
## IRF8~STAT4      IRF8    STAT4 -1.42379017
## IRF1~HCLS1      IRF1    HCLS1  0.13421269
## IRF1~STAT1      IRF1    STAT1 -1.58222863
## IRF1~STAT4      IRF1    STAT4  1.53651854
## PRDM1~HCLS1    PRDM1    HCLS1  0.02745146
## PRDM1~STAT4    PRDM1    STAT4 -1.57271790
##--- add supplementary evidences with the 'mbrDuals' function
rmbr <- mbrDuals(rmbr, supplementaryTable = supplementaryTable, 
                  evidenceColname = "ToyEvidence", verbose = FALSE)
##--- check updated results
mbrGet(rmbr, what="dualsInformation")
##             Regulon1 Size.Regulon1 Regulon2 Size.Regulon2
## IRF8~HCLS1      IRF8          1005    HCLS1           956
## IRF8~STAT4      IRF8          1005    STAT4          1053
## IRF1~HCLS1      IRF1           719    HCLS1           956
## IRF1~STAT1      IRF1           719    STAT1          1002
## IRF1~STAT4      IRF1           719    STAT4          1053
## PRDM1~HCLS1    PRDM1          1072    HCLS1           956
## PRDM1~STAT4    PRDM1          1072    STAT4          1053
##             Jaccard.coefficient Hypergeometric.Pvalue
## IRF8~HCLS1            0.7262324                     0
## IRF8~STAT4            0.6516854                     0
## IRF1~HCLS1            0.5552461                     0
## IRF1~STAT1            0.5421147                     0
## IRF1~STAT4            0.5302245                     0
## PRDM1~HCLS1           0.5528331                     0
## PRDM1~STAT4           0.5276779                     0
##             Hypergeometric.Adjusted.Pvalue        MI MI.Adjusted.Pvalue
## IRF8~HCLS1                               0 0.8481564              <0.01
## IRF8~STAT4                               0 0.6460023              <0.01
## IRF1~HCLS1                               0 0.3498595              <0.01
## IRF1~STAT1                               0 0.3989572              <0.01
## IRF1~STAT4                               0 0.3426818              <0.01
## PRDM1~HCLS1                              0 0.2805251              <0.01
## PRDM1~STAT4                              0 0.2243653              <0.01
##                     R Quantile ToyEvidence
## IRF8~HCLS1  0.8825243     1.00 -0.24545739
## IRF8~STAT4  0.8422128     0.96 -1.42379017
## IRF1~HCLS1  0.7716681     0.92  0.13421269
## IRF1~STAT1  0.7688522     0.88 -1.58222863
## IRF1~STAT4  0.7609390     0.84  1.53651854
## PRDM1~HCLS1 0.7593585     0.80  0.02745146
## PRDM1~STAT4 0.7472304     0.76 -1.57271790

2.8 Visualize shared cloud of targets

The package provides the mbrPlotDuals function to visualize ‘dual regulon’ and the respective motifs, which represent the shared cloud of targets. Figure 2 shows two examples of ‘dual regulons’ for the schematics illustrated in Figure 1. In (a) the two regulators agree on the downstream effect (i.e. same directions), while in (b) they disagree (i.e. opposite directions).

duals <- mbrGet(rmbr, what="dualRegulons")
mbrPlotDuals(rmbr, names.duals = duals[1])

title Figure 2. The ‘mbrPlotDuals’ function shows the shared cloud of targets. In (a) the regulators agree on the downstream effects, while in (b) they disagree. The colour pattern follows the schematics in Figure 1.

3 Session Info

sessionInfo()
## R version 3.4.0 (2017-04-21)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 16.04.2 LTS
## 
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.5-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.5-bioc/R/lib/libRlapack.so
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] RTNduals_1.0.3  RTN_1.14.0      BiocStyle_2.4.0
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_0.12.10        compiler_3.4.0      nloptr_1.0.4       
##  [4] tools_3.4.0         minet_3.34.0        digest_0.6.12      
##  [7] lme4_1.1-13         evaluate_0.10       nlme_3.1-131       
## [10] lattice_0.20-35     mgcv_1.8-17         Matrix_1.2-10      
## [13] igraph_1.0.1        yaml_2.1.14         parallel_3.4.0     
## [16] SparseM_1.77        stringr_1.2.0       knitr_1.15.1       
## [19] MatrixModels_0.4-1  S4Vectors_0.14.0    IRanges_2.10.0     
## [22] stats4_3.4.0        rprojroot_1.2       nnet_7.3-12        
## [25] grid_3.4.0          data.table_1.10.4   snow_0.4-2         
## [28] rmarkdown_1.5       limma_3.32.2        minqa_1.2.4        
## [31] RedeR_1.24.1        car_2.1-4           magrittr_1.5       
## [34] backports_1.0.5     htmltools_0.3.6     BiocGenerics_0.22.0
## [37] MASS_7.3-47         splines_3.4.0       pbkrtest_0.4-7     
## [40] quantreg_5.33       stringi_1.1.5

References

Bracken, Cameron P., Hamish S. Scott, and Gregory J. Goodall. 2016. “A Network-Biology Perspective of MicroRNA Function and Dysfunction in Cancer.” Nat Rev Genet 17 (12). Nature Publishing Group, a division of Macmillan Publishers Limited. All Rights Reserved.: 719–32. http://dx.doi.org/10.1038/nrg.2016.134.

Castro, Mauro, Ines de Santiago, Thomas Campbell, Courtney Vaughn, Theresa Hickey, Edith Ross, Wayne Tilley, Florian Markowetz, Bruce Ponder, and Kerstin Meyer. 2016. “Regulators of Genetic Risk of Breast Cancer Identified by Integrative Network Analysis.” Nature Genetics 48 (1): 12–21. doi:10.1038/ng.3458.

Curtis, Christina, Sohrab P. Shah, Suet-Feung Chin, Gulisa Turashvili, Oscar M. Rueda, Mark J. Dunning, Doug Speed, et al. 2012. “The Genomic and Transcriptomic Architecture of 2,000 Breast Tumours Reveals Novel Subgroups.” Nature 486: 346–52. doi:10.1038/nature10983.

Fletcher, Michael, Mauro Castro, Suet-Feung Chin, Oscar Rueda, Xin Wang, Carlos Caldas, Bruce Ponder, Florian Markowetz, and Kerstin Meyer. 2013. “Master Regulators of FGFR2 Signalling and Breast Cancer Risk.” Nature Communications 4: 2464. doi:10.1038/ncomms3464.

Meyer, Patrick, Frederic Lafitte, and Gianluca Bontempi. 2008. “Minet: A R/Bioconductor Package for Inferring Large Transcriptional Networks Using Mutual Information.” BMC Bioinformatics 9 (1): 461. doi:10.1186/1471-2105-9-461.

Zhang, Hong-Mei, Shuzhen Kuang, Xushen Xiong, Tianliuyun Gao, Chenglin Liu, and An-Yuan Guo. 2015. “Transcription Factor and MicroRNA Co-Regulatory Loops: Important Regulatory Motifs in Biological Processes and Diseases.” Briefings in Bioinformatics 16 (1): 45–58. doi:10.1093/bib/bbt085.