Abstract
Introduction
Moonlight’s pipeline
Moonlight’s proposed workflow
Installation
- Citation
Download: Get TCGA data
- getDataTCGA: Search by cancer type and data type [Gene Expression]
- getDataTCGA: Search by cancer type and data type [Methylation]
Download: Get GEO data
- getDataGEO: Search by cancer type and data type [Gene Expression]
Analysis: To analyze TCGA data
TCGA Downstream Analysis: Case Studies
References

Abstract

In order to make light of cancer development, it is crucial to understand which genes play a role in the mechanisms linked to this disease and moreover which role that is. Commonly biological processes such as proliferation and apoptosis have been linked to cancer progression. Based on expression data we perform functional enrichment analysis, infer gene regulatory networks and upstream regulator analysis to score the importance of well-known biological processes with respect to the studied cancer. We then use these scores to predict two specific roles: genes that act as tumor suppressor genes (TSGs) and genes that act as oncogenes (OCGs). This methodology not only allows us to identify genes with dual role (TSG in one cancer type and OCG in another) but also to elucidate the underlying biological processes.

Introduction

Cancer development is influenced by mutations in two distinctly different categories of genes, known as tumor suppressor genes (TSG) and oncogenes (OCG). The occurrence of mutations in genes of the first category leads to faster cell proliferation while mutations in genes of second category increases or changes their function. We propose MoonlightR a new approach to define TSGs and OCGs based on functional enrichment analysis, infer gene regulatory networks and upstream regulator analysis to score the importance of well-known biological processes with respect to the studied cancer.

Moonlight’s pipeline

Moonlight’s pipeline is shown below:

Moonlight’s proposed workflow

The proposed pipeline consists of following eight steps:

getDataTCGA & getDataGEO for Data collection: expression levels of genes in all samples obtained with IlluminaHiSeq RNASeqV2 in 18 normal tissues (NT) and 18 cancer tissues (CT) according to TCGA criteria, and GEO data set matched to one of the 18 given TCGA cancer types as described in following Table TCGA / GEO.
DPA Differential Phenotype Analysis (DEA) to identify genes or probes that are different significantly with two phenotypes such as normal and tumor, or normal and stageI, normal and molecular subtype.
FEA Functional Enrichment Analysis (EA), using Fisher’s test, to identify gene sets (with biological functions linked to cancer¹) significantly enriched by RG.
GRN Gene regulatory network inferred between each single DEG (sDEG) and all genes by means of mutual information, obtaining for each DEG a list of regulated genes (RG).
URA Upstream Regulator Analysis for DEGs in each enriched gene set, we applied z-score being the ratio between the sum of all predicted effects for all the gene involved in the specific function and the square-root of the number of all genes.
PRA Pattern recognition analysis identifies candidate TCGs (down) and OCGs (up). We either use user defined biological processes or random forests.
We applied the above procedure to multiple cancer types to obtain cancer-specific lists of TCGs and OCGs. We compared the lists for each cancer: if a sDEG was TSG in a cancer and OCG in another we defined it as dual-role TSG-OCG. Otherwise if we found a sDEG defined as OCG or TSG only in one tissue we defined it tissue specific biomarker.
We use the COSMIC database to define a list of gold standard TSG and OCGs to assess the accuracy of the proposed method.

¹ For the devel version of MoonlightR we use a short extract of ten biological functions from QIAGEN’S Ingenuity Pathway Analysis (IPA). We are still working to integrate the package.

Installation

To install use the code below.

source("https://bioconductor.org/biocLite.R")
biocLite("MoonlightR")

Citation

Please cite TCGAbiolinks package:

“TCGAbiolinks: an R/Bioconductor package for integrative analysis of TCGA data.” Nucleic acids research (2015): gkv1507. (Colaprico, Antonio and Silva, Tiago C. and Olsen, Catharina and Garofano, Luciano and Cava, Claudia and Garolini, Davide and Sabedot, Thais S. and Malta, Tathiane M. and Pagnotta, Stefano M. and Castiglioni, Isabella and Ceccarelli, Michele and Bontempi, Gianluca and Noushmehr, Houtan 2016)

Related publications to this package:

“TCGA Workflow: Analyze cancer genomics and epigenomics data using Bioconductor packages”. F1000Research 10.12688/f1000research.8923.1 (Silva, TC and Colaprico, A and Olsen, C and D’Angelo, F and Bontempi, G and Ceccarelli, M and Noushmehr, H 2016)

`Download`: Get TCGA data

You can search TCGA data using the getDataTCGA function.

`getDataTCGA`: Search by cancer type and data type [Gene Expression]

The user can query and download the cancer types supported by TCGA, using the function getDataTCGA: In this example we used LUAD gene expression data with only 4 samples to reduce time downloading.

dataFilt <- getDataTCGA(cancerType = "LUAD", 
                          dataType = "Gene expression",
                          directory = "data",
                          nSample = 4)

`getDataTCGA`: Search by cancer type and data type [Methylation]

The user can also query and download methylation data using the function getDataTCGA:

dataFilt <- getDataTCGA(cancerType = "BRCA", 
                          dataType = "Methylation", 
                          directory = "data",nSample = 4)

`Download`: Get GEO data

You can search GEO data using the getDataGEO function.

GEO_TCGAtab: a 18x12 matrix that provides the GEO data set we matched to one of the 18 given TCGA cancer types

knitr::kable(GEO_TCGAtab, digits = 2, 
             caption = "Table with GEO data set matched to one 
             of the 18 given TCGA cancer types ",
             row.names = TRUE)

Table 1: Table with GEO data set matched to one of the 18 given TCGA cancer types
	Cancer	TP	NT	DEG.	Dataset	TP.1	NT.1	Platform	DEG..	Common	GEO_Normal	GEO_Tumor
1	BLCA	408	19	2937	GSE13507	165	10	GPL65000	2099	896	control	cancer
2	BRCA	1097	114	3390	GSE39004	61	47	GPL6244	2449	1248	normal	Tumor
3	CHOL	36	9	5015	GSE26566	104	59	GPL6104	3983	2587	Surrounding	Tumor
4	COAD	286	41	3788	GSE41657	25	12	GPL6480	3523	1367	N	A
5	ESCA	184	11	2525	GSE20347	17	17	GPL571	1316	406	normal	carcinoma
6	GBM	156	5	4828	GSE50161	34	13	GPL570	4504	2660	normal	GBM
7	HNSC	520	44	2973	GSE6631	22	22	GPL8300	142	129	normal	cancer
8	KICH	66	25	4355	GSE15641	6	23	GPL96	1789	680	normal	chromophobe
9	KIRC	533	72	3618	GSE15641	32	23	GPL96	2911	939	normal	clear cell RCC
10	KIRP	290	32	3748	GSE15641	11	23	GPL96	2020	756	normal	papillary RCC
11	LIHC	371	50	3043	GSE45267	46	41	GPL570	1583	860	normal liver	HCC sample
12	LUAD	515	59	3498	GSE10072	58	49	GPL96	666	555	normal	tumor
13	LUSC	503	51	4984	GSE33479	14	27	GPL6480	3729	1706	normal	squamous cell carcinoma
14	PRAD	497	52	1860	GSE6919	81	90	GPL8300	246	149	normal prostate	tumor samples
15	READ	94	10	3628	GSE20842	65	65	GPL4133	2172	1261	M	T
16	STAD	415	35	2622	GSE2685	10	10	GPL80	487	164	N	T
17	THCA	505	59	1994	GSE33630	60	45	GPL570	1451	781	N	T
18	UCEC	176	24	4183	GSE17025			GPL570			tp	lcm

`getDataGEO`: Search by cancer type and data type [Gene Expression]

The user can query and download the cancer types supported by GEO, using the function getDataGEO:

dataFilt <- getDataGEO(GEOobject = "GSE20347",platform = "GPL571")

dataFilt <- getDataGEO(TCGAtumor = "ESCA")

`Analysis`: To analyze TCGA data

`DPA`: Differential Phenotype Analysis

Differential phenotype analysis is able to identify genes or probes that are significantly different between two phenotypes such as normal vs. tumor, or normal vs. stageI, normal vs. molecular subtype.

For gene expression data, DPA is running a differential expression analysis (DEA) to identify differentially expressed genes (DEGs) using the TCGAanalyze_DEA function from .

For methylation data, DPA is running a differentially methylated regions analysis (DMR) to identify differentially methylated CpG sites using the TCGAanalyze_DMR the TCGAanalyze_DMR function from .

dataDEGs <- DPA(dataFilt = dataFilt,
                dataType = "Gene expression")

For gene expression data, DPA dealing with GEO data is running a differential expression analysis (DEA) to identify differentially expressed genes (DEGs) using to the eBayes and topTable functions from .

data(GEO_TCGAtab)
DataAnalysisGEO<- "../GEO_dataset/"
i<-5

cancer <- GEO_TCGAtab$Cancer[i]
cancerGEO <- GEO_TCGAtab$Dataset[i]
cancerPLT <-GEO_TCGAtab$Platform[i]
fileCancerGEO <- paste0(cancer,"_GEO_",cancerGEO,"_",cancerPLT, ".RData")

dataFilt <- getDataGEO(TCGAtumor = cancer)

GEOdegs <- DPA(dataConsortium = "GEO",
               gset = dataFilt ,
               colDescription = "title",
               samplesType  = c(GEO_TCGAtab$GEO_Normal[i],
                                GEO_TCGAtab$GEO_Tumor[i]),
               fdr.cut = 0.01,
               logFC.cut = 1,
               gsetFile = paste0(DataAnalysisGEO,fileCancerGEO))

We can visualize those differentially expressed genes (DEGs) with a volcano plot using the TCGAVisualize_volcano function from .

library(TCGAbiolinks)
TCGAVisualize_volcano(DEGsmatrix$logFC, DEGsmatrix$FDR,
                      filename = "DEGs_volcano.png",
                      x.cut = 7,
                      y.cut = 10^-5,
                      names = rownames(DEGsmatrix),
                      color = c("black","red","dodgerblue3"),
                      names.size = 2,
                      xlab = " Gene expression fold change (Log2)",
                      legend = "State",
                      title = "Volcano plot (Normal NT vs Tumor TP)",
                      width = 10)

The figure generated by the code above is shown below:

`LPA`: Literature Phenotype Analysis

The user can perform a literature phenotype analysis using the function LPA, for example for the biological process ‘apoptosis’.

data(DEGsmatrix)

BPselected <- "apoptosis"

BPannotations <- DiseaseList[[match(BPselected,names(DiseaseList))]]$ID
dataLPA <- LPA(dataDEGs = DEGsmatrix[1:5,],
               BP =  BPselected,
               BPlist = BPannotations)
DiseaseListNew <- dataLPA
names(DiseaseListNew) <- BPselected

`FEA`: Functional Enrichment Analysis

The user can perform a functional enrichment analysis using the function FEAcomplete. For each DEG in the gene set a z-score is calculated. This score indicates how the genes act in the gene set.

data(DEGsmatrix)
dataFEA <- FEA(DEGsmatrix = DEGsmatrix)

The output can be visualized with a FEA plot.

`FEAplot`: Functional Enrichment Analysis Plot

The user can plot the result of a functional enrichment analysis using the function plotFEA. A negative z-score indicates that the process’ activity is decreased. A positive z-score indicates that the process’ activity is increased.

plotFEA(dataFEA = dataFEA, additionalFilename = "_exampleVignette", height = 20, width = 10)

The figure generated by the above code is shown below:

`GRN`: Gene Regulatory Network

The user can perform a gene regulatory network analysis using the function GRN which infers the network using the parmigene package.

dataGRN <- GRN(TFs = rownames(DEGsmatrix)[1:100], normCounts = dataFilt,
                   nGenesPerm = 10,kNearest = 3,nBoot = 10)

`URA`: Upstream Regulator Analysis

The user can perform upstream regulator analysis using the function URA. This function is applied to each DEG in the enriched gene set and its neighbors in the GRN.

data(dataGRN)
data(DEGsmatrix)
dataURA <- URA(dataGRN = dataGRN,
               DEGsmatrix = DEGsmatrix,
               BPname = NULL, nCores=2)

`PRA`: Pattern Regognition Analysis

The user can retrieve TSG/OCG candidates using either selected biological processes or a random forest classifier trained on known COSMIC OCGs/TSGs.

data(dataURA)
dataDual <- PRA(dataURA = dataURA,
                          BPname = c("apoptosis","proliferation of cells"),
                          thres.role = 0)

`plotNetworkHive`: GRN hive visualization taking into account Cosmic cancer genes

In the following plot the nodes are separated into three groups: known tumor suppressor genes (yellow), known oncogenes (green) and the rest (gray).

data(knownDriverGenes)
data(dataGRN)
plotNetworkHive(dataGRN, knownDriverGenes, 0.55)

TCGA Downstream Analysis: Case Studies

Introduction

This vignette shows a complete workflow of the ‘MoonlightR’ package except for the data download. The code is divided into three case study:

1. Downstream analysis LUAD using RNA expression data
1. Expression pipeline Pan Cancer with five cancer types
1. Expression pipeline with stages I II III IV (BRCA)

Case study n. 1: Downstream analysis LUAD

dataDEGs <- DPA(dataFilt = dataFilt,
                dataType = "Gene expression")

dataFEA <- FEA(DEGsmatrix = dataDEGs)

dataGRN <- GRN(TFs = rownames(dataDEGs)[1:100], 
               DEGsmatrix = dataDEGs,
               DiffGenes = TRUE,
               normCounts = dataFilt)

dataURA <- URA(dataGRN = dataGRN, 
              DEGsmatrix = dataDEGs, 
              BPname = c("apoptosis",
                         "proliferation of cells"))

dataDual <- PRA(dataURA = dataURA, 
               BPname = c("apoptosis",
                          "proliferation of cells"),
               thres.role = 0)

CancerGenes <- list("TSG"=names(dataDual$TSG), "OCG"=names(dataDual$OCG))

`plotURA`: Upstream regulatory analysis plot

The user can plot the result of the upstream regulatory analysis using the function plotURA.

 plotURA(dataURA = dataURA[c(names(dataDual$TSG), names(dataDual$OCG)),, drop = FALSE], additionalFilename = "_exampleVignette")

The figure resulted from the code above is shown below:

Case study n. 2: Expression pipeline Pan Cancer 5 cancer types

cancerList <- c("BLCA","COAD","ESCA","HNSC","STAD")

listMoonlight <- moonlight(cancerType = cancerList, 
                      dataType = "Gene expression",
                      directory = "data",
                      nSample = 10,
                      nTF = 100,
                      DiffGenes = TRUE,
                      BPname = c("apoptosis","proliferation of cells"))
save(listMoonlight, file = paste0("listMoonlight_ncancer4.Rdata"))

`plotCircos`: Moonlight Circos Plot

The results of the moonlight pipeline can be visualized with a circos plot. Outer ring: color by cancer type, Inner ring: OCGs and TSGs, Inner connections: green: common OCGs yellow: common TSGs red: possible dual role

plotCircos(listMoonlight = listMoonlight, additionalFilename = "_ncancer5")

The figure generated by the code above is shown below:

Case study n. 3: Downstream analysis BRCA with stages

listMoonlight <- NULL
for (i in 1:4){
    dataDual <- moonlight(cancerType = "BRCA", 
                      dataType = "Gene expression",
                      directory = "data",
                      nSample = 10,
                      nTF = 5,
                      DiffGenes = TRUE,
                      BPname = c("apoptosis","proliferation of cells"),
                      stage = i)
    listMoonlight <- c(listMoonlight, list(dataDual))
    save(dataDual, file = paste0("dataDual_stage",as.roman(i), ".Rdata"))
}
names(listMoonlight) <- c("stage1", "stage2", "stage3", "stage4")

# Prepare mutation data for stages

mutation <- GDCquery_Maf(tumor = "BRCA")

res.mutation <- NULL
for(stage in 1:4){
  
  curStage <- paste0("Stage ", as.roman(stage))
                dataClin$tumor_stage <- toupper(dataClin$tumor_stage)
                dataClin$tumor_stage <- gsub("[ABCDEFGH]","",dataClin$tumor_stage)
                dataClin$tumor_stage <- gsub("ST","Stage",dataClin$tumor_stage)

                dataStg <- dataClin[dataClin$tumor_stage %in% curStage,]
                message(paste(curStage, "with", nrow(dataStg), "samples"))
dataSmTP <- mutation$Tumor_Sample_Barcode
                
                dataStgC <- dataSmTP[substr(dataSmTP,1,12) %in% dataStg$bcr_patient_barcode]
                dataSmTP <- dataStgC
  
                info.mutation <- mutation[mutation$Tumor_Sample_Barcode %in% dataSmTP,]
  
     ind <- which(info.mutation[,"Consequence"]=="inframe_deletion")
     ind2 <- which(info.mutation[,"Consequence"]=="inframe_insertion")
     ind3 <- which(info.mutation[,"Consequence"]=="missense_variant")
    res.mutation <- c(res.mutation, list(info.mutation[c(ind, ind2, ind3),c(1,51)]))
    }
names(res.mutation) <- c("stage1", "stage2", "stage3", "stage4")


tmp <- NULL
tmp <- c(tmp, list(listMoonlight[[1]][[1]]))
tmp <- c(tmp, list(listMoonlight[[2]][[1]]))
tmp <- c(tmp, list(listMoonlight[[3]][[1]]))
tmp <- c(tmp, list(listMoonlight[[4]][[1]]))
names(tmp) <- names(listMoonlight)

 mutation <- GDCquery_Maf(tumor = "BRCA")    

 plotCircos(listMoonlight=listMoonlight,listMutation=res.mutation, additionalFilename="proc2_wmutation", intensityColDual=0.2,fontSize = 2)

The figure generated by the code above is shown below:

Session Information ******

sessionInfo()

## R version 3.4.2 (2017-09-28)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 16.04.3 LTS
## 
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.6-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.6-bioc/R/lib/libRlapack.so
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] grid      parallel  stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
## [1] TCGAbiolinks_2.6.0 png_0.1-7          MoonlightR_1.4.0  
## [4] doParallel_1.0.11  iterators_1.0.8    foreach_1.4.3     
## [7] BiocStyle_2.6.0   
## 
## loaded via a namespace (and not attached):
##   [1] backports_1.1.1             circlize_0.4.1             
##   [3] fastmatch_1.1-0             aroma.light_3.8.0          
##   [5] plyr_1.8.4                  igraph_1.1.2               
##   [7] selectr_0.3-1               ConsensusClusterPlus_1.42.0
##   [9] lazyeval_0.2.1              splines_3.4.2              
##  [11] BiocParallel_1.12.0         GenomeInfoDb_1.14.0        
##  [13] ggplot2_2.2.1               digest_0.6.12              
##  [15] htmltools_0.3.6             GOSemSim_2.4.0             
##  [17] GO.db_3.4.2                 gdata_2.18.0               
##  [19] magrittr_1.5                memoise_1.1.0              
##  [21] cluster_2.0.6               limma_3.34.0               
##  [23] ComplexHeatmap_1.16.0       Biostrings_2.46.0          
##  [25] readr_1.1.1                 annotate_1.56.0            
##  [27] matrixStats_0.52.2          R.utils_2.5.0              
##  [29] prettyunits_1.0.2           jpeg_0.1-8                 
##  [31] colorspace_1.3-2            rvest_0.3.2                
##  [33] ggrepel_0.7.0               blob_1.1.0                 
##  [35] dplyr_0.7.4                 tcltk_3.4.2                
##  [37] RCurl_1.95-4.8              jsonlite_1.5               
##  [39] roxygen2_6.0.1              genefilter_1.60.0          
##  [41] GEOquery_2.46.0             bindr_0.1                  
##  [43] zoo_1.8-0                   survival_2.41-3            
##  [45] glue_1.2.0                  survminer_0.4.0            
##  [47] gtable_0.2.0                zlibbioc_1.24.0            
##  [49] XVector_0.18.0              GetoptLong_0.1.6           
##  [51] DelayedArray_0.4.0          shape_1.4.3                
##  [53] BiocGenerics_0.24.0         scales_0.5.0               
##  [55] DOSE_3.4.0                  HiveR_0.3.42               
##  [57] DESeq_1.30.0                edgeR_3.20.0               
##  [59] DBI_0.7                     ggthemes_3.4.0             
##  [61] Rcpp_0.12.13                cmprsk_2.2-7               
##  [63] xtable_1.8-2                progress_1.1.2             
##  [65] foreign_0.8-69              matlab_1.0.2               
##  [67] bit_1.1-12                  km.ci_0.5-2                
##  [69] stats4_3.4.2                htmlwidgets_0.9            
##  [71] httr_1.3.1                  fgsea_1.4.0                
##  [73] gplots_3.0.1                RColorBrewer_1.1-2         
##  [75] pkgconfig_2.0.1             XML_3.98-1.9               
##  [77] R.methodsS3_1.7.1           RISmed_2.1.7               
##  [79] locfit_1.5-9.1              labeling_0.3               
##  [81] rlang_0.1.2                 reshape2_1.4.2             
##  [83] AnnotationDbi_1.40.0        munsell_0.4.3              
##  [85] tools_3.4.2                 downloader_0.4             
##  [87] RSQLite_2.0                 broom_0.4.2                
##  [89] tkrgl_0.7                   devtools_1.13.3            
##  [91] evaluate_0.10.1             stringr_1.2.0              
##  [93] yaml_2.1.14                 knitr_1.17                 
##  [95] bit64_0.9-7                 survMisc_0.5.4             
##  [97] rgl_0.98.1                  caTools_1.17.1             
##  [99] purrr_0.2.4                 randomForest_4.6-12        
## [101] bindrcpp_0.2                nlme_3.1-131               
## [103] EDASeq_2.12.0               mime_0.5                   
## [105] R.oo_1.21.0                 DO.db_2.9                  
## [107] xml2_1.1.1                  biomaRt_2.34.0             
## [109] compiler_3.4.2              tibble_1.3.4               
## [111] geneplotter_1.56.0          stringi_1.1.5              
## [113] highr_0.6                   GenomicFeatures_1.30.0     
## [115] lattice_0.20-35             Matrix_1.2-11              
## [117] psych_1.7.8                 commonmark_1.4             
## [119] KMsurv_0.1-5                GlobalOptions_0.0.12       
## [121] data.table_1.10.4-3         bitops_1.0-6               
## [123] parmigene_1.0.2             rtracklayer_1.38.0         
## [125] httpuv_1.3.5                GenomicRanges_1.30.0       
## [127] qvalue_2.10.0               latticeExtra_0.6-28        
## [129] hwriter_1.3.2               R6_2.2.2                   
## [131] ShortRead_1.36.0            bookdown_0.5               
## [133] RMySQL_0.10.13              KernSmooth_2.23-15         
## [135] gridExtra_2.3               IRanges_2.12.0             
## [137] codetools_0.2-15            gtools_3.5.0               
## [139] assertthat_0.2.0            SummarizedExperiment_1.8.0 
## [141] rprojroot_1.2               rjson_0.2.15               
## [143] withr_2.0.0                 mnormt_1.5-5               
## [145] GenomicAlignments_1.14.0    Rsamtools_1.30.0           
## [147] S4Vectors_0.16.0            GenomeInfoDbData_0.99.1    
## [149] hms_0.3                     clusterProfiler_3.6.0      
## [151] tidyr_0.7.2                 rmarkdown_1.6              
## [153] rvcheck_0.0.9               ggpubr_0.1.5               
## [155] Biobase_2.38.0              shiny_1.0.5

References

Colaprico, Antonio and Silva, Tiago C. and Olsen, Catharina and Garofano, Luciano and Cava, Claudia and Garolini, Davide and Sabedot, Thais S. and Malta, Tathiane M. and Pagnotta, Stefano M. and Castiglioni, Isabella and Ceccarelli, Michele and Bontempi, Gianluca and Noushmehr, Houtan. 2016. “TCGAbiolinks: An R/Bioconductor Package for Integrative Analysis of TCGA Data.” doi:10.1093/nar/gkv1507.

Silva, TC and Colaprico, A and Olsen, C and D’Angelo, F and Bontempi, G and Ceccarelli, M and Noushmehr, H. 2016. “TCGA Workflow: Analyze Cancer Genomics and Epigenomics Data Using Bioconductor Packages.” doi:10.12688/f1000research.8923.1.

Moonlight: an approach to identify multiple role of biomarkers as oncogene or tumorsuppressor in different tumor types and stages.

⁺ These authors contributed equally to the paper as first authors.

2017-10-30

Contents

Abstract

Introduction

Moonlight’s pipeline

Moonlight’s proposed workflow

Installation

Citation

`Download`: Get TCGA data

`getDataTCGA`: Search by cancer type and data type [Gene Expression]

`getDataTCGA`: Search by cancer type and data type [Methylation]

`Download`: Get GEO data

`getDataGEO`: Search by cancer type and data type [Gene Expression]

`Analysis`: To analyze TCGA data

`DPA`: Differential Phenotype Analysis

`LPA`: Literature Phenotype Analysis

`FEA`: Functional Enrichment Analysis

`FEAplot`: Functional Enrichment Analysis Plot

`GRN`: Gene Regulatory Network

`URA`: Upstream Regulator Analysis

`PRA`: Pattern Regognition Analysis

`plotNetworkHive`: GRN hive visualization taking into account Cosmic cancer genes

TCGA Downstream Analysis: Case Studies

Introduction

Case study n. 1: Downstream analysis LUAD

`plotURA`: Upstream regulatory analysis plot

Case study n. 2: Expression pipeline Pan Cancer 5 cancer types

`plotCircos`: Moonlight Circos Plot

Case study n. 3: Downstream analysis BRCA with stages

References

Moonlight: an approach to identify multiple role of biomarkers as oncogene or tumorsuppressor in different tumor types and stages.

+ These authors contributed equally to the paper as first authors.

2017-10-30

Contents

Abstract

Introduction

Moonlight’s pipeline

Moonlight’s proposed workflow

Installation

Citation

Download: Get TCGA data

getDataTCGA: Search by cancer type and data type [Gene Expression]

getDataTCGA: Search by cancer type and data type [Methylation]

Download: Get GEO data

getDataGEO: Search by cancer type and data type [Gene Expression]

Analysis: To analyze TCGA data

DPA: Differential Phenotype Analysis

LPA: Literature Phenotype Analysis

FEA: Functional Enrichment Analysis

FEAplot: Functional Enrichment Analysis Plot

GRN: Gene Regulatory Network

URA: Upstream Regulator Analysis

PRA: Pattern Regognition Analysis

plotNetworkHive: GRN hive visualization taking into account Cosmic cancer genes

TCGA Downstream Analysis: Case Studies

Introduction

Case study n. 1: Downstream analysis LUAD

plotURA: Upstream regulatory analysis plot

Case study n. 2: Expression pipeline Pan Cancer 5 cancer types

plotCircos: Moonlight Circos Plot

Case study n. 3: Downstream analysis BRCA with stages

References

⁺ These authors contributed equally to the paper as first authors.

`Download`: Get TCGA data

`getDataTCGA`: Search by cancer type and data type [Gene Expression]

`getDataTCGA`: Search by cancer type and data type [Methylation]

`Download`: Get GEO data

`getDataGEO`: Search by cancer type and data type [Gene Expression]

`Analysis`: To analyze TCGA data

`DPA`: Differential Phenotype Analysis

`LPA`: Literature Phenotype Analysis

`FEA`: Functional Enrichment Analysis

`FEAplot`: Functional Enrichment Analysis Plot

`GRN`: Gene Regulatory Network

`URA`: Upstream Regulator Analysis

`PRA`: Pattern Regognition Analysis

`plotNetworkHive`: GRN hive visualization taking into account Cosmic cancer genes

`plotURA`: Upstream regulatory analysis plot

`plotCircos`: Moonlight Circos Plot