1 Installation

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("curatedTCGAData")

Load packages:

library(curatedTCGAData)
library(MultiAssayExperiment)

2 Downloading datasets

Check the available cancer codes and assays in TCGA data:

curatedTCGAData(diseaseCode = "*", assays = "*", dry.run = TRUE)
## Please see the list below for available cohorts and assays
## Available Cancer codes:
##  ACC BLCA BRCA CESC CHOL COAD DLBC ESCA GBM HNSC KICH
##  KIRC KIRP LAML LGG LIHC LUAD LUSC MESO OV PAAD PCPG
##  PRAD READ SARC SKCM STAD TGCT THCA THYM UCEC UCS UVM 
## Available Data Types:
##  CNACGH CNASNP CNASeq CNVSNP GISTICA GISTICT
##  Methylation Mutation RNASeq2GeneNorm
##  RNASeqGene RPPAArray mRNAArray miRNAArray
##  miRNASeqGene
## NULL

Check which files would be downloaded for a given cohort and assay pattern:

curatedTCGAData(diseaseCode = "COAD", assays = "RPPA*", dry.run = TRUE)
##                COAD_RPPAArray 
## "COAD_RPPAArray-20160128.rda"
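
The `assays` argument above uses a wildcard (`"RPPA*"`). As a rough illustration of how such a pattern selects from the listed data types, the sketch below applies shell-style glob matching with base R. It assumes glob semantics, as the `RPPA*` example suggests; `match_assays()` is a hypothetical helper written for this illustration and is not part of curatedTCGAData, whose own matching code is not shown here.

```r
# Assay names copied from the dry-run listing above
assay_names <- c("CNACGH", "CNASNP", "CNASeq", "CNVSNP", "GISTICA", "GISTICT",
                 "Methylation", "Mutation", "RNASeq2GeneNorm", "RNASeqGene",
                 "RPPAArray", "mRNAArray", "miRNAArray", "miRNASeqGene")

# Hypothetical helper: translate a glob pattern into a regular expression
# with utils::glob2rx() and return the assay names that match it
match_assays <- function(pattern, choices = assay_names) {
    grep(utils::glob2rx(pattern), choices, value = TRUE)
}

match_assays("RPPA*")
## [1] "RPPAArray"
match_assays("RNASeq*")
## [1] "RNASeq2GeneNorm" "RNASeqGene"
```

Trying a pattern this way, or with `dry.run = TRUE` as shown above, is a cheap check before starting a download.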

2.1 GBM dataset example

gbm <- curatedTCGAData("GBM", "RPPA*", dry.run = FALSE)
gbm
## A MultiAssayExperiment object of 1 listed
##  experiment with a user-defined name and respective class. 
##  Containing an ExperimentList class object of length 1: 
##  [1] GBM_RPPAArray-20160128: SummarizedExperiment with 208 rows and 244 columns 
## Features: 
##  experiments() - obtain the ExperimentList instance 
##  colData() - the primary/phenotype DataFrame 
##  sampleMap() - the sample availability DataFrame 
##  `$`, `[`, `[[` - extract colData columns, subset, or experiment 
##  *Format() - convert into a long or wide DataFrame 
##  assays() - convert ExperimentList to a SimpleList of matrices

Note. For more on how to use a MultiAssayExperiment object, please see the MultiAssayExperiment vignette.
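
The accessors listed in the printed summary can be tried directly on the downloaded object. The following is a minimal sketch that repeats the GBM download from above and then calls a few of them. It assumes network access, and newer releases of the package may also expect a `version` argument (see `?curatedTCGAData`). The variable names `rppa` and `rppa_mat` are chosen here for illustration only.

```r
library(curatedTCGAData)
library(MultiAssayExperiment)

# Same request as in the example above (data are downloaded on first use)
gbm <- curatedTCGAData("GBM", "RPPA*", dry.run = FALSE)

# List the experiments held by the object
experiments(gbm)

# Patient-level (clinical) data
dim(colData(gbm))

# Table linking assay columns to patients
head(sampleMap(gbm))

# Extract the single RPPA experiment as a SummarizedExperiment
rppa <- gbm[[1L]]
dim(rppa)

# Pull out the underlying assay values and look at one corner
rppa_mat <- assay(rppa)
rppa_mat[1:3, 1:3]
```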

2.1.1 Subtype information

Some cancer datasets include subtype information in the clinical data provided. In the MultiAssayExperiment object, this information is stored in the metadata of the colData. To see the subtype variable names, call metadata() on the colData of the object and extract the "subtypes" element:

head(metadata(colData(gbm))[["subtypes"]])
##         GBM_annotations           GBM_subtype
## 1            Patient_ID                  Case
## 2  methylation_subtypes  MGMT promoter status
## 3     mutation_subtypes     IDH/codel subtype
## 4 histological_subtypes             Histology
## 5         mrna_subtypes      Original Subtype
## 6         mrna_subtypes Transcriptome Subtype
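
One way to work with this table is to group the subtype variable names by annotation category. The sketch below does this with base R on a small data frame re-typed from the six rows printed above, so it runs without the download. With the real object, `subtype_map` would instead be `metadata(colData(gbm))[["subtypes"]]`. The names `subtype_map` and `by_annotation` are illustrative.

```r
# Rows re-typed from the printed output above, for illustration only
subtype_map <- data.frame(
    GBM_annotations = c("Patient_ID", "methylation_subtypes",
                        "mutation_subtypes", "histological_subtypes",
                        "mrna_subtypes", "mrna_subtypes"),
    GBM_subtype = c("Case", "MGMT promoter status", "IDH/codel subtype",
                    "Histology", "Original Subtype", "Transcriptome Subtype"),
    stringsAsFactors = FALSE
)

# Group the subtype variable names by annotation category
by_annotation <- split(subtype_map$GBM_subtype, subtype_map$GBM_annotations)

# Subtype variables recorded under the mRNA annotation
by_annotation[["mrna_subtypes"]]
## [1] "Original Subtype"      "Transcriptome Subtype"

# Annotation categories that have more than one subtype variable
names(by_annotation)[lengths(by_annotation) > 1L]
## [1] "mrna_subtypes"
```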