1 Introduction

CSV files from the Ivy-GAP project have been assembled into a SummarizedExperiment instance.

## class: SummarizedExperiment 
## dim: 25873 270 
## metadata(5): README URL builder tumorDetails subBlockDetails
## assays(1): fpkm
## rownames(25873): A1BG A2M ... PP12719 LOC100653024
## rowData names(5): gene_id chromosome gene_entrez_id gene_symbol
##   gene_name
## colnames(270): 305273026 305405294 ... 305273038 306124458
## colData names(28): tumor_id tumor_name ... bam_download_link
##   bai_download_link

The object carries five metadata elements: the README.txt (run cat(metadata(ivySE)$README, sep="\n") in R to view it), the URL from which the data were retrieved, a character vector (builder) holding the R code used to create (much of) the SummarizedExperiment, and two tables of tumor-specific and block-specific information (tumorDetails and subBlockDetails).
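
As a rough illustration of how these metadata elements might be inspected, the sketch below loads the object and lists its metadata names. It assumes that the ivygapSE and SummarizedExperiment packages are installed and that the object is available as a dataset named ivySE; the loading call is an assumption, not taken from the text above. The element names in expected_names are copied from the printed summary.

```r
# Metadata element names as reported in the printed object summary.
expected_names <- c("README", "URL", "builder", "tumorDetails", "subBlockDetails")

# Assumption: both Bioconductor packages are installed; otherwise nothing is run.
if (requireNamespace("ivygapSE", quietly = TRUE) &&
    requireNamespace("SummarizedExperiment", quietly = TRUE)) {
  library(SummarizedExperiment)
  # Assumption: the SummarizedExperiment ships as a dataset called "ivySE".
  data("ivySE", package = "ivygapSE")

  md <- metadata(ivySE)
  print(names(md))                     # which metadata elements are present
  cat(head(md$README), sep = "\n")     # first lines of the README text
  str(md$tumorDetails, max.level = 1)  # structure of the tumor-level table
}
```

The guard around the calls simply keeps the sketch from failing on a machine where the packages are absent.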

2 Background on the ivyGlimpse app

The ivyGlimpse app is a rapid prototype of a browser-based interface to salient features of the data. The most current code is maintained in the Bioconductor ivygapSE package, but a public version of the app may be visited at shinyapps.io.

The ivygapSE package will evolve, based in part on associations observed through the use of this app. Briefly, the main visualization of the app is a scatterplot of user-selected tumor image features. All contributions, which are based on tumor sub-blocks (whose multiplicities vary per tumor block and donor), are pooled without regard for source; interactive aspects of the display allow the user to see which donor contributes each point.

Strata can be formed interactively by brushing over the scatterplot; after the brushing event, the survival times of donors contributing selected points are compared with those of donors whose contributions all lie outside the selection. Expression data are also stratified in this way, and gene-specific boxplot sets (for user-specified gene sets) are produced for each stratum.
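
The donor-level stratification described above can be sketched in base R with toy data. Everything here is hypothetical: the data frame pts, the brush rectangle, and the survival times are invented for illustration, and the xmin/xmax/ymin/ymax fields are only assumed to resemble what a brushing event reports. This is not the app's actual code.

```r
# Toy data: each row is one sub-block contribution (all values hypothetical).
pts <- data.frame(
  donor = c("d1", "d1", "d2", "d2", "d3", "d4"),
  x     = c(0.10, 0.80, 0.20, 0.30, 0.90, 0.85),
  y     = c(0.20, 0.70, 0.10, 0.40, 0.95, 0.60),
  stringsAsFactors = FALSE
)

# Hypothetical brush rectangle, as a brushing event might report it.
brush <- list(xmin = 0.7, xmax = 1.0, ymin = 0.5, ymax = 1.0)

# Flag the points that fall inside the brushed region.
in_brush <- pts$x >= brush$xmin & pts$x <= brush$xmax &
  pts$y >= brush$ymin & pts$y <= brush$ymax

# A donor is "selected" if any of its points is brushed; the comparison
# group holds donors all of whose points lie outside the brush.
selected_donors <- unique(pts$donor[in_brush])
outside_donors  <- setdiff(unique(pts$donor), selected_donors)

# Toy survival times in days (hypothetical), split by stratum.
surv_days <- c(d1 = 400, d2 = 250, d3 = 610, d4 = 330)
strata <- ifelse(names(surv_days) %in% selected_donors, "selected", "outside")
print(split(surv_days, strata))
```

Note that donor d1 has one point inside and one outside the brush and is therefore counted as selected; only donors with no brushed points form the comparison group.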

3 Summary information on the underlying data

The number of RNA-seq samples is 270. The FPKM matrix has dimensions (genes by samples):

## [1] 25873   270

There are 42 different tumor donors.

## [1] 42

However, only 37 donors contributed tumor RNA that was sequenced:

## [1] 37

Features of images from sub-blocks were quantified according to the following terminology for anatomical characteristics. Not all images provided information on all attributes.