Preparing an Alternative Splicing Annotation for PSIchomics

Nuno Saraiva Agostinho

6 October 2016


PSIchomics currently quantifies alternative splicing based on an alternative splicing annotation for Human (hg19). This annotation was created based on the annotations from MISO, SUPPA, VAST-TOOLS and rMATS.

The preparation of a new alternative splicing annotation for usage in PSIchomics starts by parsing the alternative splicing events. Currently, events from SUPPA, rMATS, MISO and VAST-TOOLS are parsable. Support can be extended for alternative splicing events from other programs. For more information, please contact Nuno Agostinho (nunodanielagostinho@gmail.com).

Also, do not forget to load the following packages:

library(psichomics)
## Loading required package: shiny
## Loading required package: shinyBS
## Start the visual interface by running the function psichomics()
library(plyr)

SUPPA annotation

SUPPA generates alternative splicing events based on a transcript annotation. Start by running SUPPA’s generateEvents script with a transcript file (GTF format) for all event types, if desired. See SUPPA’s page for more information.

The resulting output will include a directory containing tab-delimited files with alternative splicing events (one file for each event type). Hand the path of this directory to the function parseSuppaAnnotation(), prepare the annotation using prepareAnnotationFromEvents() and save the output to a RDS file:

# Replace `genome` for the string with the identifier before the first
# underscore in the filenames of that directory (for instance, if one of your 
# filenames of interest is "hg19_A3.ioe", the string would be "hg19")
suppa <- parseSuppaAnnotation(suppaOutput, genome="hg19")
## Retrieving SUPPA annotation...
## Parsing SUPPA annotation...
annot <- prepareAnnotationFromEvents(suppa)
## Sorting coordinates...
## [1] "A3SS SUPPA"
## [1] "A5SS SUPPA"
## [1] "AFE SUPPA"
## [1] "ALE SUPPA"
## [1] "MXE SUPPA"
## Joining events per event type...
## A3SS
## A5SS
## AFE
## ALE
## MXE
## RI
## SE
## Cleaning the annotation...
# suppaFile <- "suppa_hg19_annotation.RDS"
saveRDS(annot, file=suppaFile)

rMATS annotation

Just like SUPPA, rMATS also allows to generate alternative splicing events based on a transcript annotation, although two BAM or FASTQ files are required to generate alternative splicing events. Read rMATS’s page for more information.

The resulting output of rMATS is then handed out to the function parseMatsAnnotation():

mats <- parseMatsAnnotation(
    matsOutput,         # Output directory from rMATS
    genome = "fromGTF", # Identifier of the filenames
    novelEvents=TRUE)   # Parse novel events?
## Retrieving rMATS annotation...
## Parsing rMATS annotation...
annot <- prepareAnnotationFromEvents(mats)
## Sorting coordinates...
## [1] "A3SS MATS"
## [1] "A5SS MATS"
## [1] "AFE MATS"
## [1] "ALE MATS"
## [1] "MXE MATS"
## Joining events per event type...
## A3SS
## A5SS
## AFE
## ALE
## MXE
## RI
## SE
## Cleaning the annotation...
# matsFile <- "mats_hg19_annotation.RDS"
saveRDS(annot, file=matsFile)

MISO annotation

Simply retrieve MISO’s alternative splicing annotation and give the path to the downlaoded folder as input.

miso <- parseMisoAnnotation(misoAnnotation)
## Retrieving MISO annotation...
## Parsing MISO annotation...
annot <- prepareAnnotationFromEvents(miso)
## Sorting coordinates...
## [1] "A3SS MISO"
## [1] "A5SS MISO"
## [1] "AFE MISO"
## [1] "ALE MISO"
## [1] "MXE MISO"
## [1] "TandemUTR MISO"
## Joining events per event type...
## A3SS
## A5SS
## AFE
## ALE
## MXE
## RI
## SE
## TandemUTR
## Cleaning the annotation...
# misoFile <- "miso_AS_annotation_hg19.RDS"
saveRDS(annot, file=misoFile)

VAST-TOOLS annotation

Simply retrieve VAST-TOOLS’s alternative splicing annotation and give the path to the downlaoded folder as input. Note that, however, complex events (i.e. alternative coordinates for the exon ends) are not yet parseable.

vast <- parseVastToolsAnnotation(vastAnnotation)
## Retrieving VAST-TOOLS annotation...
## Parsing VAST-TOOLS annotation...
## ALT3
## ALT5
## COMBI
## IR
## MERGE3m
## MIC
## EXSK
## MULTI
annot <- prepareAnnotationFromEvents(vast)
## Sorting coordinates...
## [1] "A3SS VAST-TOOLS"
## [1] "A5SS VAST-TOOLS"
## Joining events per event type...
## A3SS
## A5SS
## RI
## SE
## Cleaning the annotation...
# vastFile <- "vast_AS_annotation_hg19.RDS"
saveRDS(annot, file=vastFile)

Combining annotation from different sources

To combine the annotation from different sources, provide the parsed annotations of interest simultasneously to the function prepareAnnotationFromEvents:

# Let's combine the annotation from SUPPA, MISO, rMATS and VAST-TOOLS
suppa <- parseSuppaAnnotation(suppaOutput, genome="hg19")
## Retrieving SUPPA annotation...
## Parsing SUPPA annotation...
miso <- parseMisoAnnotation(misoAnnotation)
## Retrieving MISO annotation...
## Parsing MISO annotation...
mats <- parseMatsAnnotation(matsOutput, genome = "fromGTF", novelEvents=TRUE)
## Retrieving rMATS annotation...
## Parsing rMATS annotation...
vast <- parseVastToolsAnnotation(vastAnnotation)
## Retrieving VAST-TOOLS annotation...
## Parsing VAST-TOOLS annotation...
## ALT3
## ALT5
## COMBI
## IR
## MERGE3m
## MIC
## EXSK
## MULTI
annot <- prepareAnnotationFromEvents(suppa, vast, mats, miso)
## Sorting coordinates...
## [1] "A3SS MATS"
## [1] "A3SS MISO"
## [1] "A3SS SUPPA"
## [1] "A3SS VAST-TOOLS"
## [1] "A5SS MATS"
## [1] "A5SS MISO"
## [1] "A5SS SUPPA"
## [1] "A5SS VAST-TOOLS"
## [1] "AFE MATS"
## [1] "AFE MISO"
## [1] "AFE SUPPA"
## [1] "ALE MATS"
## [1] "ALE MISO"
## [1] "ALE SUPPA"
## [1] "MXE MATS"
## [1] "MXE MISO"
## [1] "MXE SUPPA"
## [1] "TandemUTR MISO"
## Joining events per event type...
## A3SS
## A5SS
## AFE
## ALE
## MXE
## RI
## SE
## TandemUTR
## Cleaning the annotation...
# annotFile <- "AS_annotation_hg19.RDS"
saveRDS(annot, file=annotFile)

Quantifying alternative splicing using the created annotation

The created alternative splicing annotation can then be used when quantifying alternative splicing events. To do so, when using the GUI version of PSIchomics, be sure to select the Load annotation from file… option, click the button that appears below and select the recently created RDS file. Otherwise, if you are using the CLI version, perform the following steps:

annot <- readRDS(annotFile) # "file" is the path to the annotation file
junctionQuant <- readFile("ex_junctionQuant.RDS") # example set

psi <- quantifySplicing(annot, junctionQuant, 
                        c("SE", "MXE", "A3SS", "A5SS", "AFE", "ALE"))
## Calculating inclusion levels Skipped exon SE 6
## Calculating inclusion levels Mutually exclusive exon MXE 6
## Calculating inclusion levels Alternative 3' splice site A3SS 6
## Calculating inclusion levels Alternative 5' splice site A5SS 6
## Calculating inclusion levels Alternative first exon AFE 6
## Calculating inclusion levels Alternative last exon ALE 6
psi # may have 0 rows because of the small junction quantification set
##                                              Normal 1  Normal 2 Normal 3
## SE_1_+_23385660_23385840_23385851_23395032_ 0.7777778 0.4444444      0.6
##                                             Cancer 1  Cancer 2  Cancer 3
## SE_1_+_23385660_23385840_23385851_23395032_      0.4 0.4285714 0.5555556