FRASER {FRASER}    R Documentation

FRASER: Find RAre Splicing Events in RNA-seq data

Description

This help page describes the FRASER function, which can be used to run the default FRASER pipeline. This pipeline combines the beta-binomial fit, the computation of Z scores and p values, and the computation of delta-PSI values.

Usage

FRASER(
  fds,
  q,
  implementation = c("PCA", "PCA-BB-Decoder", "AE-weighted", "AE", "BB"),
  iterations = 15,
  BPPARAM = bpparam(),
  correction,
  ...
)

calculateZscore(fds, type = currentType(fds), logit = TRUE)

calculatePvalues(
  fds,
  type = currentType(fds),
  implementation = "PCA",
  BPPARAM = bpparam(),
  distributions = c("betabinomial"),
  capN = 5 * 1e+05
)

calculatePadjValues(fds, type = currentType(fds), method = "BY")

fit(
  fds,
  implementation = c("PCA", "PCA-BB-Decoder", "AE", "AE-weighted", "PCA-BB-full",
    "fullAE", "PCA-regression", "PCA-reg-full", "PCA-BB-Decoder-no-weights", "BB"),
  q,
  type = "psi3",
  rhoRange = c(1e-08, 1 - 1e-08),
  weighted = FALSE,
  noiseAlpha = 1,
  convergence = 1e-05,
  iterations = 15,
  initialize = TRUE,
  control = list(),
  BPPARAM = bpparam(),
  nSubset = 15000,
  verbose = FALSE,
  minDeltaPsi = 0.1
)

Arguments

fds

A FraserDataSet object

q

The encoding dimensions to be used during the fitting procedure. Should be fitted using optimHyperParams if unknown. If a named vector is provided, its entries are used for the different splicing types.

implementation

The method that should be used to correct for confounders.

iterations

The maximal number of iterations. If the autoencoder has not converged after this number of iterations, the fit stops anyway.

BPPARAM

A BiocParallel object to run the computation in parallel

correction

Deprecated. The name changed to implementation.

...

Additional parameters passed on to the internal fit function

type

The type of PSI (psi5, psi3 or psiSite for theta/splicing efficiency)

logit

Indicates whether z scores are computed on the logit scale (default) or on the natural (psi) scale.
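
To illustrate the difference between the two scales, here is a minimal base-R sketch. It is not FRASER's internal code: the helper name zscore_sketch, the clipping constant eps and the toy values are assumptions made for illustration. The sketch standardizes the residual between observed and expected psi for one intron across samples, either after a logit transform or directly.

```r
# Hypothetical helper (not part of FRASER): z scores of observed vs. expected psi
zscore_sketch <- function(psi, mu, logit = TRUE, eps = 1e-6) {
    if (logit) {
        # clip to (eps, 1 - eps) so that qlogis() stays finite (assumed safeguard)
        psi <- pmin(pmax(psi, eps), 1 - eps)
        mu  <- pmin(pmax(mu,  eps), 1 - eps)
        resid <- qlogis(psi) - qlogis(mu)
    } else {
        resid <- psi - mu
    }
    # center and scale the residuals across samples
    (resid - mean(resid)) / sd(resid)
}

psi <- c(0.90, 0.92, 0.88, 0.91, 0.30)   # toy observed values, last sample aberrant
mu  <- c(0.90, 0.91, 0.89, 0.90, 0.90)   # toy expected values
z_logit   <- zscore_sketch(psi, mu, logit = TRUE)
z_natural <- zscore_sketch(psi, mu, logit = FALSE)
```

On both scales the aberrant sample receives the most extreme (most negative) z score. The logit transform stretches values near 0 and 1, so the two scales can rank less obvious outliers differently.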

distributions

The distribution on which the p-value calculation is based. Possible options are beta-binomial, binomial and normal.
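
As a rough illustration of a beta-binomial p value, the following base-R sketch computes a two-sided tail probability for k split reads out of n, given an expected proportion mu and an overdispersion rho. The helper names, the mean/overdispersion parametrization (alpha = mu * (1 - rho) / rho) and the "twice the smaller tail" construction are assumptions chosen for illustration. They are not taken from FRASER's source and may differ from what calculatePvalues does internally.

```r
# Hypothetical helper: beta-binomial pmf for k of n reads,
# parametrized by mean mu and overdispersion rho (assumed parametrization)
dbetabinom_sketch <- function(k, n, mu, rho) {
    a <- mu * (1 - rho) / rho
    b <- (1 - mu) * (1 - rho) / rho
    exp(lchoose(n, k) + lbeta(k + a, n - k + b) - lbeta(a, b))
}

# Hypothetical helper: two-sided p value as twice the smaller tail, capped at 1
pval_sketch <- function(k, n, mu, rho) {
    probs <- dbetabinom_sketch(0:n, n, mu, rho)
    lower <- sum(probs[seq_len(k + 1)])      # P(X <= k)
    upper <- sum(probs[(k + 1):(n + 1)])     # P(X >= k)
    min(1, 2 * min(lower, upper))
}

p_typical  <- pval_sketch(k = 45, n = 50, mu = 0.9, rho = 0.05)  # close to expectation
p_aberrant <- pval_sketch(k = 10, n = 50, mu = 0.9, rho = 0.05)  # far below expectation
```

A count close to n * mu yields a large p value, whereas a count far from the expectation yields a small one.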

capN

Counts are capped at this value to speed up the p-value calculation

method

The p.adjust method that should be used.
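
The method names are those accepted by p.adjust() from the stats package. A small sketch with made-up p values shows how the default "BY" (Benjamini-Yekutieli) correction compares with the more common "BH" correction:

```r
# Toy nominal p values (illustrative only)
pvals <- c(0.0001, 0.004, 0.03, 0.2, 0.7)

# Benjamini-Yekutieli: controls the FDR under arbitrary dependence
padj_by <- p.adjust(pvals, method = "BY")

# Benjamini-Hochberg, shown for comparison
padj_bh <- p.adjust(pvals, method = "BH")

# All supported method names
p.adjust.methods
```

"BY" is never less conservative than "BH", so its adjusted p values are at least as large.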

rhoRange

Defines the range of values that the rho parameter of the beta-binomial distribution is allowed to take. For very small values of rho, the loss can be unstable, so it is not recommended to allow rho < 1e-8.
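
One way to see why tiny values of rho are problematic is sketched below. It assumes the common mean/overdispersion parametrization of the beta-binomial, in which the shape parameter is alpha = mu * (1 - rho) / rho. Both this parametrization and the helper clamp_rho are illustrative assumptions, and the clamping merely stands in for whatever bounded optimization the fit actually uses.

```r
# Default range from the usage section
rho_range <- c(1e-08, 1 - 1e-08)

# Hypothetical helper: force rho into the allowed range
clamp_rho <- function(rho, range = rho_range) {
    pmin(pmax(rho, range[1]), range[2])
}

rho_raw <- c(0, 1e-12, 0.05, 1)
rho     <- clamp_rho(rho_raw)

# Shape parameter for mu = 0.9: diverges as rho approaches 0
alpha_raw <- 0.9 * (1 - rho_raw) / rho_raw   # contains Inf for rho = 0
alpha     <- 0.9 * (1 - rho) / rho           # finite everywhere
```

Without the lower bound, alpha becomes infinite at rho = 0. With the bound, it stays finite, although still very large.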

weighted

If TRUE, the weighted implementation of the autoencoder is used

noiseAlpha

Controls the amount of noise that is added for the denoising autoencoder.

convergence

The fit is considered to have converged if the difference between the previous and the current loss is smaller than this threshold.
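
The interplay between iterations and convergence can be sketched with a generic fitting loop. The code below is a toy illustration, not FRASER's fitting routine: fit_sketch, loss and step are hypothetical names, and the quadratic loss stands in for the actual model loss.

```r
# Hypothetical fitting loop: stop when the loss change falls below
# 'convergence' or after 'iterations' updates, whichever comes first
fit_sketch <- function(loss_fn, step_fn, x0, iterations = 15, convergence = 1e-05) {
    x <- x0
    prev_loss <- loss_fn(x)
    converged <- FALSE
    n_done <- 0
    for (i in seq_len(iterations)) {
        x <- step_fn(x)
        cur_loss <- loss_fn(x)
        n_done <- i
        if (abs(prev_loss - cur_loss) < convergence) {
            converged <- TRUE
            break
        }
        prev_loss <- cur_loss
    }
    list(x = x, iterations = n_done, converged = converged)
}

loss <- function(x) (x - 3)^2           # toy loss with its minimum at x = 3
step <- function(x) x - 0.5 * (x - 3)   # gradient step that halves the error

res_default <- fit_sketch(loss, step, x0 = 0)                  # default limits
res_short   <- fit_sketch(loss, step, x0 = 0, iterations = 5)  # stops early
```

With the default limits the toy fit converges before the iteration cap is reached. With iterations = 5 it stops unconverged, which mirrors the behaviour described for the iterations argument.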

initialize

If FALSE and a fit has been previously run, the values from the previous fit will be used as initial values. If TRUE, (re-)initialization will be done.

control

List of control parameters passed on to optim().

nSubset

The size of the subset to be used in fitting if subsetting is used.

verbose

Controls the level of information printed during the fit.

minDeltaPsi

Minimal delta psi of an intron to be considered a variable intron.

Details

All computed values are returned as a FraserDataSet object. To have more control over each analysis step, one can call each function separately.

Available methods to correct for the confounders are currently: a denoising autoencoder with a BB loss ("AE" and "AE-weighted"), PCA ("PCA"), and a hybrid approach where PCA is used to fit the latent space and then the decoder of the autoencoder is fit using the BB loss ("PCA-BB-Decoder"). Although not recommended, it is also possible to directly fit the BB distribution to the raw counts ("BB").
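
The idea behind a PCA-based correction with encoding dimension q can be sketched in base R as follows. This is a conceptual illustration on simulated data, not the code FRASER runs: the variable names, the simulated confounders and the use of a logit-scale matrix are assumptions, and the package's "PCA" implementation may differ in its details.

```r
set.seed(1)
n_samples <- 20
n_introns <- 50
q <- 2   # encoding dimension

# Toy logit-psi matrix (samples x introns) driven by q hidden confounders plus noise
confounders <- matrix(rnorm(n_samples * q), n_samples, q)
loadings    <- matrix(rnorm(q * n_introns), q, n_introns)
noise       <- matrix(rnorm(n_samples * n_introns, sd = 0.1), n_samples, n_introns)
x <- confounders %*% loadings + noise

# Center each intron, then keep the first q right singular vectors
center <- colMeans(x)
xc <- sweep(x, 2, center)
pc <- svd(xc, nu = 0, nv = q)$v                        # introns x q

latent   <- xc %*% pc                                  # encode: samples x q
expected <- sweep(latent %*% t(pc), 2, center, "+")    # decode: samples x introns
residual <- x - expected                               # variation not explained by the q components
```

In this sketch the residual contains what remains after removing the dominant q-dimensional covariation. Conceptually, this is the part in which aberrant events would be sought.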

Value

FraserDataSet

Author(s)

Christian Mertes mertes@in.tum.de

Examples

   # On Windows, SNOW is the default parallel backend, which can be
   # very slow when running many small tasks. Therefore, this example
   # uses the SerialParam() backend.
   if(.Platform$OS.type != "unix") {
       register(SerialParam())
   }
   
   # preprocessing
   fds <- createTestFraserDataSet()
  
   ### when running FRASER on a real dataset, one should run the following 
   ### two commands first (not run here to make the example run faster):
   # fds <- calculatePSIValues(fds)
   # fds <- filterExpressionAndVariability(fds)

   # Run the full analysis pipeline: fits distribution and computes p values
   fds <- FRASER(fds, q=2, implementation="PCA")

   # afterwards, the fitted fds-object can be saved and results can 
   # be extracted and visualized, see ?saveFraserDataSet, ?results and 
   # ?plotVolcano
   
   ### The functions run inside the FRASER function can also be directly 
   ### run themselves. 
   ### To directly run the fit function:
   # fds <- fit(fds, implementation="PCA", q=2, type="psi5")
   
   ### To directly run the nominal and adjusted p value and z score 
   ### calculation, the following functions can be used:
   # fds <- calculatePvalues(fds, type="psi5")
   # head(pVals(fds, type="psi5"))
   # fds <- calculatePadjValues(fds, type="psi5", method="BY")
   # head(padjVals(fds, type="psi5"))
   # fds <- calculateZscore(fds, type="psi5")
   # head(zScores(fds, type="psi5")) 


[Package FRASER version 1.0.2 Index]