DTEG.analysis {ORFik} | R Documentation |
Creates a total of 3 DESeq models (given x is design argument input
(usually stage or condition) and libraryType is RNA-seq and Ribo-seq):
1. Ribo-seq model: design = ~ x (differences between the x groups in Ribo-seq)
2. RNA-seq model: design = ~ x (differences between the x groups in RNA-seq)
3. TE model: design = ~ x + libraryType + libraryType:x
(differences between the x and libraryType groups and the interaction between them)
Using an equal reimplementation of the deltaTE algorithm (see reference).
You need at least 2 groups and 2 replicates per group. The Ribo-seq counts will
be over CDS and RNA-seq over mRNAs, per transcript.
DTEG.analysis( df.rfp, df.rna, output.dir = paste0(dirname(df.rfp$filepath[1]), "/QC_STATS/"), design = "stage", p.value = 0.05, RFP_counts = countTable(df.rfp, "cds", type = "summarized"), RNA_counts = countTable(df.rna, "mrna", type = "summarized"), batch.effect = FALSE, pairs = combn.pairs(unlist(df.rfp[, design])), plot.title = "", plot.ext = ".pdf", width = 6, height = 6, dot.size = 0.4, relative.name = paste0("DTEG_plot", plot.ext), complex.categories = FALSE )
df.rfp |
a |
df.rna |
a |
output.dir |
output.dir directory to save plots, plot will be named "TE_between". If NULL, will not save. |
design |
a character vector, default "stage". The columns in the ORFik experiment that represent the comparison contrasts. Usually found in "stage", "condition" or "fraction" column. |
p.value |
a numeric, default 0.05 in interval (0,1) or "" to not show. What p-value used for the analysis? Will be shown as a caption. |
RFP_counts |
a SummarizedExperiment, default:
|
RNA_counts |
a SummarizedExperiment, default: countTable(df.rna, "mrna", type = "summarized"), all transcripts. Assign a subset if you don't want to analyze all genes. It is recommended to not subset, to give DESeq2 data for variance analysis. |
batch.effect, |
logical, default FALSE. If you believe you might have batch effects, set to TRUE, will use replicate column to represent batch effects. Batch effect usually means that you have a strong variance between biological replicates. Check PCA plot on count tables to verify if you need to set it to TRUE. |
pairs |
list of character pairs, the experiment contrasts. Default:
|
plot.title |
title for plots, usually name of experiment etc |
plot.ext |
character, default: ".pdf". Alternatives: ".png" or ".jpg". |
width |
numeric, default 6 (in inches) |
height |
numeric, default 6 (in inches) |
dot.size |
numeric, default 0.4, size of point dots in plot. |
relative.name |
character, Default: "DTEG_plot.pdf". Relative name of file to be saved in folder specified in output.dir. Change to .pdf if you want pdf file instead of png. |
complex.categories |
logical, default FALSE. Seperate into more groups, will add Inverse (opposite diagonal of mRNA abundance) and Expression (only significant mRNA-seq) |
The respective groups are defined as this
(given a user defined p value, shown here as 0.05):
1. Translation - te.p.adj < 0.05 & rfp.p.adj < 0.05 & rna.p.adj > 0.05
2. mRNA abundance - te.p.adj > 0.05 & rfp.p.adj < 0.05 & rna.p.adj > 0.05
3. Buffering - te.p.adj < 0.05 & rfp.p.adj > 0.05 & rna.p.adj > 0.05
Buffering also adds some close group which are split up if you set
complex.categories = TRUE (You will then get in addition)
See Figure 1 in the reference article for a clear definition of the groups!
If you do not need isoform variants, subset to longest isoform per gene
either before or in the returned object (See examples).
If you do not have RNA-seq controls, you can still use DESeq on Ribo-seq alone.
The LFC values are shrunken by lfcShrink(type = "normal").
Remember that DESeq by default can not
do global change analysis, it can only find subsets with changes in LFC!
a data.table with 9 columns. (log fold changes, p.ajust values, group, regulation status and gene id)
doi: 10.1002/cpmb.108
Other TE:
DTEG.plot()
,
te.table()
,
te_rna.plot()
## Simple example #df.rfp <- read.experiment("Riboseq") #df.rna <- read.experiment("RNAseq") #dt <- DTEG.analysis(df.rfp, df.rna) ## If you want to use the pshifted libs for analysis: #dt <- DTEG.analysis(df.rfp, df.rna, # RFP_counts = countTable(df.rfp, region = "cds", # type = "summarized", count.folder = "pshifted")) ## Restrict DTEGs by log fold change (LFC): ## subset to abs(LFC) < 1.5 for both rfp and rna #dt[abs(rfp) < 1.5 & abs(rna) < 1.5, Regulation := "No change"] ## Only longest isoform per gene: #tx_longest <- filterTranscripts(df.rfp, 0, 1, 0) #dt <- dt[id %in% tx_longest,] ## Convert to gene id #dt[, id := txNamesToGeneNames(id, df.rfp)] ## To get by gene symbol, use biomaRt conversion ## To flip directionality of contrast pair nr 2: #design <- "condition" #pairs <- combn.pairs(unlist(df.rfp[, design]) #pairs[[2]] <- rev(pars[[2]]) #dt <- DTEG.analysis(df.rfp, df.rna, # RFP_counts = countTable(df.rfp, region = "cds", # type = "summarized", count.folder = "pshifted"), # pairs = pairs)