1 Basics

1.1 Install qsvaR

R is an open-source statistical environment which can be easily modified to enhance its functionality via packages. qsvaR is a R package available via the Bioconductor repository for packages. R can be installed on any operating system from CRAN after which you can install qsvaR by using the following commands in your R session:

## To install Bioconductor packages
if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
}

BiocManager::install("qsvaR")

## Check that you have a valid Bioconductor installation
BiocManager::valid()

## You can install the development version from GitHub with:
BiocManager::install("LieberInstitute/qsvaR")

1.2 Required knowledge

qsvaR is based on many other packages and in particular in those that have implemented the infrastructure needed for dealing with RNA-seq data. That is, packages like SummarizedExperiment. Here it might be useful for you to check the qSVA framework manuscript (Jaffe et al, PNAS, 2017).

If you are asking yourself the question “Where do I start using Bioconductor?” you might be interested in this blog post.

1.3 Asking for help

As package developers, we try to explain clearly how to use our packages and in which order to use the functions. But R and Bioconductor have a steep learning curve so it is critical to learn where to ask for help. The blog post quoted above mentions some but we would like to highlight the Bioconductor support site as the main resource for getting help: remember to use the qsvaR tag and check the older posts. Other alternatives are available such as creating GitHub issues and tweeting. However, please note that if you want to receive help you should adhere to the posting guidelines. It is particularly critical that you provide a small reproducible example and your session information so package developers can track down the source of the error.

1.4 Citing qsvaR

We hope that qsvaR will be useful for your research. Please use the following information to cite the package and the overall approach. Thank you!

## Citation info
citation("qsvaR")
#> To cite package 'qsvaR' in publications use:
#> 
#>   Stolz JM, Tnani H, Collado-Torres L (2023). _qsvaR_.
#>   doi:10.18129/B9.bioc.qsvaR <https://doi.org/10.18129/B9.bioc.qsvaR>,
#>   https://github.com/LieberInstitute/qsvaR/qsvaR - R package version
#>   1.6.0, <http://www.bioconductor.org/packages/qsvaR>.
#> 
#>   Stolz JM, Tnani H, Tao R, Jaffe AE, Collado-Torres L (2023). "qsvaR."
#>   _bioRxiv_. doi:10.1101/TODO <https://doi.org/10.1101/TODO>,
#>   <https://www.biorxiv.org/content/10.1101/TODO>.
#> 
#> To see these entries in BibTeX format, use 'print(<citation>,
#> bibtex=TRUE)', 'toBibtex(.)', or set
#> 'options(citation.bibtex.max=999)'.

2 qsvaR Overview

library("qsvaR")
library("limma")
library("BiocFileCache")

2.1 Significant Transcripts

Differential expressions analysis requires the ability normalize complex datasets. In the case of postmortem brain tissue we are tasked with removing the effects of bench degradation. Our current work expands the scope of qSVA by generating degradation profiles (5 donors across 4 degradation time points: 0, 15, 30, and 60 minutes) from six human brain regions (n = 20 * 6 = 120): dorsolateral prefrontal cortex (DLPFC), hippocampus (HPC), medial prefrontal cortex (mPFC), subgenual anterior cingulate cortex (sACC), caudate, amygdala (AMY). We identified an average of 80,258 transcripts associated (FDR < 5%) with degradation time across the six brain regions. Testing for an interaction between brain region and degradation time identified 45,712 transcripts (FDR < 5%). A comparison of regions showed a unique pattern of expression changes associated with degradation time particularly in the DLPFC, implying that this region may not be representative of the effects of degradation on gene expression in other tissues. Furthermore previous work was done by analyzing expressed regions (Collado-Torres et al, NAR, 2017), and while this is an effective method of analysis, expressed regions are not a common output for many pipelines and are computationally expensive to identify, thus creating a barrier for the use of any qSVA software. In our most recent work expression quantification was performed at the transcript level using Salmon (Patro et al, Nat Methods, 2017). The changes from past work on qSVs to now is illustrated in the below cartoon.