scRNAseq 2.0.2
The scRNAseq package provides convenient access to several publicly available data sets in the form of SingleCellExperiment objects.
The focus of this package is to capture data sets that are not easily read into R with a one-liner from, e.g., read.csv().
Instead, we do the necessary data munging so that users only need to call a single function to obtain a well-formed SingleCellExperiment.
For example:
library(scRNAseq)
fluidigm <- ReprocessedFluidigmData()
fluidigm
## class: SingleCellExperiment
## dim: 26255 130
## metadata(3): sample_info clusters which_qc
## assays(4): tophat_counts cufflinks_fpkm rsem_counts rsem_tpm
## rownames(26255): A1BG A1BG-AS1 ... ZZEF1 ZZZ3
## rowData names(0):
## colnames(130): SRR1275356 SRR1274090 ... SRR1275366 SRR1275261
## colData names(28): NREADS NALIGNED ... Cluster1 Cluster2
## reducedDimNames(0):
## spikeNames(0):
## altExpNames(0):
Readers are referred to the SummarizedExperiment and SingleCellExperiment documentation for further information on how to work with SingleCellExperiment objects.
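As a brief illustration of the accessors covered in that documentation, the sketch below builds a small toy SingleCellExperiment from made-up counts (so that it runs without downloading anything) and pulls out its main components. The object name toy, the gene and cell names, and the batch column are all invented for this example.

```r
library(SingleCellExperiment)

# Toy data: 5 genes x 4 cells of simulated counts (not from any real study).
set.seed(1)
counts <- matrix(rpois(20, lambda = 5), nrow = 5,
                 dimnames = list(paste0("gene", 1:5), paste0("cell", 1:4)))

toy <- SingleCellExperiment(
    assays = list(counts = counts),
    colData = DataFrame(batch = c("a", "a", "b", "b"))
)

assayNames(toy)                 # names of the stored expression matrices
dim(assay(toy, "counts"))       # genes x cells
colData(toy)$batch              # per-cell annotation
toy.a <- toy[, toy$batch == "a"]  # subset to the cells from one batch
ncol(toy.a)
```

The same calls should apply to the fluidigm object above, for instance assay(fluidigm, "rsem_counts") to extract one of its four assays.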
The available data sets can be split into two categories. The first category contains expression matrices that have been generated by the scRNAseq authors from the raw sequencing data for each experiment. This includes:
- ReprocessedFluidigmData() provides 65 cells from Pollen et al. (2014).
- ReprocessedTh2Data() provides 96 T helper cells from Mahata et al. (2014).
- ReprocessedAllenData() provides 379 cells from the mouse visual cortex, which is a subset of the data from Tasic et al. (2016).

The second category contains expression matrices that were provided by the authors of each study. No further reprocessing has been performed other than some cross-checks between the count matrix and the sample metadata.
Study | System | Number of cells | Function |
---|---|---|---|
Aztekin et al. (2019) | Xenopus tail | 13199 | AztekinTailData() |
Bach et al. (2017) | Mouse mammary gland | 25806 | BachMammaryData() |
Baron et al. (2016) | Human pancreas | 8569 | BaronPancreasData('human') |
Baron et al. (2016) | Mouse pancreas | 1886 | BaronPancreasData('mouse') |
Buettner et al. (2015) | Mouse embryonic stem cells | 288 | BuettnerESCData() |
Campbell et al. (2017) | Mouse brain | 21086 | CampbellBrainData() |
Chen et al. (2017) | Mouse brain | 14437 | ChenBrainData() |
Grun et al. (2016) | Mouse haematopoietic stem cells | 1915 | GrunHSCData() |
Grun et al. (2016) | Human pancreas | 1728 | GrunPancreasData() |
Kolodziejczyk et al. (2015) | Mouse embryonic stem cells | 704 | KolodziejczykESCData() |
La Manno et al. (2016) | Human embryonic stem cells | 1715 | LaMannoBrainData('human-es') |
La Manno et al. (2016) | Human embryonic midbrain | 1977 | LaMannoBrainData('human-embryo') |
La Manno et al. (2016) | Human induced pluripotent stem cells | 337 | LaMannoBrainData('human-ips') |
La Manno et al. (2016) | Mouse adult dopaminergic neurons | 243 | LaMannoBrainData('mouse-adult') |
La Manno et al. (2016) | Mouse embryonic midbrain | 1907 | LaMannoBrainData('mouse-embryo') |
Lawlor et al. (2017) | Human pancreas | 638 | LawlorPancreasData() |
Leng et al. (2015) | Human embryonic stem cells | 460 | LengESCData() |
Lun et al. (2017) | 416B cells | 192 | LunSpikeInData('416b') |
Lun et al. (2017) | Mouse trophoblasts | 192 | LunSpikeInData('tropho') |
Macosko et al. (2015) | Mouse retina | 49300 | MacoskoRetinaData() |
Marques et al. (2016) | Mouse brain | 5069 | MarquesBrainData() |
Messmer et al. (2019) | Human embryonic stem cells | 1344 | MessmerESCData() |
Muraro et al. (2016) | Human pancreas | 3072 | MuraroPancreasData() |
Nestorowa et al. (2016) | Mouse haematopoietic stem cells | 1920 | NestorowaHSCData() |
Richard et al. (2018) | Mouse CD8+ T cells | 572 | RichardTCellData() |
Romanov et al. (2017) | Mouse brain | 2881 | RomanovBrainData() |
Segerstolpe et al. (2016) | Human pancreas | 3514 | SegerstolpePancreasData() |
Shekhar et al. (2016) | Mouse retina | 44994 | ShekharRetinaData() |
Usoskin et al. (2015) | Mouse brain | 864 | UsoskinBrainData() |
Tasic et al. (2016) | Mouse brain | 1809 | TasicBrainData() |
Xin et al. (2016) | Human pancreas | 1600 | XinPancreasData() |
Zeisel et al. (2015) | Mouse brain | 3005 | ZeiselBrainData() |
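The functions in the table are called in the same way as ReprocessedFluidigmData(). The sketch below uses two entries from the table, one that takes a subset name and one that takes no arguments. It assumes a working internet connection, since the data are downloaded on first use; the variable names are arbitrary.

```r
library(scRNAseq)

# Studies with several subsets take the subset name as the first argument.
baron.human <- BaronPancreasData('human')

# Studies with a single data set need no arguments.
zeisel <- ZeiselBrainData()

# Each call returns a SingleCellExperiment, with one column per cell.
baron.human
ncol(baron.human)
ncol(zeisel)
```

The column counts should correspond to the "Number of cells" column of the table for the package version in use.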
Please contact us if you have a data set that you would like to see added to this package. The only requirement is that your data set has publicly available expression values (ideally counts) and sample annotation. The more difficult/custom the format, the better, as its inclusion in this package will provide more value for other users in the R/Bioconductor community.
If you have already written code that processes your desired data set in a SingleCellExperiment-like form, we would welcome a pull request here.
The process can be expedited by ensuring that you have the following files:
- inst/scripts/make-X-Y-data.Rmd, an Rmarkdown report that creates all components of a SingleCellExperiment. X should be the last name of the first author of the relevant study while Y should be the name of the biological system.
- inst/scripts/make-X-Y-metadata.R, an R script that creates a metadata CSV file at inst/extdata/metadata-X-Y.csv. Metadata files should follow the format described in the ExperimentHub documentation.
- R/XYData.R, an R source file that defines a function XYData() to download the components from ExperimentHub and create a SingleCellExperiment object.

Potential contributors are recommended to examine some of the existing scripts in the package to pick up the coding conventions. Remember, we’re more likely to accept a contribution if it’s indistinguishable from something we might have written ourselves!
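To give a rough idea of what a make-X-Y-metadata.R script might contain, here is a minimal sketch for a hypothetical study with first author "Smith" and biological system "liver". The column names are the fields that, to our understanding, ExperimentHub expects in a metadata file; every value is a placeholder invented for illustration, and the ExperimentHub documentation remains the authoritative description of the format.

```r
# Hypothetical study: first author "Smith", biological system "liver".
# Every value below is a placeholder, not a real data set or URL.
metadata <- data.frame(
    Title = sprintf("Smith liver %s", c("counts", "colData")),
    Description = sprintf("%s for the Smith liver single-cell RNA-seq data set",
                          c("Count matrix", "Per-cell metadata")),
    RDataPath = file.path("scRNAseq", "smith-liver",
                          c("counts.rds", "coldata.rds")),
    BiocVersion = "3.10",
    Genome = "GRCm38",
    SourceType = "TXT",
    SourceUrl = "https://example.org/smith-liver",
    SourceVersion = "counts.txt.gz",
    Species = "Mus musculus",
    TaxonomyId = "10090",
    Coordinate_1_based = NA,
    DataProvider = "GEO",
    Maintainer = "Jane Doe <jane.doe@example.org>",
    RDataClass = c("dgCMatrix", "DFrame"),
    DispatchClass = "Rds",
    stringsAsFactors = FALSE
)

# In the package this file would live at inst/extdata/metadata-smith-liver.csv;
# a temporary directory is used here so that the sketch runs anywhere.
out <- file.path(tempdir(), "metadata-smith-liver.csv")
write.csv(metadata, file = out, row.names = FALSE)
```

Each row describes one stored component (here, the count matrix and the column metadata), which is why the scalar fields are recycled across two rows.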
As a general rule, 10X Genomics data sets are not suitable for inclusion in this package. They are either easy to load (e.g., with functions from the DropletUtils package), or they are more appropriately obtained with dedicated 10X packages like TENxPBMCData or TENxBrainData. That said, inclusion will be considered if the format has been sufficiently customized by the original authors.
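To illustrate why standard 10X output is considered easy to load, the sketch below writes a small simulated count matrix in the 10X sparse directory layout using write10xCounts() from DropletUtils and reads it back with read10xCounts(). The matrix, barcodes and gene names are made up; with real data, only the read10xCounts() call on the output directory would be needed.

```r
library(DropletUtils)
library(Matrix)

# Simulated counts standing in for a real 10X run: 100 genes x 10 cells.
set.seed(100)
my.counts <- Matrix(matrix(rpois(1000, lambda = 5), nrow = 100, ncol = 10),
                    sparse = TRUE)
gene.ids <- paste0("ENSG0000", seq_len(nrow(my.counts)))
gene.symb <- paste0("GENE", seq_len(nrow(my.counts)))
cell.ids <- paste0("BARCODE-", seq_len(ncol(my.counts)))

# Write the matrix in the 10X sparse format to a temporary directory.
tmpdir <- tempfile()
write10xCounts(tmpdir, my.counts, gene.id = gene.ids,
               gene.symbol = gene.symb, barcodes = cell.ids)

# The one-liner: read a 10X directory into a SingleCellExperiment.
sce10x <- read10xCounts(tmpdir)
sce10x
```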
Aztekin, C., T. W. Hiscock, J. C. Marioni, J. B. Gurdon, B. D. Simons, and J. Jullien. 2019. “Identification of a regeneration-organizing cell in the Xenopus tail.” Science 364 (6441):653–58.
Bach, K., S. Pensa, M. Grzelak, J. Hadfield, D. J. Adams, J. C. Marioni, and W. T. Khaled. 2017. “Differentiation dynamics of mammary epithelial cells revealed by single-cell RNA sequencing.” Nat Commun. 8 (1):2128.
Baron, M., A. Veres, S. L. Wolock, A. L. Faust, R. Gaujoux, A. Vetere, J. H. Ryu, et al. 2016. “A Single-Cell Transcriptomic Map of the Human and Mouse Pancreas Reveals Inter- and Intra-cell Population Structure.” Cell Syst 3 (4):346–60.
Buettner, F., K. N. Natarajan, F. P. Casale, V. Proserpio, A. Scialdone, F. J. Theis, S. A. Teichmann, J. C. Marioni, and O. Stegle. 2015. “Computational analysis of cell-to-cell heterogeneity in single-cell RNA-sequencing data reveals hidden subpopulations of cells.” Nat. Biotechnol. 33 (2):155–60.
Campbell, J. N., E. Z. Macosko, H. Fenselau, T. H. Pers, A. Lyubetskaya, D. Tenen, M. Goldman, et al. 2017. “A molecular census of arcuate hypothalamus and median eminence cell types.” Nat. Neurosci. 20 (3):484–96.
Chen, R., X. Wu, L. Jiang, and Y. Zhang. 2017. “Single-Cell RNA-Seq Reveals Hypothalamic Cell Diversity.” Cell Rep 18 (13):3227–41.
Grun, D., M. J. Muraro, J. C. Boisset, K. Wiebrands, A. Lyubimova, G. Dharmadhikari, M. van den Born, et al. 2016. “De Novo Prediction of Stem Cell Identity using Single-Cell Transcriptome Data.” Cell Stem Cell 19 (2):266–77.
Kolodziejczyk, A. A., J. K. Kim, J. C. Tsang, T. Ilicic, J. Henriksson, K. N. Natarajan, A. C. Tuck, et al. 2015. “Single cell RNA-Sequencing of pluripotent states unlocks modular transcriptional variation.” Cell Stem Cell 17 (4):471–85.
La Manno, G., D. Gyllborg, S. Codeluppi, K. Nishimura, C. Salto, A. Zeisel, L. E. Borm, et al. 2016. “Molecular Diversity of Midbrain Development in Mouse, Human, and Stem Cells.” Cell 167 (2):566–80.
Lawlor, N., J. George, M. Bolisetty, R. Kursawe, L. Sun, V. Sivakamasundari, I. Kycia, P. Robson, and M. L. Stitzel. 2017. “Single-cell transcriptomes identify human islet cell signatures and reveal cell-type-specific expression changes in type 2 diabetes.” Genome Res. 27 (2):208–22.
Leng, N., L. F. Chu, C. Barry, Y. Li, J. Choi, X. Li, P. Jiang, R. M. Stewart, J. A. Thomson, and C. Kendziorski. 2015. “Oscope identifies oscillatory genes in unsynchronized single-cell RNA-seq experiments.” Nat. Methods 12 (10):947–50.
Lun, A. T. L., F. J. Calero-Nieto, L. Haim-Vilmovsky, B. Gottgens, and J. C. Marioni. 2017. “Assessing the reliability of spike-in normalization for analyses of single-cell RNA sequencing data.” Genome Res. 27 (11):1795–1806.
Macosko, E. Z., A. Basu, R. Satija, J. Nemesh, K. Shekhar, M. Goldman, I. Tirosh, et al. 2015. “Highly Parallel Genome-wide Expression Profiling of Individual Cells Using Nanoliter Droplets.” Cell 161 (5):1202–14.
Mahata, B., X. Zhang, A. A. Kolodziejczyk, V. Proserpio, L. Haim-Vilmovsky, A. E. Taylor, D. Hebenstreit, et al. 2014. “Single-cell RNA sequencing reveals T helper cells synthesizing steroids de novo to contribute to immune homeostasis.” Cell Rep 7 (4):1130–42.
Marques, S., A. Zeisel, S. Codeluppi, D. van Bruggen, A. Mendanha Falcao, L. Xiao, H. Li, et al. 2016. “Oligodendrocyte heterogeneity in the mouse juvenile and adult central nervous system.” Science 352 (6291):1326–9.
Messmer, T., F. von Meyenn, A. Savino, F. Santos, H. Mohammed, A. T. L. Lun, J. C. Marioni, and W. Reik. 2019. “Transcriptional heterogeneity in naive and primed human pluripotent stem cells at single-cell resolution.” Cell Rep. 26 (4):815–24.
Muraro, M. J., G. Dharmadhikari, D. Grun, N. Groen, T. Dielen, E. Jansen, L. van Gurp, et al. 2016. “A Single-Cell Transcriptome Atlas of the Human Pancreas.” Cell Syst 3 (4):385–94.
Nestorowa, S., F. K. Hamey, B. Pijuan Sala, E. Diamanti, M. Shepherd, E. Laurenti, N. K. Wilson, D. G. Kent, and B. Gottgens. 2016. “A single-cell resolution map of mouse hematopoietic stem and progenitor cell differentiation.” Blood 128 (8):20–31.
Pollen, A. A., T. J. Nowakowski, J. Shuga, X. Wang, A. A. Leyrat, J. H. Lui, N. Li, et al. 2014. “Low-coverage single-cell mRNA sequencing reveals cellular heterogeneity and activated signaling pathways in developing cerebral cortex.” Nat. Biotechnol. 32 (10):1053–8.
Richard, A. C., A. T. L. Lun, W. W. Y. Lau, B. Gottgens, J. C. Marioni, and G. M. Griffiths. 2018. “T cell cytolytic capacity is independent of initial stimulation strength.” Nat. Immunol. 19 (8):849–58.
Romanov, R. A., A. Zeisel, J. Bakker, F. Girach, A. Hellysaz, R. Tomer, A. Alpar, et al. 2017. “Molecular interrogation of hypothalamic organization reveals distinct dopamine neuronal subtypes.” Nat. Neurosci. 20 (2):176–88.
Segerstolpe, A., A. Palasantza, P. Eliasson, E. M. Andersson, A. C. Andreasson, X. Sun, S. Picelli, et al. 2016. “Single-Cell Transcriptome Profiling of Human Pancreatic Islets in Health and Type 2 Diabetes.” Cell Metab. 24 (4):593–607.
Shekhar, K., S. W. Lapan, I. E. Whitney, N. M. Tran, E. Z. Macosko, M. Kowalczyk, X. Adiconis, et al. 2016. “Comprehensive Classification of Retinal Bipolar Neurons by Single-Cell Transcriptomics.” Cell 166 (5):1308–23.
Tasic, B., V. Menon, T. N. Nguyen, T. K. Kim, T. Jarsky, Z. Yao, B. Levi, et al. 2016. “Adult mouse cortical cell taxonomy revealed by single cell transcriptomics.” Nat. Neurosci. 19 (2):335–46.
Usoskin, D., A. Furlan, S. Islam, H. Abdo, P. Lonnerberg, D. Lou, J. Hjerling-Leffler, et al. 2015. “Unbiased classification of sensory neuron types by large-scale single-cell RNA sequencing.” Nat. Neurosci. 18 (1):145–53.
Xin, Y., J. Kim, H. Okamoto, M. Ni, Y. Wei, C. Adler, A. J. Murphy, G. D. Yancopoulos, C. Lin, and J. Gromada. 2016. “RNA Sequencing of Single Human Islet Cells Reveals Type 2 Diabetes Genes.” Cell Metab. 24 (4):608–15.
Zeisel, A., A. B. Munoz-Manchado, S. Codeluppi, P. Lonnerberg, G. La Manno, A. Jureus, S. Marques, et al. 2015. “Brain structure. Cell types in the mouse cortex and hippocampus revealed by single-cell RNA-seq.” Science 347 (6226):1138–42.