if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("SingleCellMultiModal")
library(SingleCellMultiModal)
library(MultiAssayExperiment)
G&T-seq combines Picoplex-amplified gDNA sequencing (genome) with SMARTSeq2-amplified cDNA sequencing (transcriptome) of the same cell. For more information, see Macaulay et al. (2015).
To list the available datasets without downloading them, request a dry run (the default):
GTseq("mouse_embryo_8_cell", mode = "*", dry.run = TRUE)
## ah_id mode file_size rdataclass rdatadateadded
## 1 EH5431 genomic 0 Mb RaggedExperiment 2021-03-24
## 2 EH5433 transcriptomic 2.3 Mb SingleCellExperiment 2021-03-24
## rdatadateremoved
## 1 <NA>
## 2 <NA>
Or by simply running:
GTseq()
## ah_id mode file_size rdataclass rdatadateadded
## 1 EH5431 genomic 0 Mb RaggedExperiment 2021-03-24
## 2 EH5433 transcriptomic 2.3 Mb SingleCellExperiment 2021-03-24
## rdatadateremoved
## 1 <NA>
## 2 <NA>
To obtain the actual datasets:
gts <- GTseq(dry.run = FALSE)
gts
## A MultiAssayExperiment object of 2 listed
## experiments with user-defined names and respective classes.
## Containing an ExperimentList class object of length 2:
## [1] genomic: RaggedExperiment with 2366 rows and 112 columns
## [2] transcriptomic: SingleCellExperiment with 24029 rows and 112 columns
## Functionality:
## experiments() - obtain the ExperimentList instance
## colData() - the primary/phenotype DataFrame
## sampleMap() - the sample coordination DataFrame
## `$`, `[`, `[[` - extract colData columns, subset, or experiment
## *Format() - convert into a long or wide DataFrame
## assays() - convert ExperimentList to a SimpleList of matrices
## exportClass() - save data to flat files
Check available metadata for each of the 112 mouse embryo cells assayed by G&T-seq:
colData(gts)
## DataFrame with 112 rows and 3 columns
## Characteristics.organism. Characteristics.sex.
## <character> <character>
## cell1 Mus musculus female
## cell2 Mus musculus female
## cell3 Mus musculus male
## cell4 Mus musculus male
## cell5 Mus musculus female
## ... ... ...
## cell108 Mus musculus female
## cell109 Mus musculus male
## cell110 Mus musculus male
## cell111 Mus musculus female
## cell112 Mus musculus female
## Characteristics.cell.type.
## <character>
## cell1 8_cell_stage_single_..
## cell2 8_cell_stage_single_..
## cell3 8_cell_stage_single_..
## cell4 8_cell_stage_single_..
## cell5 8_cell_stage_single_..
## ... ...
## cell108 8_cell_stage_single_..
## cell109 8_cell_stage_single_..
## cell110 8_cell_stage_single_..
## cell111 8_cell_stage_single_..
## cell112 8_cell_stage_single_..
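A natural follow-up is to tabulate one of these metadata columns. The sketch below is only illustrative: it rebuilds the ten rows printed above as a plain data.frame (the object name `shown` is an assumption), so the counts describe those ten cells, not all 112.

```r
# Ten cells copied from the printed colData output (not the full 112)
shown <- data.frame(
  sex = c("female", "female", "male", "male", "female",
          "female", "male", "male", "female", "female"),
  row.names = c(paste0("cell", 1:5), paste0("cell", 108:112)),
  stringsAsFactors = FALSE
)

# Count cells per sex among the rows shown
sex_counts <- table(shown$sex)
print(sex_counts)

# With the downloaded object, the analogous call would be:
# table(gts$Characteristics.sex.)
```

The commented call relies on the `$` accessor listed in the object summary, which extracts colData columns.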
Take a peek at the sampleMap:
sampleMap(gts)
## DataFrame with 224 rows and 3 columns
## assay primary colname
## <factor> <character> <character>
## 1 transcriptomic cell1 ERR861694
## 2 transcriptomic cell2 ERR861750
## 3 transcriptomic cell3 ERR861695
## 4 transcriptomic cell4 ERR861751
## 5 transcriptomic cell5 ERR861696
## ... ... ... ...
## 220 genomic cell108 ERR863164
## 221 genomic cell109 ERR863109
## 222 genomic cell110 ERR863165
## 223 genomic cell111 ERR863110
## 224 genomic cell112 ERR863166
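Because the sampleMap stores one row per cell and assay, it can be used to line up the transcriptomic and genomic run accessions of the same cell. The following is a minimal sketch on a toy sample map; the helper name `pair_assays` and the run identifiers are made up for illustration and do not come from the dataset.

```r
# Hypothetical helper: one row per cell, with the column name used in each assay
pair_assays <- function(smap) {
  smap <- as.data.frame(smap, stringsAsFactors = FALSE)
  smap$assay <- as.character(smap$assay)
  rna <- smap[smap$assay == "transcriptomic", c("primary", "colname")]
  dna <- smap[smap$assay == "genomic", c("primary", "colname")]
  # Join on the cell identifier; suffixes mark the assay of origin
  merge(rna, dna, by = "primary", suffixes = c(".rna", ".dna"))
}

# Toy sample map with invented identifiers (genomic rows deliberately unordered)
toy_map <- data.frame(
  assay   = c("transcriptomic", "transcriptomic", "genomic", "genomic"),
  primary = c("cellA", "cellB", "cellB", "cellA"),
  colname = c("rna_run_1", "rna_run_2", "dna_run_2", "dna_run_1"),
  stringsAsFactors = FALSE
)

paired <- pair_assays(toy_map)
print(paired)

# On the real object, one would presumably call:
# pair_assays(sampleMap(gts))
```

Joining on `primary` avoids assuming that both assays list the cells in the same order.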
To access the integer copy numbers as detected from scDNA-seq:
head(assay(gts, "genomic"))[, 1:4]
## ERR863111 ERR863834 ERR863112 ERR863835
## chr1:23000001-25500000 NA NA NA NA
## chr4:112000001-114500000 NA NA NA NA
## chr4:145000001-148500000 NA NA NA NA
## chr5:14000001-16500000 NA NA NA NA
## chr15:66500001-69000000 NA NA NA NA
## chrX:21500001-36000000 NA NA NA NA
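All six rows shown are NA for these four cells. A likely reason is that each cell in a RaggedExperiment carries its own set of genomic ranges, so the matrix has a row for every range seen in any cell and NA wherever a cell has no value for that exact range. As a rough sketch, the number of non-missing calls per cell can be counted as below; the toy matrix `cn` and its values are invented for illustration.

```r
# Toy copy-number matrix: rows are genomic bins, columns are cells.
# Values are invented; NA marks a bin with no call in that cell.
cn <- matrix(
  c( 2, NA, NA,
    NA,  2, NA,
    NA, NA,  3,
    NA,  1, NA),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("bin", 1:4), c("cellA", "cellB", "cellC"))
)

# Number of bins with a copy-number call in each cell
calls_per_cell <- colSums(!is.na(cn))
print(calls_per_cell)

# Bins that have a call in at least one cell
bins_with_call <- rownames(cn)[rowSums(!is.na(cn)) > 0]

# On the real object, the analogous call would be:
# colSums(!is.na(assay(gts, "genomic")))
```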
To access raw read counts as quantified from scRNA-seq:
head(assay(gts, "transcriptomic"))[, 1:4]
## ERR861694 ERR861750 ERR861695 ERR861751
## ENSMUSG00000000001 4 7 30 32
## ENSMUSG00000000003 0 0 0 0
## ENSMUSG00000000028 11 17 79 94
## ENSMUSG00000000031 0 0 0 0
## ENSMUSG00000000037 0 0 1 0
## ENSMUSG00000000049 0 0 0 0
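Raw counts are typically summarized per cell before any further analysis. The sketch below copies the six genes and four cells printed above into a plain matrix (the name `counts` is an assumption) and computes per-cell totals and the number of detected genes. With only six genes, these totals are illustrative and are not real library sizes.

```r
# Counts copied from the printed head() output above
counts <- matrix(
  c( 4,  7, 30, 32,
     0,  0,  0,  0,
    11, 17, 79, 94,
     0,  0,  0,  0,
     0,  0,  1,  0,
     0,  0,  0,  0),
  nrow = 6, byrow = TRUE,
  dimnames = list(
    c("ENSMUSG00000000001", "ENSMUSG00000000003", "ENSMUSG00000000028",
      "ENSMUSG00000000031", "ENSMUSG00000000037", "ENSMUSG00000000049"),
    c("ERR861694", "ERR861750", "ERR861695", "ERR861751")
  )
)

# Total counts per cell (over these six genes only)
lib_size <- colSums(counts)

# Number of genes with at least one read in each cell
detected <- colSums(counts > 0)

print(rbind(lib_size, detected))

# On the real object, the analogous call would be:
# colSums(assay(gts, "transcriptomic"))
```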
For protocol information, see Macaulay et al. (2016).
sessionInfo()
## R version 4.3.1 (2023-06-16)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 22.04.3 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.17-bioc/R/lib/libRblas.so
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_GB LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: America/New_York
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] SingleCellExperiment_1.22.0 RaggedExperiment_1.24.2
## [3] SingleCellMultiModal_1.12.3 MultiAssayExperiment_1.26.0
## [5] SummarizedExperiment_1.30.2 Biobase_2.60.0
## [7] GenomicRanges_1.52.0 GenomeInfoDb_1.36.2
## [9] IRanges_2.34.1 S4Vectors_0.38.1
## [11] BiocGenerics_0.46.0 MatrixGenerics_1.12.3
## [13] matrixStats_1.0.0 BiocStyle_2.28.0
##
## loaded via a namespace (and not attached):
## [1] DBI_1.1.3 bitops_1.0-7
## [3] formatR_1.14 rlang_1.1.1
## [5] magrittr_2.0.3 compiler_4.3.1
## [7] RSQLite_2.3.1 DelayedMatrixStats_1.22.6
## [9] png_0.1-8 vctrs_0.6.3
## [11] pkgconfig_2.0.3 SpatialExperiment_1.10.0
## [13] crayon_1.5.2 fastmap_1.1.1
## [15] magick_2.7.5 dbplyr_2.3.3
## [17] XVector_0.40.0 ellipsis_0.3.2
## [19] scuttle_1.10.2 utf8_1.2.3
## [21] promises_1.2.1 rmarkdown_2.24
## [23] purrr_1.0.2 bit_4.0.5
## [25] xfun_0.40 zlibbioc_1.46.0
## [27] cachem_1.0.8 beachmat_2.16.0
## [29] jsonlite_1.8.7 blob_1.2.4
## [31] later_1.3.1 rhdf5filters_1.12.1
## [33] DelayedArray_0.26.7 Rhdf5lib_1.22.0
## [35] BiocParallel_1.34.2 interactiveDisplayBase_1.38.0
## [37] parallel_4.3.1 R6_2.5.1
## [39] bslib_0.5.1 limma_3.56.2
## [41] jquerylib_0.1.4 Rcpp_1.0.11
## [43] bookdown_0.35 knitr_1.43
## [45] R.utils_2.12.2 BiocBaseUtils_1.2.0
## [47] httpuv_1.6.11 Matrix_1.6-1
## [49] tidyselect_1.2.0 abind_1.4-5
## [51] yaml_2.3.7 codetools_0.2-19
## [53] curl_5.0.2 lattice_0.21-8
## [55] tibble_3.2.1 withr_2.5.0
## [57] shiny_1.7.5 KEGGREST_1.40.0
## [59] evaluate_0.21 BiocFileCache_2.8.0
## [61] ExperimentHub_2.8.1 Biostrings_2.68.1
## [63] pillar_1.9.0 BiocManager_1.30.22
## [65] filelock_1.0.2 generics_0.1.3
## [67] RCurl_1.98-1.12 BiocVersion_3.17.1
## [69] sparseMatrixStats_1.12.2 xtable_1.8-4
## [71] glue_1.6.2 tools_4.3.1
## [73] AnnotationHub_3.8.0 locfit_1.5-9.8
## [75] rhdf5_2.44.0 grid_4.3.1
## [77] DropletUtils_1.20.0 AnnotationDbi_1.62.2
## [79] edgeR_3.42.4 GenomeInfoDbData_1.2.10
## [81] HDF5Array_1.28.1 cli_3.6.1
## [83] rappdirs_0.3.3 fansi_1.0.4
## [85] S4Arrays_1.0.6 dplyr_1.1.3
## [87] R.methodsS3_1.8.2 sass_0.4.7
## [89] digest_0.6.33 dqrng_0.3.1
## [91] rjson_0.2.21 memoise_2.0.1
## [93] htmltools_0.5.6 R.oo_1.25.0
## [95] lifecycle_1.0.3 httr_1.4.7
## [97] mime_0.12 bit64_4.0.5
Macaulay, Iain C, Wilfried Haerty, Parveen Kumar, Yang I Li, Tim Xiaoming Hu, Mabel J Teng, Mubeen Goolam, et al. 2015. “G&T-seq: Parallel Sequencing of Single-Cell Genomes and Transcriptomes.” Nat. Methods 12 (6): 519–22.
Macaulay, Iain C, Mabel J Teng, Wilfried Haerty, Parveen Kumar, Chris P Ponting, and Thierry Voet. 2016. “Separation and Parallel Sequencing of the Genomes and Transcriptomes of Single Cells Using G&T-seq.” Nat. Protoc. 11 (11): 2081–2103.