Introduction to the NanoStringGeomxSet Class

David Henderson, Patrick Aboyoun, Nicole Ortogero, Zhi Yang, Jason Reeves, Kara Gorman, Rona Vitancol, Thomas Smith, Maddy Griswold

2021-05-19

Introduction

The NanoStringGeomxSet class inherits from Biobase’s ExpressionSet class. It was designed to encapsulate the data from NanoString DCC files generated by the NanoString GeoMx Digital Spatial Profiling (DSP) platform, together with the corresponding methods.

Loading Packages

Loading the NanoStringNCTools and GeomxTools packages gives users access to the NanoStringGeomxSet class and its corresponding methods.

library(NanoStringNCTools)
library(GeomxTools)

Building a NanoStringGeomxSet from .DCC files

datadir <- system.file("extdata", "DSP_NGS_Example_Data",
                       package="GeomxTools")
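# Escape the dot so the pattern matches the literal ".dcc" extension
DCCFiles <- dir(datadir, pattern="\\.dcc$", full.names=TRUE)
# file.path() inserts the separator itself, so no leading slash is needed
PKCFiles <- unzip(zipfile = file.path(datadir, "pkcs.zip"))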
SampleAnnotationFile <- file.path(datadir, "annotations.xlsx")

demoData <-
  suppressWarnings(readNanoStringGeomxSet(dccFiles = DCCFiles,
                                          pkcFiles = PKCFiles,
                                          phenoDataFile = SampleAnnotationFile,
                                          phenoDataSheet = "CW005",
                                          phenoDataDccColName = "Sample_ID",
                                          protocolDataColNames = c("aoi",
                                                                   "cell_line",
                                                                   "roi_rep",
                                                                   "pool_rep",
                                                                   "slide_rep"),
                                          experimentDataColNames = c("panel")))
class( demoData )
#> [1] "NanoStringGeomxSet"
#> attr(,"package")
#> [1] "GeomxTools"
isS4( demoData )
#> [1] TRUE
is( demoData, "ExpressionSet" )
#> [1] TRUE
demoData
#> NanoStringGeomxSet (storageMode: lockedEnvironment)
#> assayData: 1821 features, 89 samples 
#>   element names: exprs 
#> protocolData
#>   sampleNames: DSP-1001250002642-A01.dcc DSP-1001250002642-A02.dcc ...
#>     DSP-1001250002642-H05.dcc (89 total)
#>   varLabels: FileVersion SoftwareVersion ... slide_rep (24 total)
#>   varMetadata: labelDescription
#> phenoData
#>   sampleNames: DSP-1001250002642-A01.dcc DSP-1001250002642-A02.dcc ...
#>     DSP-1001250002642-H05.dcc (89 total)
#>   varLabels: slide name scan name ... area (5 total)
#>   varMetadata: labelDescription
#> featureData
#>   featureNames: A2M ABCB1 ... ZNF207 (1821 total)
#>   fvarLabels: Gene
#>   fvarMetadata: labelDescription
#> experimentData: use 'experimentData(object)'
#> Annotation: Six-gene_test_v1_v1.1.pkc VnV_GeoMx_Hs_CTA_v1.2.pkc 
#> signature: none
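
Because the object is an ExpressionSet, the standard Biobase accessors apply to it. The sketch below illustrates those accessors on a small hand-built ExpressionSet rather than on the GeoMx data; the object name `toySet` and the count values are made up for illustration only.

```r
library(Biobase)

# Toy count matrix: 3 features (rows) by 2 samples (columns); values are made up.
counts <- matrix(c(5L, 0L, 12L, 7L, 3L, 9L), nrow = 3,
                 dimnames = list(c("geneA", "geneB", "geneC"),
                                 c("sample1", "sample2")))

# Build a plain ExpressionSet; the matrix is stored as the "exprs" assay element.
toySet <- ExpressionSet(assayData = counts)

dim(toySet)                          # number of features and samples
featureNames(toySet)                 # row (feature) identifiers
sampleNames(toySet)                  # column (sample) identifiers
exprs(toySet)["geneC", "sample1"]    # a single entry of the expression matrix
```

The same calls, for example `dim( demoData )` or `exprs( demoData )`, should work on a NanoStringGeomxSet, since `is( demoData, "ExpressionSet" )` returns TRUE above.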

Accessing and Assigning NanoStringGeomxSet Data Members

Alongside the accessors inherited from the ExpressionSet class, NanoStringGeomxSet objects have additional accessor and assignment methods that facilitate common ways to view DSP data and associated labels.

head( pData( demoData ), 2 )
#>                                              slide name
#> DSP-1001250002642-A01.dcc           No Template Control
#> DSP-1001250002642-A02.dcc 6panel-old-slide1 (PTL-10891)
#>                                          scan name roi           segment
#> DSP-1001250002642-A01.dcc                     <NA>  NA              <NA>
#> DSP-1001250002642-A02.dcc cw005 (PTL-10891) Slide1   1 Geometric Segment
#>                               area
#> DSP-1001250002642-A01.dcc       NA
#> DSP-1001250002642-A02.dcc 31318.73
protocolData( demoData )
#> An object of class 'AnnotatedDataFrame'
#>   sampleNames: DSP-1001250002642-A01.dcc DSP-1001250002642-A02.dcc ...
#>     DSP-1001250002642-H05.dcc (89 total)
#>   varLabels: FileVersion SoftwareVersion ... slide_rep (24 total)
#>   varMetadata: labelDescription
svarLabels( demoData )
#>  [1] "slide name"      "scan name"       "roi"             "segment"        
#>  [5] "area"            "FileVersion"     "SoftwareVersion" "Date"           
#>  [9] "ID"              "Plate_ID"        "Well"            "SeqSetId"       
#> [13] "tamperedIni"     "trimGaloreOpts"  "flash2Opts"      "umiExtractOpts" 
#> [17] "bowtie2Opts"     "umiDedupOpts"    "Raw"             "Trimmed"        
#> [21] "Stitched"        "Aligned"         "umiQ30"          "rtsQ30"         
#> [25] "aoi"             "cell_line"       "roi_rep"         "pool_rep"       
#> [29] "slide_rep"
head( sData(demoData), 2 )
#>                                              slide name
#> DSP-1001250002642-A01.dcc           No Template Control
#> DSP-1001250002642-A02.dcc 6panel-old-slide1 (PTL-10891)
#>                                          scan name roi           segment
#> DSP-1001250002642-A01.dcc                     <NA>  NA              <NA>
#> DSP-1001250002642-A02.dcc cw005 (PTL-10891) Slide1   1 Geometric Segment
#>                               area FileVersion SoftwareVersion       Date
#> DSP-1001250002642-A01.dcc       NA         0.1           1.0.0 2020-07-14
#> DSP-1001250002642-A02.dcc 31318.73         0.1           1.0.0 2020-07-14
#>                                              ID      Plate_ID Well
#> DSP-1001250002642-A01.dcc DSP-1001250002642-A01 1001250002642  A01
#> DSP-1001250002642-A02.dcc DSP-1001250002642-A02 1001250002642  A02
#>                                      SeqSetId tamperedIni
#> DSP-1001250002642-A01.dcc VH00121:3:AAAG2YWM5          No
#> DSP-1001250002642-A02.dcc VH00121:3:AAAG2YWM5          No
#>                                          trimGaloreOpts
#> DSP-1001250002642-A01.dcc " --hardtrim5 26 --dont_gzip"
#> DSP-1001250002642-A02.dcc " --hardtrim5 26 --dont_gzip"
#>                                                flash2Opts
#> DSP-1001250002642-A01.dcc " -m 26 -e 26 -f 26 -s 1 -r 27"
#> DSP-1001250002642-A02.dcc " -m 26 -e 26 -f 26 -s 1 -r 27"
#>                                           umiExtractOpts
#> DSP-1001250002642-A01.dcc " --bc-pattern=NNNNNNNNNNNNNN"
#> DSP-1001250002642-A02.dcc " --bc-pattern=NNNNNNNNNNNNNN"
#>                                                               bowtie2Opts
#> DSP-1001250002642-A01.dcc " --end-to-end -L 4 --trim5 0 --trim3 0 --norc"
#> DSP-1001250002642-A02.dcc " --end-to-end -L 4 --trim5 0 --trim3 0 --norc"
#>                                             umiDedupOpts    Raw Trimmed
#> DSP-1001250002642-A01.dcc " --edit-distance-threshold=1"    161     161
#> DSP-1001250002642-A02.dcc " --edit-distance-threshold=1" 646250  646250
#>                           Stitched Aligned umiQ30 rtsQ30
#> DSP-1001250002642-A01.dcc       12      12      0      0
#> DSP-1001250002642-A02.dcc   616150  610390      0      0
#>                                                 aoi cell_line roi_rep pool_rep
#> DSP-1001250002642-A01.dcc                      <NA>      <NA>      NA       NA
#> DSP-1001250002642-A02.dcc Geometric Segment-aoi-001    HS578T       1        1
#>                           slide_rep
#> DSP-1001250002642-A01.dcc        NA
#> DSP-1001250002642-A02.dcc         1
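
The output above suggests that sData() presents the phenoData columns followed by the protocolData columns, and that svarLabels() lists the labels in the same order. A minimal sketch of that combination, using only Biobase accessors on a toy ExpressionSet, might look like the following. The object names and all values are made up, and this is not claimed to be how the package implements sData().

```r
library(Biobase)

samples <- c("sample1", "sample2")

# Toy assay data: 2 features by 2 samples; values are made up.
counts <- matrix(1:4, nrow = 2,
                 dimnames = list(c("geneA", "geneB"), samples))

# Toy sample annotations (phenoData) and run metadata (protocolData).
pheno <- AnnotatedDataFrame(
  data.frame(segment = c("segA", "segB"), area = c(100, 200),
             row.names = samples))
protocol <- AnnotatedDataFrame(
  data.frame(Raw = c(10, 20), Aligned = c(8, 18),
             row.names = samples))

toySet <- ExpressionSet(assayData = counts,
                        phenoData = pheno,
                        protocolData = protocol)

# Column-bind the two sample-level tables, phenoData first.
combined <- cbind(pData(toySet), pData(protocolData(toySet)))

# Concatenate the two sets of labels in the same order.
labels <- c(varLabels(toySet), varLabels(protocolData(toySet)))

colnames(combined)
labels
```

Keeping both tables in one data frame is convenient when a plot or a model needs a sample annotation and a sequencing metric side by side.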

Design information can be assigned to the NanoStringGeomxSet object, as can the feature and sample labels used by NanoStringGeomxSet plotting methods.

design( demoData ) <- ~ `segments`
design( demoData )
#> ~segments

dimLabels( demoData )
#> [1] "GeneName" "SampleID"
dimLabels( demoData )[2] <- "Sample ID"
dimLabels( demoData )
#> [1] "GeneName"  "Sample ID"
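
A design formula is most useful when its variables name columns that exist in the sample annotations. The sketch below uses base R's all.vars() to list the variables in a formula and compares them with column names copied from the pData() output shown earlier. The names `design_formula`, `available`, and `missing_vars` are illustrative, and whether the package performs a similar check on assignment is not covered here.

```r
# Formula with the same right-hand side as the design assigned above.
design_formula <- ~ `segments`

# Variable names referenced by the formula.
vars <- all.vars(design_formula)

# Sample annotation columns, copied from the pData() output shown earlier.
available <- c("slide name", "scan name", "roi", "segment", "area")

# Variables used in the formula that have no matching annotation column.
missing_vars <- setdiff(vars, available)
missing_vars
```

In this comparison `segments` does not match the `segment` column, so it is worth confirming which spelling the annotation sheet uses before relying on the design in downstream methods.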

sessionInfo()
#> R version 4.1.0 (2021-05-18)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 20.04.2 LTS
#> 
#> Matrix products: default
#> BLAS:   /home/biocbuild/bbs-3.13-bioc/R/lib/libRblas.so
#> LAPACK: /home/biocbuild/bbs-3.13-bioc/R/lib/libRlapack.so
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_GB              LC_COLLATE=C              
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> attached base packages:
#> [1] stats4    parallel  stats     graphics  grDevices utils     datasets 
#> [8] methods   base     
#> 
#> other attached packages:
#> [1] GeomxTools_1.0.0        NanoStringNCTools_1.0.0 ggplot2_3.3.3          
#> [4] S4Vectors_0.30.0        Biobase_2.52.0          BiocGenerics_0.38.0    
#> 
#> loaded via a namespace (and not attached):
#>  [1] Rcpp_1.0.6             Biostrings_2.60.0      assertthat_0.2.1      
#>  [4] digest_0.6.27          utf8_1.2.1             plyr_1.8.6            
#>  [7] R6_2.5.0               GenomeInfoDb_1.28.0    cellranger_1.1.0      
#> [10] evaluate_0.14          pillar_1.6.1           zlibbioc_1.38.0       
#> [13] rlang_0.4.11           uuid_0.1-4             readxl_1.3.1          
#> [16] jquerylib_0.1.4        rmarkdown_2.8          EnvStats_2.4.0        
#> [19] stringr_1.4.0          htmlwidgets_1.5.3      pheatmap_1.0.12       
#> [22] RCurl_1.98-1.3         ggiraph_0.7.10         munsell_0.5.0         
#> [25] compiler_4.1.0         vipor_0.4.5            xfun_0.23             
#> [28] pkgconfig_2.0.3        systemfonts_1.0.2      ggbeeswarm_0.6.0      
#> [31] htmltools_0.5.1.1      tidyselect_1.1.1       tibble_3.1.2          
#> [34] GenomeInfoDbData_1.2.6 IRanges_2.26.0         fansi_0.4.2           
#> [37] crayon_1.4.1           dplyr_1.0.6            withr_2.4.2           
#> [40] bitops_1.0-7           grid_4.1.0             jsonlite_1.7.2        
#> [43] gtable_0.3.0           lifecycle_1.0.0        DBI_1.1.1             
#> [46] magrittr_2.0.1         scales_1.1.1           stringi_1.6.2         
#> [49] reshape2_1.4.4         XVector_0.32.0         ggthemes_4.2.4        
#> [52] bslib_0.2.5.1          ellipsis_0.3.2         generics_0.1.0        
#> [55] vctrs_0.3.8            rjson_0.2.20           RColorBrewer_1.1-2    
#> [58] tools_4.1.0            glue_1.4.2             beeswarm_0.3.1        
#> [61] purrr_0.3.4            yaml_2.2.1             colorspace_2.0-1      
#> [64] knitr_1.33             sass_0.4.0