1 Loading the data

To retrieve a dataset, we can call its corresponding named function <id>(), where <id> is a valid dataset identifier (see ?VectraPolarisData). Below, both the lung and ovarian cancer datasets are loaded this way.

spe_lung <- HumanLungCancerV3()
spe_ovarian <- HumanOvarianCancerVP()

Alternatively, data can be loaded directly from Bioconductor’s ExperimentHub as follows. First, we initialize a hub instance and store the complete list of records in a variable eh. Using query(), we then identify any records made available by the VectraPolarisData package, as well as their accession IDs (EH7311 for the lung cancer data). Finally, we can load the data into R via eh[[id]], where id is the accession ID of the record we’d like to load. For example:

library(ExperimentHub)              # provides ExperimentHub() and query()
eh <- ExperimentHub()               # initialize hub instance
q <- query(eh, "VectraPolarisData") # retrieve 'VectraPolarisData' records
id <- q$ah_id[1]                    # specify dataset ID to load (here, the first record)
spe <- eh[[id]]                     # load specified dataset
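Selecting a record by position (the first element of the query result) depends on the order in which the hub returns records. As a minimal sketch, the query result can be inspected first and the dataset then loaded by its accession ID. The ID EH7311 is the one quoted above for the lung cancer data; the variable name spe_lung_eh is only illustrative.

```r
library(ExperimentHub)

eh <- ExperimentHub()                 # initialize hub instance
q <- query(eh, "VectraPolarisData")   # records provided by the package

names(q)                              # accession IDs of the matching records
q$title                               # titles of the matching records

# Load by accession ID instead of by position.
# EH7311 is the lung cancer accession ID quoted in the text above.
spe_lung_eh <- eh[["EH7311"]]
```

Loading by accession ID makes explicit which dataset is requested, at the cost of hard-coding the ID.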

2 Data Representation

Both the HumanLungCancerV3() and HumanOvarianCancerVP() datasets are stored as SpatialExperiment objects. This allows users of our data to apply methods built for the SingleCellExperiment, SummarizedExperiment, and SpatialExperiment classes in Bioconductor. See this ebook for more details on SpatialExperiment. To obtain cell-level tabular data that can be stored in this format, raw multiplex .tiff images were preprocessed, segmented, and cell phenotyped using inForm software from Akoya Biosciences.
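As a rough illustration of what this compatibility means in practice, the sketch below applies a few standard accessors from the SummarizedExperiment and SpatialExperiment packages to the lung cancer object. It only inspects the object; which assays and cell-level columns are present depends on the dataset, so no output is shown here.

```r
library(VectraPolarisData)
library(SpatialExperiment)

spe_lung <- HumanLungCancerV3()

spe_lung                        # printed summary of the object
dim(spe_lung)                   # number of features (rows) and cells (columns)
assayNames(spe_lung)            # names of the stored assays
head(colData(spe_lung))         # cell-level metadata
head(spatialCoords(spe_lung))   # spatial coordinates of each cell
```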

The SpatialExperiment class was originally built for spatial transcriptomics data and follows the structure depicted in the schematic below (Righelli et al. 2021):