Contents

Introduction

The XeniumIO package provides functions to import 10X Genomics Xenium Analyzer data into R. The package is designed to work with the output of the Xenium Analyzer, which is a software tool that processes Visium spatial gene expression data. The package provides functions to import the output of the Xenium Analyzer into R, and to create a TENxXenium object that can be used with other Bioconductor packages.

Supported Formats

TENxIO

The 10X suite of packages support multiple file formats. The following table lists the supported file formats and the corresponding classes that are imported into R.

Extension Class Imported as
.h5 TENxH5 SingleCellExperiment w/ TENxMatrix
.mtx / .mtx.gz TENxMTX SummarizedExperiment w/ dgCMatrix
.tar.gz TENxFileList SingleCellExperiment w/ dgCMatrix
peak_annotation.tsv TENxPeaks GRanges
fragments.tsv.gz TENxFragments RaggedExperiment
.tsv / .tsv.gz TENxTSV tibble

VisiumIO

Extension Class Imported as
spatial.tar.gz TENxSpatialList DataFrame list *
.parquet TENxSpatialParquet tibble *

XeniumIO

Extension Class Imported as
.zarr.zip TENxZarr (TBD)

GitHub Installation

BiocManager::install("Bioconductor/XeniumIO")

Loading package

library(XeniumIO)

XeniumIO

The TENxXenium class has a metadata slot for the experiment.xenium file. The resources slot is a TENxFileList or TENxH5 object containing the cell feature matrix. The coordNames slot is a vector specifying the names of the columns in the spatial data containing the spatial coordinates. The sampleId slot is a scalar specifying the sample identifier.

TENxXenium(
    resources = "path/to/matrix/folder/or/file",
    xeniumOut = "path/to/xeniumOut/folder",
    sample_id = "sample01",
    format = c("mtx", "h5"),
    boundaries_format = c("parquet", "csv.gz"),
    spatialCoordsNames = c("x_centroid", "y_centroid"),
    ...
)

The format argument specifies the format of the resources object, either “mtx” or “h5”. The boundaries_format allows the user to choose whether to read in the data using the parquet or csv.gz format.

Example Folder Structure

Note that the xeniumOut unzipped folder must contain the following files:

    *outs
    ├── cell_feature_matrix.h5
    ├── cell_feature_matrix.tar.gz
    |   ├── barcodes.tsv*
    |   ├── features.tsv*
    |   └── matrix.mtx*
    ├── cell_feature_matrix.zarr.zip
    ├── experiment.xenium
    ├── cells.csv.gz
    ├── cells.parquet
    ├── cells.zarr.zip
    [...]

Note that currently the zarr format is not supported as the infrastructure is currently under development.

Xenium class

The resources slot should either be the TENxFileList from the mtx format or a TENxH5 instance from an h5 file. The boundaries can either be a TENxSpatialParquet instance or a TENxSpatialCSV. These classes are automatically instantiated by the constructor function.

showClass("TENxXenium")
## Class "TENxXenium" [package "XeniumIO"]
## 
## Slots:
##                                            
## Name:                             resources
## Class:               TENxFileList_OR_TENxH5
##                                            
## Name:                            boundaries
## Class: TENxSpatialParquet_OR_TENxSpatialCSV
##                                            
## Name:                            coordNames
## Class:                            character
##                                            
## Name:                              sampleId
## Class:                            character
##                                            
## Name:                               colData
## Class:                   TENxSpatialParquet
##                                            
## Name:                              metadata
## Class:                           XeniumFile

import method

The import method for a TENxXenium instance returns a SpatialExperiment class object. Dispatch is only done on the con argument. See ?BiocIO::import for details on the generic. The import function call is meant to be a simple call without much input. For more details in the package, see ?TENxXenium.

getMethod("import", c(con = "TENxXenium"))
## Method Definition:
## 
## function (con, format, text, ...) 
## {
##     sce <- import(con@resources, ...)
##     metadata <- import(con@metadata)
##     coldata <- import(con@colData)
##     SpatialExperiment::SpatialExperiment(assays = list(counts = assay(sce)), 
##         rowData = rowData(sce), mainExpName = mainExpName(sce), 
##         altExps = altExps(sce), sample_id = con@sampleId, colData = as(coldata, 
##             "DataFrame"), spatialCoordsNames = con@coordNames, 
##         metadata = list(experiment.xenium = metadata, polygons = import(con@boundaries)))
## }
## <bytecode: 0x5eed6ca90658>
## <environment: namespace:XeniumIO>
## 
## Signatures:
##         con          format text 
## target  "TENxXenium" "ANY"  "ANY"
## defined "TENxXenium" "ANY"  "ANY"

Importing an Example Xenium Dataset

The following code snippet demonstrates how to import a Xenium Analyzer output into R. The TENxXenium object is created by specifying the path to the xeniumOut folder. The TENxXenium object is then imported into R using the import method for the TENxXenium class.

First, we cache the ~12 MB file to avoid downloading it multiple times (via BiocFileCache).

zipfile <- paste0(
    "https://mghp.osn.xsede.org/bir190004-bucket01/BiocXenDemo/",
    "Xenium_Prime_MultiCellSeg_Mouse_Ileum_tiny_outs.zip"
)
destfile <- XeniumIO:::.cache_url_file(zipfile)
## adding rname 'https://mghp.osn.xsede.org/bir190004-bucket01/BiocXenDemo/Xenium_Prime_MultiCellSeg_Mouse_Ileum_tiny_outs.zip'

We then create an output folder for the contents of the zipped file. We use the same name as the zip file but without the extension (with tools::file_path_sans_ext).

outfold <- file.path(
    tempdir(), tools::file_path_sans_ext(basename(zipfile))
)
if (!dir.exists(outfold))
    dir.create(outfold, recursive = TRUE)

We now unzip the file and use the outfold as the exdir argument to unzip. The outfold variable and folder will be used as the xeniumOut argument in the TENxXenium constructor. Note that we use the ref = "Gene Expression" argument in the import method to pass down to the internal splitAltExps function call. This will set the mainExpName in the SpatialExperiment object.

unzip(
    zipfile = destfile, exdir = outfold, overwrite = FALSE
)
TENxXenium(xeniumOut = outfold) |>
    import(ref = "Gene Expression")
## class: SpatialExperiment 
## dim: 5006 36 
## metadata(2): experiment.xenium polygons
## assays(1): counts
## rownames(5006): ENSMUSG00000052595 ENSMUSG00000030111 ...
##   ENSMUSG00000055670 ENSMUSG00000027596
## rowData names(3): ID Symbol Type
## colnames(36): aaamobki-1 aaclkaod-1 ... olbjkpjc-1 omjmdimk-1
## colData names(13): cell_id transcript_counts ... segmentation_method
##   sample_id
## reducedDimNames(0):
## mainExpName: Gene Expression
## altExpNames(5): Deprecated Codeword Genomic Control Negative Control
##   Codeword Negative Control Probe Unassigned Codeword
## spatialCoords names(2) : x_centroid y_centroid
## imgData names(0):

Note that you may also use the swapAltExp function to set a mainExpName in the SpatialExperiment but this is not recommended. The operation returns a SingleCellExperiment which has to be coerced back into a SpatialExperiment. The coercion also loses some metadata information particularly the spatialCoords.

TENxXenium(xeniumOut = outfold) |>
    import() |>
    swapAltExp(name = "Gene Expression") |>
    as("SpatialExperiment")
## class: SpatialExperiment 
## dim: 5006 36 
## metadata(1): TENxFileList
## assays(1): counts
## rownames(5006): ENSMUSG00000052595 ENSMUSG00000030111 ...
##   ENSMUSG00000055670 ENSMUSG00000027596
## rowData names(3): ID Symbol Type
## colnames(36): aaamobki-1 aaclkaod-1 ... olbjkpjc-1 omjmdimk-1
## colData names(13): cell_id transcript_counts ... segmentation_method
##   sample_id
## reducedDimNames(0):
## mainExpName: Gene Expression
## altExpNames(5): Genomic Control Negative Control Codeword Negative
##   Control Probe Unassigned Codeword Deprecated Codeword
## spatialCoords names(0) :
## imgData names(0):

The dataset was obtained from the 10X Genomics website under the X0A v3.0 section and is a subset of the Xenium Prime 5K Mouse Pan Tissue & Pathways Panel. The link to the data can be seen as the url input above and shown below for completeness.

https://cf.10xgenomics.com/samples/xenium/3.0.0/Xenium_Prime_MultiCellSeg_Mouse_Ileum_tiny/Xenium_Prime_MultiCellSeg_Mouse_Ileum_tiny_outs.zip

Session Info

sessionInfo()
## R Under development (unstable) (2025-01-20 r87609)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.2 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.21-bioc/R/lib/libRblas.so 
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0  LAPACK version 3.12.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_GB              LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: America/New_York
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] XeniumIO_0.99.8             TENxIO_1.9.3               
##  [3] SingleCellExperiment_1.29.1 SummarizedExperiment_1.37.0
##  [5] Biobase_2.67.0              GenomicRanges_1.59.1       
##  [7] GenomeInfoDb_1.43.4         IRanges_2.41.3             
##  [9] S4Vectors_0.45.4            BiocGenerics_0.53.6        
## [11] generics_0.1.3              MatrixGenerics_1.19.1      
## [13] matrixStats_1.5.0           BiocStyle_2.35.0           
## 
## loaded via a namespace (and not attached):
##  [1] rjson_0.2.23             xfun_0.51                bslib_0.9.0             
##  [4] lattice_0.22-6           tzdb_0.4.0               vctrs_0.6.5             
##  [7] tools_4.5.0              parallel_4.5.0           curl_6.2.1              
## [10] tibble_3.2.1             RSQLite_2.3.9            blob_1.2.4              
## [13] pkgconfig_2.0.3          BiocBaseUtils_1.9.0      Matrix_1.7-2            
## [16] dbplyr_2.5.0             assertthat_0.2.1         lifecycle_1.0.4         
## [19] GenomeInfoDbData_1.2.13  compiler_4.5.0           codetools_0.2-20        
## [22] htmltools_0.5.8.1        sass_0.4.9               yaml_2.3.10             
## [25] pillar_1.10.1            crayon_1.5.3             jquerylib_0.1.4         
## [28] DelayedArray_0.33.6      cachem_1.1.0             magick_2.8.5            
## [31] abind_1.4-8              tidyselect_1.2.1         digest_0.6.37           
## [34] purrr_1.0.4              dplyr_1.1.4              bookdown_0.42           
## [37] arrow_18.1.0.1           fastmap_1.2.0            grid_4.5.0              
## [40] archive_1.1.11           cli_3.6.4                SparseArray_1.7.6       
## [43] magrittr_2.0.3           S4Arrays_1.7.3           withr_3.0.2             
## [46] readr_2.1.5              filelock_1.0.3           UCSC.utils_1.3.1        
## [49] bit64_4.6.0-1            rmarkdown_2.29           XVector_0.47.2          
## [52] httr_1.4.7               bit_4.5.0.1              hms_1.1.3               
## [55] SpatialExperiment_1.17.0 memoise_2.0.1            evaluate_1.0.3          
## [58] knitr_1.49               BiocIO_1.17.1            BiocFileCache_2.15.1    
## [61] rlang_1.1.5              Rcpp_1.0.14              glue_1.8.0              
## [64] DBI_1.2.3                BiocManager_1.30.25      VisiumIO_1.3.5          
## [67] vroom_1.6.5              jsonlite_1.9.0           R6_2.6.1