Contents

1 Installation

if (!require("BiocManager"))
    install.packages("BiocManager")
BiocManager::install("lisaClust")
# load required packages
library(lisaClust)
library(spicyR)
library(ggplot2)
library(SingleCellExperiment)

2 Overview

Clustering local indicators of spatial association (LISA) functions is a methodology for identifying consistent spatial organisation of multiple cell-types in an unsupervised way. This can be used to enable the characterization of interactions between multiple cell-types simultaneously and can complement traditional pairwise analysis. In our implementation our LISA curves are a localised summary of an L-function from a Poisson point process model. Our framework lisaClust can be used to provide a high-level summary of cell-type colocalization in high-parameter spatial cytometry data, facilitating the identification of distinct tissue compartments or identification of complex cellular microenvironments.

3 Quick start

3.1 Generate toy data

TO illustrate our lisaClust framework, here we consider a very simple toy example where two cell-types are completely separated spatially. We simulate data for two different images.

set.seed(51773)
x <- round(c(runif(200),runif(200)+1,runif(200)+2,runif(200)+3,
           runif(200)+3,runif(200)+2,runif(200)+1,runif(200)),4)*100
y <- round(c(runif(200),runif(200)+1,runif(200)+2,runif(200)+3,
             runif(200),runif(200)+1,runif(200)+2,runif(200)+3),4)*100
cellType <- factor(paste('c',rep(rep(c(1:2),rep(200,2)),4),sep = ''))
imageID <- rep(c('s1', 's2'),c(800,800))

cells <- data.frame(x, y, cellType, imageID)

ggplot(cells, aes(x,y, colour = cellType)) + geom_point() + facet_wrap(~imageID)

3.2 Create SegmentedCellExperiment object

First we store our data in a SegmentedCells object.


cellExp <- SegmentedCells(cells, cellTypeString = 'cellType')

3.3 Running lisaCLust

We can then use a convience function lisaClust to simultaneously calculate local indicators of spatial association (LISA) functions using the lisa function and perform k-means clustering.

cellExp <- lisaClust(cellExp, k = 2)

3.4 Plot identified regions

The hatchingPlot function can be used to construct a ggplot object where the regions are marked by different hatching patterns. This allows us to plot both regions and cell-types on the same visualization.

hatchingPlot(cellExp, useImages = c('s1','s2'))

3.5 Using other clustering methods.

While the lisaClust function is convenient, we have not implemented an exhaustive suite of clustering methods as it is very easy to do this yourself. There are just two simple steps.

3.5.1 Generate LISA curves

We can calculate local indicators of spatial association (LISA) functions using the lisa function. Here the LISA curves are a localised summary of an L-function from a Poisson point process model. The radii that will be calculated over can be set with Rs.


lisaCurves <- lisa(cellExp, Rs = c(20, 50, 100))

3.5.2 Perform some clustering

The LISA curves can then be used to cluster the cells. Here we use k-means clustering, other clustering methods like SOM could be used. We can store these cell clusters or cell “regions” in our SegmentedCells object using the cellAnnotation() <- function.


kM <- kmeans(lisaCurves,2)
cellAnnotation(cellExp, "region") <- paste('region',kM$cluster,sep = '_')

3.6 Alternative hatching plot

We could also create this plot using geom_hatching and scale_region_manual.


df <- as.data.frame(cellSummary(cellExp))

p <- ggplot(df,aes(x = x,y = y, colour = cellType, region = region)) + 
  geom_point() + 
  facet_wrap(~imageID) +
  geom_hatching(window = "concave", 
                line.spacing = 11, 
                nbp = 50, 
                line.width = 2, 
                hatching.colour = "gray20",
                window.length = NULL) +
  theme_minimal() + 
  scale_region_manual(values = 6:7, labels = c('ab','cd'))

p

## Faster ploting

The hatchingPlot can be quite slow for large images and high nbp or linewidth. It is often useful to simply plot the regions without the cell type information.


df <- as.data.frame(cellSummary(cellExp))
df <- df[df$imageID == "s1", ]

p <- ggplot(df,aes(x = x,y = y, colour  = region)) + 
  geom_point() +
  theme_classic()
p

4 Using a SingleCellExperiment

The lisaClust function also works with a SingleCellExperiment. First lets create a SingleCellExperiment object.


sce <- SingleCellExperiment(colData = cellSummary(cellExp))

lisaClust just needs columns in colData corresponding to the x and y coordinates of the cells, a column annotating the cell types of the cells and a column indicating which image each cell came from.

sce <- lisaClust(sce, 
                 k = 2, 
                 spatialCoords = c("x", "y"), 
                 cellType = "cellType",
                 imageID = "imageID")

We can then plot the regions using the following.


hatchingPlot(sce)

5 Damond et al. islet data.

Here we apply our lisaClust framework to three images of pancreatic islets from A Map of Human Type 1 Diabetes Progression by Imaging Mass Cytometry by Damond et al. (2019).

5.1 Read in data

We will start by reading in the data and storing it as a SegmentedCells object. Here the data is in a format consistent with that outputted by CellProfiler.

isletFile <- system.file("extdata","isletCells.txt.gz", package = "spicyR")
cells <- read.table(isletFile, header = TRUE)
cellExp <- SegmentedCells(cells, cellProfiler = TRUE)

5.2 Cluster cell-types

This data does not include annotation of the cell-types of each cell. Here we extract the marker intensities from the SegmentedCells object using cellMarks. We then perform k-means clustering with eight clusters and store these cell-type clusters in our SegmentedCells object using cellType() <-.

markers <- cellMarks(cellExp)
kM <- kmeans(markers,10)
cellType(cellExp) <- paste('cluster', kM$cluster, sep = '')

5.3 Generate LISA curves

As before, we can calculate perform k-means clustering on the local indicators of spatial association (LISA) functions using the lisaClust function.


cellExp <- lisaClust(cellExp, k = 2, Rs = c(10,20,50))

These regions are stored in cellExp and can be extracted.


cellAnnotation(cellExp, "region") |>
  head()
## [1] "region_1" "region_1" "region_1" "region_1" "region_1" "region_1"

5.4 Examine cell type enrichment

We should check to see which cell types appear more frequently in each region than expected by chance.

regionMap(cellExp, type = "bubble")

5.5 Plot identified regions

Finally, we can use hatchingPlot to construct a ggplot object where the regions are marked by different hatching patterns. This allows us to visualize the two regions and ten cell-types simultaneously.

hatchingPlot(cellExp)

6 sessionInfo()

sessionInfo()
## R version 4.3.1 (2023-06-16)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 22.04.3 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.17-bioc/R/lib/libRblas.so 
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_GB              LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: America/New_York
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] SingleCellExperiment_1.22.0 SummarizedExperiment_1.30.2
##  [3] Biobase_2.60.0              GenomicRanges_1.52.0       
##  [5] GenomeInfoDb_1.36.4         IRanges_2.34.1             
##  [7] S4Vectors_0.38.2            BiocGenerics_0.46.0        
##  [9] MatrixGenerics_1.12.3       matrixStats_1.0.0          
## [11] ggplot2_3.4.3               spicyR_1.12.2              
## [13] lisaClust_1.8.2             BiocStyle_2.28.1           
## 
## loaded via a namespace (and not attached):
##   [1] RColorBrewer_1.1-3          jsonlite_1.8.7             
##   [3] MultiAssayExperiment_1.26.0 magrittr_2.0.3             
##   [5] spatstat.utils_3.0-3        magick_2.8.0               
##   [7] farver_2.1.1                nloptr_2.0.3               
##   [9] rmarkdown_2.25              zlibbioc_1.46.0            
##  [11] vctrs_0.6.3                 minqa_1.2.6                
##  [13] spatstat.explore_3.2-3      DelayedMatrixStats_1.22.6  
##  [15] RCurl_1.98-1.12             rstatix_0.7.2              
##  [17] htmltools_0.5.6             S4Arrays_1.0.6             
##  [19] curl_5.1.0                  broom_1.0.5                
##  [21] Rhdf5lib_1.22.1             rhdf5_2.44.0               
##  [23] sass_0.4.7                  bslib_0.5.1                
##  [25] plyr_1.8.9                  cachem_1.0.8               
##  [27] lifecycle_1.0.3             pkgconfig_2.0.3            
##  [29] Matrix_1.6-1.1              R6_2.5.1                   
##  [31] fastmap_1.1.1               GenomeInfoDbData_1.2.10    
##  [33] digest_0.6.33               numDeriv_2016.8-1.1        
##  [35] colorspace_2.1-0            tensor_1.5                 
##  [37] dqrng_0.3.1                 ggpubr_0.6.0               
##  [39] beachmat_2.16.0             labeling_0.4.3             
##  [41] fansi_1.0.4                 spatstat.sparse_3.0-2      
##  [43] mgcv_1.9-0                  polyclip_1.10-6            
##  [45] abind_1.4-5                 compiler_4.3.1             
##  [47] withr_2.5.1                 backports_1.4.1            
##  [49] BiocParallel_1.34.2         carData_3.0-5              
##  [51] HDF5Array_1.28.1            ggforce_0.4.1              
##  [53] R.utils_2.12.2              ggsignif_0.6.4             
##  [55] MASS_7.3-60                 concaveman_1.1.0           
##  [57] DelayedArray_0.26.7         rjson_0.2.21               
##  [59] tools_4.3.1                 goftest_1.2-3              
##  [61] R.oo_1.25.0                 glue_1.6.2                 
##  [63] nlme_3.1-163                rhdf5filters_1.12.1        
##  [65] grid_4.3.1                  ClassifyR_3.4.11           
##  [67] reshape2_1.4.4              generics_0.1.3             
##  [69] gtable_0.3.4                spatstat.data_3.0-1        
##  [71] R.methodsS3_1.8.2           class_7.3-22               
##  [73] tidyr_1.3.0                 data.table_1.14.8          
##  [75] car_3.1-2                   utf8_1.2.3                 
##  [77] XVector_0.40.0              spatstat.geom_3.2-5        
##  [79] pillar_1.9.0                stringr_1.5.0              
##  [81] limma_3.56.2                splines_4.3.1              
##  [83] dplyr_1.1.3                 tweenr_2.0.2               
##  [85] lattice_0.21-9              survival_3.5-7             
##  [87] deldir_1.0-9                tidyselect_1.2.0           
##  [89] locfit_1.5-9.8              scuttle_1.10.2             
##  [91] knitr_1.44                  V8_4.3.3                   
##  [93] bookdown_0.35               edgeR_3.42.4               
##  [95] xfun_0.40                   DropletUtils_1.20.0        
##  [97] pheatmap_1.0.12             fftwtools_0.9-11           
##  [99] scam_1.2-14                 stringi_1.7.12             
## [101] yaml_2.3.7                  boot_1.3-28.1              
## [103] evaluate_0.22               codetools_0.2-19           
## [105] tibble_3.2.1                BiocManager_1.30.22        
## [107] cli_3.6.1                   munsell_0.5.0              
## [109] jquerylib_0.1.4             Rcpp_1.0.11                
## [111] spatstat.random_3.1-6       parallel_4.3.1             
## [113] sparseMatrixStats_1.12.2    bitops_1.0-7               
## [115] lme4_1.1-34                 SpatialExperiment_1.10.0   
## [117] lmerTest_3.1-3              scales_1.2.1               
## [119] purrr_1.0.2                 crayon_1.5.2               
## [121] rlang_1.1.1