coassignProb {scran}    R Documentation

Compute coassignment probabilities

Description

Compute coassignment probabilities for each label in a reference grouping when compared to an alternative grouping of samples.

Usage

coassignProb(ref, alt, summarize = FALSE)

Arguments

ref

A character vector or factor containing one set of groupings, considered to be the reference.

alt

A character vector or factor containing another set of groupings.

summarize

Logical scalar indicating whether the output matrix should be converted into a per-label summary.

Details

The coassignment probability for each pair of labels in ref is the probability that a randomly chosen cell from each of the two reference labels will have the same label in alt. A high coassignment probability indicates that cells from a particular pair of labels in ref are frequently assigned to the same label in alt, which has implications for cluster stability.
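The definition above can be sketched in base R from the contingency table of ref against alt. This is an illustration only, not the scran implementation: coassign_sketch is a hypothetical helper, and the diagonal is computed under the assumption that the two cells are distinct draws from the same reference label (a label with a single cell would then give NaN).

```r
# Hypothetical helper (not part of scran): pairwise coassignment probabilities
# computed from the ref-by-alt contingency table.
coassign_sketch <- function(ref, alt) {
    tab <- unclass(table(ref, alt))   # cell counts per (ref, alt) combination
    n <- rowSums(tab)                 # number of cells in each reference label

    # Pair of different labels: sum over alt labels of the product of the
    # proportions of each reference label falling into that alt label.
    prop <- tab / n
    out <- prop %*% t(prop)

    # Same label (assumption): two distinct cells drawn without replacement,
    # i.e. the fraction of within-label cell pairs sharing an alt label.
    n_pairs <- function(x) x * (x - 1) / 2
    diag(out) <- rowSums(n_pairs(tab)) / n_pairs(n)

    # Keep only the upper triangle, mirroring the layout described under Value.
    out[lower.tri(out)] <- NA
    out
}

ref <- c("A", "A", "A", "A", "B", "B", "B", "B")
alt <- c("x", "x", "x", "y", "y", "y", "y", "x")
res <- coassign_sketch(ref, alt)
res
```

In this toy input, three of the four cells in A share an alt label, as do three of the four cells in B, so each diagonal entry is 3/6 = 0.5; the A-B entry is (3/4)(1/4) + (1/4)(3/4) = 0.375.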

When summarize=TRUE, we summarize the matrix of coassignment probabilities into a set of per-label values. The “self” coassignment probability is simply the diagonal entry of the matrix, i.e., the probability that two cells from the same label in ref also have the same label in alt. The “other” coassignment probability is the maximum probability across all pairs involving that label and a different label in ref.

In general, ref is well-recapitulated by alt if the diagonal entries of the matrix are much higher than the sum of the off-diagonal entries. This manifests as higher values for the self probabilities compared to the other probabilities.
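The per-label summary can be sketched as follows. The matrix values are made up for illustration, a base data.frame stands in for the DataFrame that the function returns, and the sketch assumes that the “other” value is taken over off-diagonal pairs only.

```r
# Hand-made upper-triangular matrix standing in for the summarize=FALSE output
# (illustrative values, not real results).
labels <- c("A", "B", "C")
mat <- matrix(NA_real_, 3, 3, dimnames = list(labels, labels))
mat[upper.tri(mat, diag = TRUE)] <- c(0.9, 0.1, 0.8, 0.05, 0.3, 0.6)

# "self": the diagonal entry for each label.
self <- diag(mat)

# "other": mirror the upper triangle so that each row holds every pair
# involving that label, blank out the diagonal, then take the row maximum.
full <- mat
full[lower.tri(full)] <- t(mat)[lower.tri(mat)]
diag(full) <- NA
other <- apply(full, 1, max, na.rm = TRUE)

summary_df <- data.frame(self = self, other = other)
summary_df
```

Here every label has a self probability well above its other probability, which is the pattern expected when ref is well-recapitulated by alt.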

Value

If summarize=FALSE, a numeric matrix is returned with upper triangular entries filled with the coassignment probabilities for each pair of labels in ref.

Otherwise, a DataFrame is returned with one row per label in ref containing the self and other coassignment probabilities.

Author(s)

Aaron Lun

See Also

bootstrapCluster, to compute coassignment probabilities across bootstrap replicates.

Examples

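# Mock up a dataset and compute log-normalized expression values.
library(scater)
sce <- mockSCE(ncells=200)
sce <- logNormCounts(sce)

# Cluster the same cells twice with different numbers of centers.
# kmeans() uses random starts, so results vary between runs.
clust1 <- kmeans(t(logcounts(sce)), 3)$cluster
clust2 <- kmeans(t(logcounts(sce)), 5)$cluster

# Matrix of pairwise probabilities, with clust1 as the reference.
coassignProb(clust1, clust2)
# Per-label self and other probabilities.
coassignProb(clust1, clust2, summarize=TRUE)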


[Package scran version 1.16.0 Index]