1 Introduction

Experts typically use FlowJo software to gate FCS data files manually, either one file at a time or by setting a static gate that is applied to all files. The former is very tedious, especially when there are many files, and the latter ignores the characteristics of individual samples.

flowDensity is a supervised clustering algorithm, based on density estimation techniques, that is designed specifically to overcome these problems. It automates the current practice of manual 2D gating and adjusts the gates for each FCS data file individually.
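
To make the contrast between a static gate and a per-file gate concrete, here is a minimal sketch in base R. It is a generic illustration of density-based thresholding on simulated one-dimensional data, not flowDensity's own implementation; the helper valley_threshold and both simulated samples are assumptions made for this example.

```r
# Hypothetical helper (not part of flowDensity): place a threshold at the
# lowest point of a kernel density estimate, searching between the lower
# and upper quartiles of the data.
valley_threshold <- function(x) {
  d <- density(x)
  q <- quantile(x, c(0.25, 0.75))
  inside <- d$x >= q[1] & d$x <= q[2]
  d$x[inside][which.min(d$y[inside])]
}

set.seed(1)
# Simulated expression values for one channel: a negative and a positive
# population of equal size.
sample_a <- c(rnorm(500, mean = 1, sd = 0.3), rnorm(500, mean = 3, sd = 0.3))
# A second sample with the same populations shifted upward by 0.5.
sample_b <- sample_a + 0.5

t_a <- valley_threshold(sample_a)
t_b <- valley_threshold(sample_b)

# A static gate reuses t_a for every sample; a per-sample gate follows the shift.
mean(sample_b > t_a)  # fraction called positive by the static gate
mean(sample_b > t_b)  # fraction called positive by the per-sample gate
```

In this toy setting the static threshold cuts into the negative population of the shifted sample, while the per-sample threshold moves with the data.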

Automated flow cytometry methods developed to date have focused on fully automated analysis, which is especially suited to discovery, but they seldom match manual results where this is desirable (e.g., for diagnosis). In contrast, flowDensity aims to gate predefined cell populations of interest where the gating strategy, i.e., the sequence of gates, is known. This lets it take advantage of expert knowledge, and as a result it often matches manual results very well. In addition, since flowDensity uses only two dimensions at a time, it is very fast and requires much less computational power.

2 How to use flowDensity?

To use flowDensity, a gating strategy is required. A gating strategy here means the sequence of 2D gates that must be applied, one at a time, to an FCS file to eventually extract the cell subset of interest.

A 2D gate consists of two channels (dimensions) or equivalently a phenotype with two markers. In addition, the corresponding expression level for each channel is given. For example, phenotype CD19+CD20- has markers CD19 and CD20 with expression values positive and negative, respectively.

To use flowDensity, this 2D gate is input to the function flowDensity(.). The channels in the gate are used for the channels argument and the expression values are used for the position argument of the function.

Assume, for example, that CD19 is on channel PerCP-Cy5-5-A and CD20 is on channel APC-H7-A. The corresponding input arguments are then:

channels=c("PerCP-Cy5-5-A", "APC-H7-A") and position=c(TRUE,FALSE).

In general, the channels argument can be set using either the names of the channels or their corresponding indices (column numbers in the FCS file), and the position argument can be one of the four logical pairs (TRUE,FALSE), (FALSE,TRUE), (FALSE,FALSE) and (TRUE,TRUE). If the user needs to set the threshold for only one of the channels, then position for the other channel must be set to NA.
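
The mapping from a phenotype to these two arguments can be sketched in base R as follows. The helper phenotype_to_args is hypothetical and not part of flowDensity; the marker-to-channel assignment is the one assumed in the example above.

```r
# Hypothetical helper (not part of flowDensity): turn a two-marker phenotype
# such as "CD19+CD20-" into the channels/position pair described above.
phenotype_to_args <- function(phenotype, marker_to_channel) {
  # Each token is a marker name followed by its expression sign.
  tokens <- regmatches(phenotype, gregexpr("[A-Za-z0-9]+[+-]", phenotype))[[1]]
  stopifnot(length(tokens) == 2)
  markers <- sub("[+-]$", "", tokens)
  signs <- substring(tokens, nchar(tokens))
  stopifnot(all(markers %in% names(marker_to_channel)))
  list(channels = unname(marker_to_channel[markers]),
       position = signs == "+")
}

# Marker-to-channel assignment assumed in the text.
marker_to_channel <- c(CD19 = "PerCP-Cy5-5-A", CD20 = "APC-H7-A")
args <- phenotype_to_args("CD19+CD20-", marker_to_channel)
args$channels
args$position
```

The resulting args$channels and args$position can then be supplied as the channels and position arguments of flowDensity(.).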

In addition to the above arguments, one of the cell.population, gatingHierarchy or flow.frame arguments is required. The cell.population argument is an object of class CellPopulation from the flowDensity namespace, and the flow.frame argument is a flowFrame object from the flowCore namespace. It is also possible to provide a polygon filter. In this case position can be set to anything, and the filter should be a data.frame or matrix whose columns match channels in the FCS file.
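
A polygon filter of the kind described above might be built as in the following sketch. The vertex coordinates are made up for illustration, and the argument name filter in the commented call is an assumption that should be checked against the flowDensity documentation.

```r
# Vertices of a hypothetical polygon gate on the FSC-A / FSC-H plane.
# The coordinates are invented for illustration only.
poly_filter <- cbind(
  "FSC-A" = c(20000, 250000, 250000, 20000),
  "FSC-H" = c(15000, 200000, 240000, 40000)
)
channels <- c("FSC-A", "FSC-H")

# The filter's columns should name the same channels that are being gated.
stopifnot(is.matrix(poly_filter),
          identical(colnames(poly_filter), channels))

# Assumed call shape (argument name `filter` is an assumption):
# flowDensity(f, channels = channels, position = c(NA, NA), filter = poly_filter)
```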

3 Examples

In this section we present several examples to illustrate how to use the flowDensity(.) function.

3.1 Extracting B cells

This example shows how to use flowDensity to extract B cells by using the gating strategy Singlet/viability-CD3-/CD19+CD20+ or equivalently singlets/Bcell.

library(flowCore)
library(flowDensity)
## Warning: replacing previous import 'flowCore::plot' by 'graphics::plot' when
## loading 'flowDensity'
data_dir <- system.file("extdata", package = "flowDensity")
load(list.files(data_dir, pattern = "sampleFCS_1", full.names = TRUE))
f
## flowFrame object ''
## with 23000 cells and 13 observables:
##          name   desc     range    minRange  maxRange
## $P1     FSC-A     NA    262144 -111.000000  262143.0
## $P2     FSC-H     NA    262144    0.000000  262143.0
## $P3     SSC-A     NA    262144 -111.000000  262143.0
## $P4     SSC-H     NA    262144    0.000000  262143.0
## $P5     APC-A   CD38    262144   -0.044968       4.5
## ...       ...    ...       ...         ...       ...
## $P9    V450-A    CD3    262144   0.1590741       4.5
## $P10   V500-A    IgD    262144   0.2276837       4.5
## $P11     PE-A   CD24    262144  -0.0510186       4.5
## $P12 PE-Cy7-A   CD27    262144  -0.2319877       4.5
## $P13     Time     NA    262144   0.0000000  262143.0
## 211 keywords are stored in the 'description' slot
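# Gate singlets on FSC-A vs. FSC-H using an elliptical gate.
sngl <- flowDensity(f, channels = c("FSC-A", "FSC-H"), position = c(FALSE, FALSE),
                    percentile = c(.99999, .99999), use.percentile = c(TRUE, TRUE),
                    ellip.gate = TRUE, scale = .99)
# Plot the two channels and overlay the gate boundary.
plotDens(f, c(1, 2))
lines(sngl@filter, type = "l")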