1 Preliminaries

CONFESS is a customized cell detection and signal estimation model for images coming from the Fluidigm C1 system. Applied to the HeLa CONFESSdata dataset, our method estimated the cell cycle phase of hundreds of samples from their fluorescence signals and enabled us to study the spatio-temporal dynamics of the HeLa cell cycle.

1.1 Loading the packages

To load the CONFESS package:

library(CONFESS)

1.2 Data pre-processing

A sample set of 14 raw C01 images is available in the CONFESSdata package on Bioconductor. Alternatively, the complete set of 378 images can be downloaded here. The download includes images for each of the following sets:

  • Bright Field (BF) images: RawSC_BF.zip
  • Red & Green Channel (Ch) images: RawSC_red_green.zip

CONFESS can take as input raw BMP or JPEG image files, or text-converted files. C01 files can be converted with an external program such as ImageJ (Fiji).

To do the conversion in ImageJ (Fiji), go to Process –> Batch –> Convert and set the following options: “Output format”=Text Image, “Interpolation”=Bilinear and “scale factor”=1. Select the option “Read Images using Bio-Formats” and click “Convert”. All images should then be converted to txt files.
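After the conversion it can be useful to check that the text files are readable from R. The sketch below is only an illustration: it assumes that a Text Image file is a plain tab-delimited matrix of pixel values, and it writes a tiny fake image (hypothetical file name and values) to a temporary folder instead of using real C1 data.

```r
# Demo only: a throwaway folder with one tiny fake "text image"
# (hypothetical file name and pixel values, not real C1 data).
demo_dir <- file.path(tempdir(), "converted_demo")
dir.create(demo_dir, showWarnings = FALSE)
fake_pixels <- matrix(c(10, 12, 11, 250, 245, 13), nrow = 2, byrow = TRUE)
write.table(fake_pixels, file.path(demo_dir, "demo_A01_BF.txt"),
            sep = "\t", row.names = FALSE, col.names = FALSE)

# List the converted files and read the first one back as a matrix.
txt_files <- list.files(demo_dir, pattern = "\\.txt$", full.names = TRUE)
pixels <- as.matrix(read.table(txt_files[1], sep = "\t"))
dimnames(pixels) <- NULL

dim(pixels)    # rows and columns of the image
range(pixels)  # smallest and largest pixel value
```

For real data, demo_dir would be replaced by the folder that Fiji wrote the txt files to.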

2 Fluorescence estimation

2.1 Reading in image/text files

The function readFiles reads the image/text filenames. If image data is used, all images should be in a single directory referenced by iDirectory; BFdirectory and CHdirectory then reference the output directories for the Bright Field and Channel images. If the image files have already been converted to text, iDirectory can be left empty, and BFdirectory and CHdirectory point to the folders containing the input text files. The Bright Field and Channel data should be stored in two different folders. The function also reports (and discards) any inconsistencies in the files being read (e.g. a BF file that is present while the Red/Green channel is missing).

The separator argument is the character that separates the image type (the BF and channel types defined in image.type) from the rest of the sample name (consisting of the run ID and the well ID). A typical example of a sample name that is separated by "_" from the image type is the .C01 Bright Field image "1772-062-248_A01_BF.C01". The string "1772-062-248_A01" is the joined Run and Well ID (also separated by "_"). In this function, however, separator refers to the one separating the strings "1772-062-248_A01" and "BF.C01".
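The role of separator can be illustrated with a few lines of base R. This is a minimal sketch and not the code used inside readFiles; split_sample_name is a hypothetical helper that splits a file name at its last separator.

```r
# Hypothetical helper (not part of CONFESS): split a file name at the
# LAST occurrence of the separator.
split_sample_name <- function(file_name, separator = "_") {
  hits <- gregexpr(separator, file_name, fixed = TRUE)[[1]]
  last <- hits[length(hits)]
  if (last == -1) stop("separator not found in file name")
  list(
    sample_id  = substr(file_name, 1, last - 1),
    image_part = substr(file_name, last + nchar(separator), nchar(file_name))
  )
}

parts <- split_sample_name("1772-062-248_A01_BF.C01")
parts$sample_id   # "1772-062-248_A01" (joined Run and Well ID)
parts$image_part  # "BF.C01" (image type plus file extension)

# The Run and Well IDs are themselves joined by the same character.
ids <- strsplit(parts$sample_id, "_", fixed = TRUE)[[1]]
ids               # "1772-062-248" "A01"
```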

In this example, we read text-converted files available in the CONFESSdata package.

library(CONFESSdata)
data_path<-system.file("extdata",package="CONFESSdata")
files<-readFiles(iDirectory=NULL,
                  BFdirectory=paste(data_path,"/BF",sep=""),
                  CHdirectory=paste(data_path,"/CH",sep=""),
                  separator = "_",image.type = c("BF","Green","Red"),
                  bits=2^16)

2.2 Image spot estimation

To estimate the spots we need to specify a set of parameters. correctionAlgorithm should be FALSE at this estimation stage. If the parameter subset is not defined, all files read in with the readFiles function will be analysed. foregroundCut defines a series of cut-offs that separate the spot (a potential cell) from the background. The cut-offs are picked empirically. For this reason, it is often helpful to train on the dataset by picking a subset of well-defined, single-spot images and checking the algorithm’s performance under different values (e.g. seq(0.6,0.76,0.02) vs seq(0.8,0.96,0.02)). In noisy data we have found that low cut-offs produce the best results.

If the spot’s fluorescence signal is too weak to be detected, or the cell is simply not present, CONFESS performs capture site recognition (BF modeling) to estimate the pixel coordinates of the spot, which in this case is assumed to have a rectangular shape. The side length of this pseudo-spot is defined by BFarea, and the signal is quantified within this area. Only in BF modeling is the spot assumed to be rectangular; otherwise we make no assumptions about the shape or size of the spot.

estimates <- spotEstimator(files=files,foregroundCut=seq(0.6,0.76,0.02),
                        BFarea=7,correctionAlgorithm=FALSE,savePlot="screen")
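To get a feeling for how the choice of cut-offs changes what counts as foreground, the two grids mentioned above can be compared on a simulated image. This is a toy sketch under a strong assumption: each cut-off is treated here as a quantile threshold on the pixel intensities. The way spotEstimator actually applies foregroundCut is not reproduced, and the image, the helper count_foreground and all values are made up for illustration.

```r
# The two cut-off grids discussed in the text.
low_cuts  <- seq(0.6, 0.76, 0.02)
high_cuts <- seq(0.8, 0.96, 0.02)

# Simulated 50 x 50 image: noisy background plus one bright 6 x 6 spot.
set.seed(1)
img <- matrix(rnorm(50 * 50, mean = 100, sd = 5), nrow = 50)
img[20:25, 30:35] <- img[20:25, 30:35] + 60

# ASSUMPTION for illustration only: a cut-off q keeps the pixels that are
# brighter than the q-th quantile of all pixel intensities.
count_foreground <- function(image, cuts) {
  sapply(cuts, function(q) sum(image > quantile(image, q)))
}

fg_low  <- count_foreground(img, low_cuts)
fg_high <- count_foreground(img, high_cuts)
rbind(low = fg_low, high = fg_high)  # foreground pixel counts per cut-off
```

Under this assumption, lower cut-offs keep more pixels as candidate foreground, which illustrates why the grid has to be tuned to the noise level of the images.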

2.3 Quality control (identification of outliers)

The next step uses visual and statistical inspection tools to identify possible outliers. The spot estimates (of the samples stored in estimates) are passed to the function defineLocClusters through its LocData argument.

The way of processing these data is controlled by out.method whose possible values are:

  • interactive.clustering : estimates concentric circles that mark the outliers (e.g. all dots outside the circle with a particular radius are outliers)
  • interactive.manual : enables the user to select the outliers manually by point-and-click on the plot.
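The idea behind a radius-based outlier rule can be sketched in base R as follows. This is a simplified illustration and not the defineLocClusters implementation: the simulated locations, the use of the coordinate-wise median as the centre and the radius value are all assumptions.

```r
# Simulated spot locations: 20 spots near (250, 250) and one distant spot.
set.seed(42)
locs <- data.frame(
  X = c(rnorm(20, mean = 250, sd = 3), 320),
  Y = c(rnorm(20, mean = 250, sd = 3), 180)
)

# ASSUMPTION: the circle is centred on the coordinate-wise median.
centre <- c(median(locs$X), median(locs$Y))
dist_to_centre <- sqrt((locs$X - centre[1])^2 + (locs$Y - centre[2])^2)

radius <- 25                                  # hypothetical radius in pixels
outlier_idx <- which(dist_to_centre > radius) # spots outside the circle
outlier_idx                                   # index of the distant spot
```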

The function mainly produces run- and well-specific plots that enable the user to pick outlier locations. We have noticed that the wells exhibit a specific directionality (half of them face right and the rest face left) that affects the position of the capture site. The output integrates the first-step estimates and the quality control. The code below shows the quality control process under interactive.manual. interactive.clustering requires at least 15 samples in each run/well category, so it would exit with an error on this sample dataset. When applied to the full data, it lets the user select outliers by entering a pre-calculated radius at an auto-generated prompt similar to the one below.

clu <- defineLocClusters(LocData=estimates,out.method="interactive.manual")
#"Hit Enter to move to the next image or A + Enter to Abort:"

Option interactive.manual enables the user to select suspicious data manually by point-and-click on the plot. Each click selects the spot closest to the clicked location, and each spot is selected only once (so, for closely located spots, one can click around them as many times as needed to make sure that everything is selected). In Windows the user clicks on any number of spots that could be potential outliers and completes the procedure by clicking the ‘Stop’ button at the top left of the console. In Linux/Mac the ‘Stop’ button is replaced by right-clicking anywhere on the image. In RStudio the process stops with the ‘Esc’ key.
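The selection rule described above (nearest spot to each click, each spot recorded once) can be mimicked without any graphics device. The sketch below is an illustration only: the spot coordinates, the click coordinates and the helper nearest_spot are hypothetical, and the interactive code in CONFESS may work differently.

```r
# Hypothetical spot locations; the first two lie close together.
spots <- data.frame(X = c(100, 104, 300), Y = c(100, 102, 300))

# Hypothetical helper: index of the spot closest to a click (x, y).
nearest_spot <- function(spots, click) {
  which.min((spots$X - click[1])^2 + (spots$Y - click[2])^2)
}

# Simulated clicks: two near spot 1, one near spot 2, one near spot 3.
clicks <- list(c(101, 101), c(101, 101), c(103, 102), c(290, 305))

# Resolve each click to its nearest spot and record every spot only once.
hits <- vapply(clicks, function(cl) nearest_spot(spots, cl), integer(1))
selected <- unique(hits)
selected  # 1 2 3
```

Repeated clicks around the same spot therefore do not duplicate the selection, which is why clicking several times near closely located spots is harmless.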

Important Note: Please follow the pipeline’s instructions on how to stop this process at any time you wish. Closing the plot window may cause a fatal error and abnormal exit. To stop the process simply press A and Enter (Abort) when prompted!

2.4 Re-estimation step (for outliers)

The selected samples now undergo BF modeling using spotEstimator with correctionAlgorithm = TRUE. Potential outliers that had originally been estimated by BF modeling are not re-estimated; they are only kept in a separate slot for manual inspection with our graphical tools. QCdata contains the output of the quality control step. median.correction=TRUE instructs the function to shift all locations with outlier BF modeling estimates (more than cutoff=50 pixels away from the bulk of the estimates) to the median of the bulk estimates (denoted as confidence in the output table).

estimates.2 <- spotEstimator(files=files,subset=clu$Outlier.indices,
                             foregroundCut=seq(0.6,0.76,0.02),correctionAlgorithm=TRUE,
                             cutoff=50,QCdata=clu,median.correction=TRUE,
                             savePlot="screen")
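The effect of median.correction can be illustrated with a toy example. This is a sketch under assumptions, not the spotEstimator code: the locations are simulated, the distance is taken as Euclidean distance from the coordinate-wise median, and any grouping CONFESS applies when defining the bulk of the estimates is ignored.

```r
# Simulated BF modeling estimates: 10 locations near (260, 240), 1 far away.
set.seed(7)
bf <- data.frame(
  X = c(rnorm(10, mean = 260, sd = 2), 400),
  Y = c(rnorm(10, mean = 240, sd = 2), 90)
)

cutoff <- 50  # pixels, as in the call above

# ASSUMPTION: the bulk is summarised by the coordinate-wise median.
med <- c(median(bf$X), median(bf$Y))
far <- sqrt((bf$X - med[1])^2 + (bf$Y - med[2])^2) > cutoff

# Shift the outlying locations to the median of the estimates.
corrected <- bf
corrected$X[far] <- med[1]
corrected$Y[far] <- med[2]

which(far)       # index of the corrected location
corrected[far, ] # its new coordinates
```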