When a large number of samples are being analyzed, it is desirable to have random access to the methylation of specific CpGs without loading all the data. SeSAMe provides such an interface through the `fileSet` object, which is in essence an indexed, file-based numeric matrix.
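To make the idea of an indexed, file-based numeric matrix concrete, here is a minimal base R sketch. It is not SeSAMe's implementation: the file layout (column-major 8-byte doubles), the toy probe and sample names, and the helper `read_beta` are all assumptions made for illustration.

```r
# Hypothetical toy data: 4 probes x 3 samples of beta values.
probes  <- c("cg01", "cg02", "cg03", "cg04")
samples <- c("s1", "s2", "s3")
set.seed(1)
betas <- matrix(runif(12), nrow = 4, dimnames = list(probes, samples))

# Store the values column by column as 8-byte doubles, and keep the
# probe/sample names in a separate index file.
data_path  <- tempfile("toy_betas")
index_path <- paste0(data_path, "_idx.rds")
con <- file(data_path, "wb")
writeBin(as.vector(betas), con, size = 8)
close(con)
saveRDS(list(probes = probes, samples = samples), index_path)

# Read a single value by name: look up its position in the index,
# compute the byte offset, seek to it and read 8 bytes.
read_beta <- function(data_path, index_path, probe, sample) {
  idx <- readRDS(index_path)
  i <- match(probe, idx$probes)
  j <- match(sample, idx$samples)
  stopifnot(!is.na(i), !is.na(j))
  offset <- ((j - 1) * length(idx$probes) + (i - 1)) * 8
  con <- file(data_path, "rb")
  on.exit(close(con))
  seek(con, where = offset, origin = "start")
  readBin(con, what = "double", n = 1, size = 8)
}

value <- read_beta(data_path, index_path, "cg03", "s2")
```

Only the index and eight bytes of data are read per query, which is why such a layout scales to many samples but costs one disk access per lookup.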

A `fileSet` is generated by the `openSesameToFile` function. The main result of this function is not its return value but the file it creates at the given path. One can later operate on the `fileSet` by referencing the path to that file.

The following `openSesameToFile` call does three things:

- generates a file called `mybetas`,
- generates an index file called `mybetas_idx.rds`,
- returns a `fileSet` object which serves as an interface to the two files.
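The chunk that produced the log below is not reproduced in this text. A call of roughly the following shape would be consistent with it. This is a sketch under two assumptions: that `openSesameToFile` takes the output path followed by an IDAT directory, and that the two HM27 example IDAT pairs live in the `extdata` directory of the `sesameData` package.

```r
library(sesame)

# Assumption: the example IDATs ship with the sesameData package.
idat_dir <- system.file("extdata", package = "sesameData")

# Creates 'mybetas' and 'mybetas_idx.rds' in the working directory
# and returns a fileSet handle to them.
fset <- openSesameToFile("mybetas", idat_dir)
```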

`## Allocating space for 2 HM27 samples at mybetas.`

`## Mapping 2 HM27 samples to mybetas.`

```
## Warning in regularize.values(x, y, ties, missing(ties)): collapsing to
## unique 'x' values
```

`## Successfully processed 2 IDATs (0 failed).`

When printed to console, the number of samples and the number of probes are shown.

`## File Set for 27578 probes and 2 samples.`

One can obtain the sample and probe information with the `$` operator.

`## [1] "4207113116_A" "4207113116_B"`

```
## [1] "cg24054653" "cg07665060" "cg22501393" "cg18895155" "cg01333131"
## [6] "cg20557202"
```

One can query specific CpGs by probe name(s) and sample name(s). Note that every query to `fset` is a disk read, so it can be slower than in-memory processing. Here we retrieve only the beta values for the two probes *cg00006414* and *cg00007981* in the sample *4207113116_B*.

```
## 4207113116_B
## cg00006414 0.1410441
## cg00007981 0.0253735
```

In the previous example, we preprocessed IDATs directly to a `fileSet`. We can also read a pre-existing `fileSet` from its file path using the `readFileSet` function.
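As a hedged illustration of reopening and querying, the sketch below assumes that the `mybetas` file from the earlier step exists in the working directory, and that the installed SeSAMe version provides `sliceFileSet(fset, samples, probes)` as the query function. The query function's name does not appear in this text, so treat it as an assumption.

```r
library(sesame)

# Reopen the file set from its path; no IDATs are reprocessed.
fset <- readFileSet("mybetas")

# Each call reads from disk, so request only what is needed.
res <- sliceFileSet(fset,
                    samples = "4207113116_A",
                    probes  = "cg00000292")
res
```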

```
## 4207113116_A
## cg00000292 0.896829
```

The size of a `fileSet` is always fixed: one cannot dynamically expand or shrink it. We can write a `fileSet` by filling the space one sample at a time. This is achieved by first allocating the space given the number of samples and the probe IDs (optional if the platform is one of HM27, HM450 or EPIC).

`## Allocating space for 2 HM450 samples at mybetas2.`
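The allocation message above would come from a call along these lines. This is a sketch, assuming `initFileSet` takes the output path, the platform and the sample names in that order. The name `sample1` is hypothetical; only `sample2` appears in this text.

```r
library(sesame)

# Reserve space for two HM450 samples. Probe IDs are omitted, so the
# platform's default probe list is used.
fset2 <- initFileSet("mybetas2", "HM450", c("sample1", "sample2"))
```

The returned `fset2` is the handle that `mapFileSet` then fills one sample at a time.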

Then one can fill in the beta values with `mapFileSet`. Here I illustrate using randomly generated beta values.

```
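# One random beta value per probe, named by probe ID
hypothetical_betas <- setNames(runif(fset2$n), fset2$probes)
# Write these values into the slot reserved for 'sample2'
mapFileSet(fset2, 'sample2', hypothetical_betas)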
```

`## File Set for 485577 probes and 2 samples.`

The mapped value should be equal to the generated beta value. Let’s spot-check.

```
## sample2
## cg00000108 TRUE
```