When performing statistical analysis on any set of genomic ranges, it is often important to compare focal sets to null sets that are carefully matched for covariates that may influence the analysis. To address this need, the `nullranges` package implements `matchRanges()`, an efficient and convenient tool for selecting a covariate-matched set of null hypothesis ranges from a pool of background ranges within the Bioconductor framework.

In this vignette, we provide an overview of `matchRanges()` and its associated functions. We start with a simulated example generated with the utility function `makeExampleMatchedDataSet()`. We also provide an overview of the class structure and a guide for choosing among the supported matching methods. To see `matchRanges()` used in real biological examples, visit the Case study I: CTCF occupancy and Case study II: CTCF orientation vignettes.

For a description of the method, see Davis et al. (2022).

`matchRanges` references four sets of data: `focal`, `pool`, `matched`, and `unmatched`. The `focal` set contains the outcome of interest (`Y=1`), while the `pool` set contains all other observations (`Y=0`). `matchRanges` generates the `matched` set, which is a subset of the `pool` that is matched for the provided covariates (i.e., `covar`) but does not contain the outcome of interest (i.e., `Y=0`). Finally, the `unmatched` set contains the remaining unselected elements from the `pool`. The diagram below depicts the relationships between the four sets.
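
The same relationships can also be sketched in a few lines of base R. This is a toy illustration only: the data frame `dat`, its columns, and the greedy nearest-neighbour loop are assumptions made for this example. It does not use `nullranges` and is not meant to reflect how `matchRanges` is implemented.

```r
# Toy data (hypothetical): Y marks the outcome of interest,
# covar is a single numeric covariate that differs between groups.
set.seed(1)
dat <- data.frame(
  id    = 1:60,
  Y     = rep(c(1, 0), times = c(10, 50)),
  covar = c(rnorm(10, mean = 2), rnorm(50, mean = 0))
)

focal <- dat[dat$Y == 1, ]  # observations with the outcome of interest
pool  <- dat[dat$Y == 0, ]  # all other observations

# Greedy nearest-neighbour matching on covar, without replacement.
# Each focal element claims the closest pool element still available.
available <- rep(TRUE, nrow(pool))
picked <- integer(nrow(focal))
for (i in seq_len(nrow(focal))) {
  d <- abs(pool$covar - focal$covar[i])
  d[!available] <- Inf          # exclude pool elements already selected
  picked[i] <- which.min(d)
  available[picked[i]] <- FALSE
}

matched   <- pool[picked, ]     # covariate-matched subset of the pool
unmatched <- pool[available, ]  # remaining, unselected pool elements
```

In this sketch, `matched` and `unmatched` partition `pool`, and `matched` has one element per `focal` element, which mirrors the set relationships described above.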