learn_model {segmenter}    R Documentation

Learn a multi-state model from chromatin data

Description

Integrate multiple ChIP-seq chromatin datasets of histone modifications, transcription factors or other DNA-binding proteins to build a multi-state model of the frequently occurring combinatorial and spatial patterns. The function takes as input binarized ChIP-seq data and the genome annotations on which the states will be discovered.

Usage

learn_model(
  inputdir,
  outputdir,
  numstates,
  coordsdir,
  anchorsdir,
  chromsizefile,
  assembly,
  cells,
  annotation,
  binsize,
  inputbamdir,
  cellmarkfiletable,
  read_only = FALSE,
  read_bins = FALSE,
  counts = FALSE
)

Arguments

inputdir

A string. The path to binarized files.

outputdir

A string. The path to a directory where output will be written.

numstates

An integer. The number of desired states in the model.

coordsdir

A string. The path to genomic coordinates files.

anchorsdir

A string. The path to the genomic anchors files.

chromsizefile

A string. The path to the chromosome sizes file.

assembly

A string. The name of the genomic assembly.

cells

A character vector. The names of the cells as they occur in the binarized files (first line).

annotation

A string. The name of the type of annotation as it occurs in the genomic annotation files.

binsize

An integer. The bin size in base pairs (bp) used to generate the binarized files.

inputbamdir

A string. The path to the input BAM files. Only used when counts = TRUE.

cellmarkfiletable

A string. The path to the input files table. Only used when bins = TRUE.

read_only

A logical. Default is FALSE. Whether to look for and load output files or generate the model from scratch.

read_bins

A logical. Default is FALSE. Whether to load the binarized data into the output object.

counts

A logical. Default is FALSE. Whether to load the read counts in the bins into the output object.
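
The cells argument must match the names found on the first line of the binarized files. A minimal base R sketch for checking such a name is given here. It assumes a ChromHMM-style layout in which the first line holds the cell name and the chromosome separated by a tab, and it writes a small hypothetical file instead of reading real data.

```r
# Hypothetical binarized file, assuming a ChromHMM-style layout:
# line 1 = cell name and chromosome (tab-separated), line 2 = mark names.
binfile <- tempfile(fileext = "_binary.txt")
writeLines(c("K562\tchr11", "H3K4me3\tH3K27me3", "0\t1", "1\t0"), binfile)

# Read only the first line and split it on the tab.
header <- strsplit(readLines(binfile, n = 1), "\t", fixed = TRUE)[[1]]
cell_name <- header[1]  # candidate value for the cells argument
chrom <- header[2]
cell_name
```

For real input, the same readLines() call could be pointed at one of the files in inputdir.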

Details

By default, this function runs the analysis commands, writes the output to files and loads it into an object of class segmentation. In addition, the binarized data and the read counts in the bins can be loaded. When read_only is TRUE, the function looks for previously generated files in the output directory and loads them without rerunning the commands.
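
The two modes can be sketched as a first call that builds the model followed by a second call that only reloads it. This is an illustration under stated assumptions, not the package's prescribed workflow: the argument values are copied from the Examples section, the names model_args, reload_args, have_pkgs and obj_reloaded are hypothetical, and it is assumed that the reload call accepts the same arguments as the original run.

```r
# Shared arguments, copied from the Examples section of this page.
model_args <- list(
  inputdir = system.file("extdata/SAMPLEDATA_HG18", package = "segmenter"),
  outputdir = tempdir(),
  coordsdir = system.file("extdata/COORDS", package = "chromhmmData"),
  anchorsdir = system.file("extdata/ANCHORFILES", package = "chromhmmData"),
  chromsizefile = system.file("extdata/CHROMSIZES", "hg18.txt",
                              package = "chromhmmData"),
  numstates = 3,
  assembly = "hg18",
  cells = c("K562", "GM12878"),
  annotation = "RefSeq",
  binsize = 200
)

# Same arguments, but request that existing output be loaded instead.
reload_args <- modifyList(model_args, list(read_only = TRUE))

# Only call the package when it and its data package are installed.
have_pkgs <- requireNamespace("segmenter", quietly = TRUE) &&
  requireNamespace("chromhmmData", quietly = TRUE)
if (have_pkgs) {
  # First call: runs the analysis and writes files to outputdir.
  obj <- do.call(segmenter::learn_model, model_args)
  # Second call: assumed to load the files written by the first call.
  obj_reloaded <- do.call(segmenter::learn_model, reload_args)
}
```

Keeping the arguments in one list makes it harder for the reload call to drift away from the settings used in the original run.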

Value

An object of class segmentation (see the documentation of that class for details) and the files written to the output directory.

See Also

LearnModel

Examples

# locate input and output files
inputdir <- system.file('extdata/SAMPLEDATA_HG18',
                        package = 'segmenter')
outputdir <- tempdir()
coordsdir <- system.file('extdata/COORDS',
                         package = 'chromhmmData')
anchorsdir <- system.file('extdata/ANCHORFILES',
                          package = 'chromhmmData')
chromsizefile <- system.file('extdata/CHROMSIZES',
                             'hg18.txt',
                             package = 'chromhmmData')

# run command
obj <- learn_model(inputdir = inputdir,
                   outputdir = outputdir,
                   coordsdir = coordsdir,
                   anchorsdir = anchorsdir,
                   chromsizefile = chromsizefile,
                   numstates = 3,
                   assembly = 'hg18',
                   cells = c('K562', 'GM12878'),
                   annotation = 'RefSeq',
                   binsize = 200)

# show the output
obj
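
A short follow-up, using only base R, can check both parts of the return value: the files on disk and the class of the object. It assumes the example above was run in the same session, so that the output went to tempdir() and obj exists. The name written is hypothetical.

```r
# Assumes the example above has been run in this session.
outputdir <- tempdir()

# List whatever was written to the output directory.
written <- list.files(outputdir)
head(written)

# Report the class of the returned object, if it is available.
if (exists("obj")) {
  print(class(obj))
}
```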


[Package segmenter version 0.99.14 Index]