collapseBins {AneuFinder} | R Documentation |
The function collapses consecutive bins that share the same value in a chosen column, for example the same combinatorial state.
collapseBins(data, column2collapseBy = NULL, columns2sumUp = NULL, columns2average = NULL, columns2getMax = NULL, columns2drop = NULL)
data |
A data.frame containing the genomic coordinates in the first three columns. |
column2collapseBy |
The number of the column which will be used to collapse all other inputs. If a set of consecutive bins has the same value in this column, they will be aggregated into one bin with adjusted genomic coordinates. If |
columns2sumUp |
Column numbers that will be summed during the aggregation process. |
columns2average |
Column numbers that will be averaged during the aggregation process. |
columns2getMax |
Column numbers where the maximum will be chosen during the aggregation process. |
columns2drop |
Column numbers that will be dropped after the aggregation process. |
The following tables illustrate the principle of the collapsing:
Input data:
seqnames | start | end | column2collapseBy | moreColumns | columns2sumUp |
chr1 | 0 | 199 | 2 | 1 10 | 1 3 |
chr1 | 200 | 399 | 2 | 2 11 | 0 3 |
chr1 | 400 | 599 | 2 | 3 12 | 1 3 |
chr1 | 600 | 799 | 1 | 4 13 | 0 3 |
chr1 | 800 | 999 | 1 | 5 14 | 1 3 |
Output data:
seqnames | start | end | column2collapseBy | moreColumns | columns2sumUp |
chr1 | 0 | 599 | 2 | 1 10 | 2 9 |
chr1 | 600 | 999 | 1 | 4 13 | 1 6 |
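The principle shown in the tables above can be sketched in base R. This is only an illustration, not the AneuFinder implementation: the helper `collapse_runs` and the column names `state`, `more1`, `more2`, `sum1` and `sum2` are hypothetical. As the output table suggests, it assumes that columns which are not aggregated keep the value from the first bin of each run.

```r
# Toy input matching the table above (column names are illustrative).
df <- data.frame(
  seqnames = rep("chr1", 5),
  start    = c(0, 200, 400, 600, 800),
  end      = c(199, 399, 599, 799, 999),
  state    = c(2, 2, 2, 1, 1),    # plays the role of column2collapseBy
  more1    = 1:5,                 # plays the role of moreColumns
  more2    = 10:14,
  sum1     = c(1, 0, 1, 0, 1),    # plays the role of columns2sumUp
  sum2     = rep(3, 5)
)

# Hypothetical helper, not part of AneuFinder:
# collapse runs of consecutive rows that share chromosome and 'by' value.
collapse_runs <- function(d, by, sum_cols) {
  # A new run starts whenever the chromosome or the 'by' value changes.
  key   <- paste(d$seqnames, d[[by]], sep = ":")
  runs  <- rle(key)
  last  <- cumsum(runs$lengths)        # index of last row in each run
  first <- last - runs$lengths + 1     # index of first row in each run

  # Start from the first row of each run, then extend 'end' to the last row.
  out <- d[first, , drop = FALSE]
  out$end <- d$end[last]

  # Sum the requested columns within each run.
  run_id <- rep(seq_along(first), runs$lengths)
  for (col in sum_cols) {
    out[[col]] <- as.numeric(tapply(d[[col]], run_id, sum))
  }
  rownames(out) <- NULL
  out
}

collapsed <- collapse_runs(df, by = "state", sum_cols = c("sum1", "sum2"))
print(collapsed)
```

Run-length encoding (`rle`) is used here because it only merges directly consecutive rows, which is the behaviour the tables describe; grouping by value alone would also merge non-adjacent bins.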
A data.frame.
Aaron Taudt
## Get an example BED file with single-cell-sequencing reads
bedfile <- system.file("extdata", "KK150311_VI_07.bam.bed.gz", package="AneuFinderData")
## Bin the BED file into bins of size 1Mb
binned <- binReads(bedfile, assembly='mm10', binsize=1e6,
                   chromosomes=c(1:19,'X','Y'))
## Collapse the bins by chromosome and get average, summed and maximum read count
df <- as.data.frame(binned[[1]])
# Remove one bin for illustration purposes
df <- df[-3,]
head(df)
collapseBins(df, column2collapseBy='seqnames', columns2sumUp=c('width','counts'),
             columns2average='counts', columns2getMax='counts',
             columns2drop=c('mcounts','pcounts'))
collapseBins(df, column2collapseBy=NULL, columns2sumUp=c('width','counts'),
             columns2average='counts', columns2getMax='counts',
             columns2drop=c('mcounts','pcounts'))