gene_filter {awst}    R Documentation

Gene filtering based on heterogeneity

Description

This function filters out genes that show low heterogeneity, as measured by Shannon's entropy.

Usage

## S4 method for signature 'matrix'
gene_filter(
  x,
  from = min(x, na.rm = TRUE),
  to = max(x, na.rm = TRUE),
  nBins = 20,
  heterogeneity_threshold = 0.1
)

## S4 method for signature 'SummarizedExperiment'
gene_filter(
  x,
  from = min(assay(x, awst_values), na.rm = TRUE),
  to = max(assay(x, awst_values), na.rm = TRUE),
  nBins = 20,
  heterogeneity_threshold = 0.1,
  awst_values = "awst"
)

Arguments

x

a matrix of transformed gene expression counts (typically the result of awst), or a SummarizedExperiment that stores such values in one of its assays.

from

the minimum value from which to start binning data.

to

the maximum value for the binning of the data.

nBins

the number of bins.

heterogeneity_threshold

the entropy threshold used for the filtering; genes whose entropy is lower than this value are removed.

awst_values

integer scalar or string indicating the assay that contains the awst-transformed values to use as input.

Details

Shannon's entropy is computed on the binned (categorized) data after the AWST transformation. Genes whose entropy is lower than the predefined threshold are deemed to carry too little information to be useful for classifying the samples and are therefore removed.
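
The filtering idea can be illustrated with a minimal base R sketch. This is not the package's implementation: the helper names entropy_sketch and filter_sketch are hypothetical, the sketch assumes genes in rows and samples in columns, and dividing the entropy by log(nBins) so that it lies between 0 and 1 is an assumption made here for illustration.

```r
# Hypothetical helper, not part of awst: Shannon entropy of one gene's binned values.
entropy_sketch <- function(values, breaks) {
  n_bins <- length(breaks) - 1
  # Bin the values and convert the bin counts to proportions.
  counts <- table(cut(values, breaks = breaks, include.lowest = TRUE))
  p <- as.numeric(counts) / sum(counts)
  p <- p[p > 0]  # empty bins contribute nothing to the entropy
  # Divide by log(n_bins) so the result lies in [0, 1] (assumption of this sketch).
  -sum(p * log(p)) / log(n_bins)
}

# Hypothetical filter: assumes genes in rows and samples in columns.
filter_sketch <- function(x, from = min(x, na.rm = TRUE), to = max(x, na.rm = TRUE),
                          nBins = 20, heterogeneity_threshold = 0.1) {
  # Equally spaced bin edges between 'from' and 'to'.
  breaks <- seq(from, to, length.out = nBins + 1)
  h <- apply(x, 1, entropy_sketch, breaks = breaks)
  # Keep only the genes whose entropy exceeds the threshold.
  x[h > heterogeneity_threshold, , drop = FALSE]
}

# Toy data: the first gene is constant across samples, the second varies.
toy <- rbind(flat = rep(1, 6), varied = c(-2, -1, 0, 1, 2, 3))
rownames(filter_sketch(toy))  # "varied"
```

In this toy input the constant gene falls entirely into one bin, so its entropy is zero and it is dropped, while the gene spread over six bins is kept. If the data are laid out with genes in columns, the input would need to be transposed first.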

Value

If 'x' is a matrix, it returns a filtered matrix. If 'x' is a 'SummarizedExperiment', it returns a filtered 'SummarizedExperiment'.

References

Risso and Pagnotta (2019). Within-sample standardization and asymmetric winsorization lead to accurate classification of RNA-seq expression profiles. Manuscript in preparation.

Examples

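library(awst)
set.seed(222)
# Simulate a 15 x 5 matrix of Poisson counts.
x <- matrix(rpois(75, lambda=5), ncol=5, nrow=15)
# Apply the AWST transformation, then drop the low-entropy genes.
a <- awst(x)
gene_filter(a)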


[Package awst version 1.1.1 Index]