dStruct {dStruct}    R Documentation

Performs de novo discovery of differentially reactive regions.

Description

This function takes reactivity profiles for samples of two groups as input and identifies differentially reactive regions in three steps (see Choudhary et al., Genome Biology, 2019 for details). First, it regroups the samples into homogeneous and heterogeneous sub-groups, which are used to compute the within-group and between-group nucleotide-wise d scores, respectively. Second, smoothed between- and within-group d score profiles are compared to construct candidate differential regions. Finally, unsmoothed between- and within-group d scores in each candidate region are compared using the Wilcoxon signed-rank test. The resulting p-values quantify the significance of the difference in reactivity patterns between the two input groups.
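The final step can be illustrated with base R. This is a minimal sketch, not the package's implementation: the vectors d_within and d_between are made-up per-nucleotide d scores for a single candidate region, and the one-sided alternative is an assumption about the direction of the test.

```r
# Hypothetical per-nucleotide d scores for one candidate region (made-up values).
d_within  <- c(0.10, 0.12, 0.08, 0.15, 0.11, 0.09, 0.14, 0.13, 0.07, 0.16, 0.05)
d_between <- c(0.31, 0.25, 0.20, 0.38, 0.29, 0.24, 0.33, 0.27, 0.18, 0.42, 0.21)

# Paired Wilcoxon signed-rank test: is between-group variation larger than
# within-group variation at the same nucleotides? (One-sided; an assumption.)
res <- wilcox.test(d_between, d_within, paired = TRUE, alternative = "greater")

# Median of the nucleotide-wise differences, analogous to the reported
# median difference of between-group and within-group d scores.
med_diff <- median(d_between - d_within)

res$p.value
med_diff
```

A small p-value together with a positive median difference would indicate that the two groups differ more from each other than replicates of the same group differ among themselves.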

Usage

dStruct(
  rdf,
  reps_A,
  reps_B,
  batches = FALSE,
  min_length = 11,
  check_signal_strength = TRUE,
  check_nucs = TRUE,
  check_quality = TRUE,
  quality = "auto",
  evidence = 0,
  signal_strength = 0.1,
  within_combs = NULL,
  between_combs = NULL,
  ind_regions = TRUE,
  gap = 1,
  get_FDR = TRUE,
  proximity_assisted = FALSE,
  proximity = 10,
  proximity_defined_length = 30
)

Arguments

rdf

Data frame of reactivities for each sample.

reps_A

Number of replicates of group A.

reps_B

Number of replicates of group B.

batches

Logical indicating whether replicates of groups A and B were performed in batches and are labelled accordingly. If TRUE, a heterogeneous/homogeneous subset may not have multiple samples from the same batch.

min_length

Minimum length of constructed regions.

check_signal_strength

Logical, if TRUE, construction of regions must be based on nucleotides that have a minimum absolute value of reactivity.

check_nucs

Logical, if TRUE, constructed regions must have a minimum number of nucleotides participating in the Wilcoxon signed-rank test.

check_quality

Logical, if TRUE, check constructed regions for quality.

quality

Worst allowed quality for a region to be tested.

evidence

Minimum evidence of increase in variation from within-group comparisons to between-group comparisons for a region to be tested.

signal_strength

Threshold for minimum signal strength.

within_combs

Data frame with each column containing groupings of replicates of groups A or B, which will be used to assess within-group variation.

between_combs

Data frame with each column containing groupings of replicates of groups A and B, which will be used to assess between-group variation.

ind_regions

Logical, if TRUE, test each region found in the transcript separately.

gap

Integer. Join regions if they are separated by this many nucleotides.

get_FDR

Logical, if FALSE, FDR is not reported.

proximity_assisted

Logical, if TRUE, proximally located regions are tested together.

proximity

Maximum distance between constructed regions for them to be considered proximal.

proximity_defined_length

If performing a "proximity-assisted" test, minimum end-to-end length of a region to be tested.
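Based on the sample labels used in the Examples section ("A1", "B1", and so on), the inputs might be laid out as sketched below. The reactivity values are random placeholders and n_nuc is a hypothetical transcript length. The layout of rdf (one row per nucleotide, one column per sample) is inferred from the argument descriptions rather than stated explicitly on this page.

```r
set.seed(1)
n_nuc <- 50  # hypothetical transcript length

# One column per sample; labels follow the pattern seen in the Examples section.
rdf <- data.frame(
  A1 = runif(n_nuc), A2 = runif(n_nuc), A3 = runif(n_nuc),
  B1 = runif(n_nuc), B2 = runif(n_nuc)
)

# Each column of a *_combs data frame lists the samples forming one subset.
within_combs  <- data.frame(c("A1", "A2", "A3"))
between_combs <- data.frame(c("A3", "B1", "B2"))

# Sanity check: every sample named in a grouping is a column of rdf.
all(unlist(within_combs)  %in% names(rdf))
all(unlist(between_combs) %in% names(rdf))
```

Checking the groupings against names(rdf) before calling the function is a cheap way to catch mislabelled samples.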

Value

The constructed regions, together with the p-value and the median difference of between-group and within-group d scores for each region, and the corresponding FDR (unless get_FDR is FALSE).
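To illustrate how region-level p-values relate to an FDR, base R's p.adjust can apply the Benjamini-Hochberg procedure. Whether dStruct uses this exact procedure is not stated on this page, so the method is an assumption, and the p-values below are made up.

```r
# Hypothetical p-values for five tested regions (made-up values).
p_values <- c(0.001, 0.012, 0.030, 0.200, 0.650)

# Benjamini-Hochberg adjustment (assumed method; see the note above).
fdr <- p.adjust(p_values, method = "BH")

data.frame(region = seq_along(p_values), p_value = p_values, FDR = fdr)
```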

Author(s)

Krishna Choudhary

References

Choudhary, K., Lai, Y. H., Tran, E. J., & Aviran, S. (2019). dStruct: identifying differentially reactive regions from RNA structurome profiling data. Genome Biology, 20(1), 1-26.

Examples

#Load data from Lai et al., 2019
data(lai2019)

#Run dStruct in de novo discovery mode for a transcript with id YAL042W.
dStruct(rdf = lai2019[["YAL042W"]], reps_A = 3, reps_B = 2,
    batches = TRUE, min_length = 21,
    between_combs = data.frame(c("A3", "B1", "B2")),
    within_combs = data.frame(c("A1", "A2", "A3")),
    ind_regions = TRUE)

[Package dStruct version 0.99.3 Index]