dStruct {dStruct} | R Documentation |
This function takes reactivity profiles for samples of two groups as input and identifies differentially reactive regions in three steps (see Choudhary et al., Genome Biology, 2019 for details). First, it regroups the samples into homogeneous and heterogeneous sub-groups, which are used to compute the within-group and between-group nucleotide-wise d scores. Second, smoothed between- and within-group d score profiles are compared to construct candidate differential regions. Finally, unsmoothed between- and within-group d scores are compared using the Wilcoxon signed-rank test. The resulting p-values quantify the significance of the difference in reactivity patterns between the two input groups.
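The final step can be illustrated with base R alone. The sketch below is not dStruct's internal code: the two d score vectors are made-up values for a hypothetical 11-nucleotide region, and the one-sided alternative (between-group scores exceed within-group scores) is an assumption about how the comparison is framed.

```r
# Hypothetical unsmoothed d scores for one candidate region (made-up values,
# one entry per nucleotide; not taken from any real data set).
d_within  <- c(0.10, 0.12, 0.08, 0.15, 0.11, 0.09, 0.13, 0.10, 0.14, 0.07, 0.12)
d_between <- c(0.30, 0.25, 0.18, 0.36, 0.22, 0.28, 0.31, 0.19, 0.31, 0.21, 0.27)

# Paired Wilcoxon signed-rank test on the nucleotide-wise scores.
# Assumption: the test is one-sided, asking whether between-group d scores
# tend to be larger than within-group d scores.
res <- wilcox.test(d_between, d_within, paired = TRUE, alternative = "greater")

p_value     <- res$p.value
median_diff <- median(d_between - d_within)
```

A small `p_value` together with a positive `median_diff` would indicate that variation between the groups exceeds variation within them for this region.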
dStruct(
  rdf,
  reps_A,
  reps_B,
  batches = FALSE,
  min_length = 11,
  check_signal_strength = TRUE,
  check_nucs = TRUE,
  check_quality = TRUE,
  quality = "auto",
  evidence = 0,
  signal_strength = 0.1,
  within_combs = NULL,
  between_combs = NULL,
  ind_regions = TRUE,
  gap = 1,
  get_FDR = TRUE,
  proximity_assisted = FALSE,
  proximity = 10,
  proximity_defined_length = 30
)
rdf |
Data frame of reactivities for each sample. |
reps_A |
Number of replicates of group A. |
reps_B |
Number of replicates of group B. |
batches |
Logical indicating whether replicates of groups A and B were performed in batches and are labelled accordingly. If TRUE, a heterogeneous/homogeneous subset must not contain multiple samples from the same batch. |
min_length |
Minimum length of constructed regions. |
check_signal_strength |
Logical, if TRUE, construction of regions must be based on nucleotides that have a minimum absolute value of reactivity. |
check_nucs |
Logical, if TRUE, constructed regions must have a minimum number of nucleotides participating in the Wilcoxon signed-rank test. |
check_quality |
Logical, if TRUE, check constructed regions for quality. |
quality |
Worst allowed quality for a region to be tested. |
evidence |
Minimum evidence of increase in variation from within-group comparisons to between-group comparisons for a region to be tested. |
signal_strength |
Threshold for minimum signal strength. |
within_combs |
Data frame with each column containing groupings of replicates of groups A or B, which will be used to assess within-group variation. |
between_combs |
Data frame with each column containing groupings of replicates of groups A and B, which will be used to assess between-group variation. |
ind_regions |
Logical, if TRUE, test each region found in the transcript separately. |
gap |
Integer. Join regions if they are separated by this many nucleotides. |
get_FDR |
Logical, if FALSE, FDR is not reported. |
proximity_assisted |
Logical, if TRUE, proximally located regions are tested together. |
proximity |
Maximum distance between constructed regions for them to be considered proximal. |
proximity_defined_length |
If performing a "proximity-assisted" test, minimum end-to-end length of a region to be tested. |
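The within_combs and between_combs arguments can be built with ordinary data frame code. The sketch below assumes the "A1", "A2", ..., "B1", ... sample labels used in the example call on this page; the column names `comb1`, `comb2`, ... are hypothetical, and supplying one heterogeneous grouping per A replicate is only one possible choice, not a recommendation from the package.

```r
reps_A <- 3
reps_B <- 2

# Sample labels following the "A1", "B1", ... pattern seen in the example call.
labels_A <- paste0("A", seq_len(reps_A))
labels_B <- paste0("B", seq_len(reps_B))

# One homogeneous grouping: all replicates of group A in a single column.
within_combs <- data.frame(comb1 = labels_A, stringsAsFactors = FALSE)

# Heterogeneous groupings: each column pairs one A replicate with all
# B replicates (assumed layout: one grouping per column).
between_mat <- vapply(
  labels_A,
  function(a) c(a, labels_B),
  character(length(labels_B) + 1)
)
between_combs <- as.data.frame(between_mat, stringsAsFactors = FALSE)
names(between_combs) <- paste0("comb", seq_len(ncol(between_combs)))
```

With these settings, the third column of `between_combs` holds the same grouping ("A3", "B1", "B2") that the example call passes explicitly.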
Constructs regions and reports, for each region, the p-value, the median difference between between-group and within-group d scores, and the FDR.
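For readers who set get_FDR = FALSE and want to adjust p-values themselves, base R's `p.adjust` is one option. This page does not state which FDR procedure dStruct applies, so the Benjamini-Hochberg method below is an assumption, and the p-values and region names are made up for illustration.

```r
# Hypothetical per-region p-values (made-up numbers, hypothetical names).
p_values <- c(region1 = 0.0005, region2 = 0.012, region3 = 0.04,
              region4 = 0.2, region5 = 0.6)

# Assumption: Benjamini-Hochberg adjustment; dStruct's own choice of
# procedure is not documented on this page.
fdr <- p.adjust(p_values, method = "BH")

# Regions passing a 5% FDR threshold (the threshold is also illustrative).
significant <- names(fdr)[fdr < 0.05]
```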
Krishna Choudhary
Choudhary, K., Lai, Y. H., Tran, E. J., & Aviran, S. (2019). dStruct: identifying differentially reactive regions from RNA structurome profiling data. Genome Biology, 20(1), 1-26.
# Load data from Lai et al., 2019
data(lai2019)

# Run dStruct in de novo discovery mode for a transcript with id YAL042W.
dStruct(rdf = lai2019[["YAL042W"]], reps_A = 3, reps_B = 2,
        batches = TRUE, min_length = 21,
        between_combs = data.frame(c("A3", "B1", "B2")),
        within_combs = data.frame(c("A1", "A2", "A3")),
        ind_regions = TRUE)