.corr_distribution {proBatch} | R Documentation |
Calculates the correlation of the data matrix and the correlation distribution for all pairs of replicated samples
.corr_distribution(data_matrix, repeated_samples, sample_annotation, biospecimen_id_col, sample_id_col, batch_col)
data_matrix |
features (in rows) vs samples (in columns) matrix, with feature IDs in rownames and file/sample names as colnames. Usually the log-transformed version of the original data |
repeated_samples |
if |
sample_annotation |
data matrix with 1) |
biospecimen_id_col |
column in |
sample_id_col |
name of the column in sample_annotation file, where the filenames (colnames of the data matrix) are found |
batch_col |
column in |
A data frame with the following columns, which are suggested for use as plot_param when plotting with plot_sample_corr_distribution:
replicate
batch_the_same
batch_replicate
batches
Other columns are:
sample_id_1 & sample_id_2, both generated from the sample_id_col variable
correlation - correlation of two corresponding samples
batch_1 & batch_2 or analogous, created the same way as sample_id_1
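The shape of the returned data frame can be illustrated with a minimal base R sketch. This is not the proBatch implementation. The toy data, the annotation column names (FullRunName, biospecimen, batch), the helper lookup(), and the exact meaning given to replicate and batch_the_same are assumptions inferred from the column names listed above. The batches and batch_replicate columns are left out because this page does not describe how they are built.

```r
# Toy inputs (hypothetical values, for illustration only)
set.seed(1)
data_matrix <- matrix(rnorm(40), nrow = 10,
                      dimnames = list(paste0("feature_", 1:10),
                                      c("s1", "s2", "s3", "s4")))
sample_annotation <- data.frame(
  FullRunName = c("s1", "s2", "s3", "s4"),  # assumed sample_id_col
  biospecimen = c("A", "A", "B", "C"),      # assumed biospecimen_id_col
  batch       = c("b1", "b2", "b1", "b2"),  # assumed batch_col
  stringsAsFactors = FALSE
)

# Sample-by-sample correlation matrix (columns of data_matrix are samples)
corr_matrix <- cor(data_matrix, use = "pairwise.complete.obs")

# Keep each unordered pair of samples once (upper triangle, no diagonal)
idx <- which(upper.tri(corr_matrix), arr.ind = TRUE)
corr_df <- data.frame(
  sample_id_1 = rownames(corr_matrix)[idx[, "row"]],
  sample_id_2 = colnames(corr_matrix)[idx[, "col"]],
  correlation = corr_matrix[idx],
  stringsAsFactors = FALSE
)

# Hypothetical helper: look up an annotation column for a vector of sample IDs
lookup <- function(ids, column) {
  sample_annotation[[column]][match(ids, sample_annotation$FullRunName)]
}

corr_df$batch_1 <- lookup(corr_df$sample_id_1, "batch")
corr_df$batch_2 <- lookup(corr_df$sample_id_2, "batch")

# Assumed semantics: TRUE when both samples share a biospecimen / a batch
corr_df$replicate <- lookup(corr_df$sample_id_1, "biospecimen") ==
  lookup(corr_df$sample_id_2, "biospecimen")
corr_df$batch_the_same <- corr_df$batch_1 == corr_df$batch_2

print(corr_df)
```

With four samples the sketch yields one row per unordered pair, so six rows in total, each carrying the pair's correlation and the annotation-derived flags.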