calculate_features {PrInCE} | R Documentation |
Calculate the six features that PrInCE uses to discriminate interacting from non-interacting protein pairs based on their co-elution profiles: raw Pearson R value, cleaned Pearson R value, raw Pearson P-value, Euclidean distance, co-peak, and co-apex. Any of these features can optionally be disabled.
calculate_features(
  profile_matrix,
  gaussians,
  min_pairs = 0,
  pearson_R_raw = TRUE,
  pearson_R_cleaned = TRUE,
  pearson_P = TRUE,
  euclidean_distance = TRUE,
  co_peak = TRUE,
  co_apex = TRUE,
  n_pairs = FALSE,
  max_euclidean_quantile = 0.9
)
profile_matrix |
a numeric matrix of co-elution profiles, with proteins
in rows, or a |
gaussians |
a list of Gaussian mixture models fit to the profile matrix
by |
min_pairs |
minimum number of overlapping fractions between any given protein pair to consider a potential interaction |
pearson_R_raw |
if true, include the Pearson correlation (R) between raw profiles as a feature |
pearson_R_cleaned |
if true, include the Pearson correlation (R) between cleaned profiles as a feature |
pearson_P |
if true, include the P-value of the Pearson correlation between raw profiles as a feature |
euclidean_distance |
if true, include the Euclidean distance between cleaned profiles as a feature |
co_peak |
if true, include the 'co-peak score' (that is, the distance, in fractions, between the single highest value of each profile) as a feature |
co_apex |
if true, include the 'co-apex score' (that is, the minimum Euclidean distance between any pair of fitted Gaussians) as a feature |
max_euclidean_quantile |
very high Euclidean distance values are trimmed
to avoid numerical precision issues; values above this quantile will be
replaced with the value at this quantile (default: |
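To make the argument descriptions above concrete, here is a minimal base R sketch of what each feature measures for a single protein pair. It is an illustration under stated assumptions, not the PrInCE implementation: the toy profiles `x` and `y`, the Gaussian tables `g1` and `g2`, and all variable names are hypothetical. Replacing missing values with zero is only a stand-in for the package's profile cleaning, and summarising each Gaussian by its centre and width for the co-apex distance is an assumption, since the text above does not say which parameters enter that distance.

```r
# Toy co-elution profiles for two hypothetical proteins across 8 fractions.
# NA marks a fraction in which the protein was not quantified.
x <- c(NA, 0.1, 0.8, 2.5, 1.2, 0.3, NA, 0.0)
y <- c(0.0, 0.2, 1.0, 2.0, 2.4, 0.4, 0.1, NA)

# Number of fractions quantified in both profiles; a pair would be
# skipped if this count fell below min_pairs.
overlap <- sum(!is.na(x) & !is.na(y))

# Pearson correlation (R) and its P-value on the shared fractions.
# cor.test() drops incomplete pairs before testing.
test <- cor.test(x, y, method = "pearson")
pearson_R <- unname(test$estimate)
pearson_P <- test$p.value

# Euclidean distance between "cleaned" profiles.
# ASSUMPTION: zero-filling stands in for the package's cleaning step.
x0 <- ifelse(is.na(x), 0, x)
y0 <- ifelse(is.na(y), 0, y)
euclidean <- sqrt(sum((x0 - y0)^2))

# Co-peak: distance, in fractions, between the highest value of each profile.
co_peak <- abs(which.max(x) - which.max(y))

# Co-apex: minimum Euclidean distance between any pair of fitted Gaussians.
# ASSUMPTION: each row is one Gaussian described by (centre, width).
g1 <- rbind(c(4.0, 1.0))
g2 <- rbind(c(4.5, 1.2), c(7.0, 0.5))
pair_dist <- outer(
  seq_len(nrow(g1)), seq_len(nrow(g2)),
  Vectorize(function(i, j) sqrt(sum((g1[i, ] - g2[j, ])^2)))
)
co_apex <- min(pair_dist)

# Trimming with max_euclidean_quantile = 0.9: values above the 90th
# percentile are replaced by the value at that percentile.
distances <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 100)
cap <- unname(quantile(distances, probs = 0.9))
trimmed <- pmin(distances, cap)
```

In this toy case the single extreme distance (100) is pulled down to the 90th percentile of the vector, while every other value is left unchanged.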
a data frame containing the calculated features for all possible protein pairs
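As a rough picture of what "one row per possible protein pair" looks like, the base R sketch below builds a small pairwise table from a toy matrix. The column names (`protein_A`, `protein_B`, `pearson_R`) and the toy data are hypothetical; the actual column layout returned by `calculate_features` is not stated here.

```r
# Toy profile matrix: 4 hypothetical proteins (rows) by 5 fractions (columns).
profile_matrix <- rbind(
  P1 = c(0.1, 0.9, 2.0, 0.8, 0.1),
  P2 = c(0.2, 1.1, 1.8, 0.9, 0.2),
  P3 = c(2.0, 1.0, 0.3, 0.1, 0.0),
  P4 = c(0.0, 0.1, 0.4, 1.5, 2.2)
)

# cor() correlates columns, so transpose to correlate proteins.
R <- cor(t(profile_matrix))

# Keep each unordered pair once by taking the upper triangle.
idx <- which(upper.tri(R), arr.ind = TRUE)

features <- data.frame(
  protein_A = rownames(R)[idx[, "row"]],
  protein_B = colnames(R)[idx[, "col"]],
  pearson_R = R[idx],
  stringsAsFactors = FALSE
)
```

With n proteins the table has `choose(n, 2)` rows, so it grows quadratically with the size of the profile matrix.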