dotproduct {MsCoreUtils} | R Documentation
Calculate the normalized dot product (NDP)

dotproduct returns a numeric value ranging between 0 and 1, where 0 indicates no similarity between the two MS/MS features and 1 indicates that the MS/MS features are identical.
dotproduct(x, y, m = 0.5, n = 0)
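x | matrix of peaks with the columns "mz" and "intensity"
y | matrix of peaks with the columns "mz" and "intensity", with rows matching those of x
m | weight (exponent) applied to the peak intensities, default 0.5
n | weight (exponent) applied to the m/z values, default 0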
Each row in x corresponds to the respective row in y, i.e. the peaks (entries "mz") per spectrum have to match.
m and n are weights given on the peak intensity and the m/z values, respectively. By default (m = 0.5), the square root of the intensity values is taken to calculate the weights. With increasing values of m, high intensity values become more important for the similarity calculation, i.e. the differences between intensities are amplified.

With increasing values of n, high m/z values are taken more into account in the similarity calculation. Especially when working with small molecules, a value n > 0 can be set to give weight to the m/z values, reflecting that shared fragments with higher m/z are less likely and may therefore indicate that the molecules are more similar. If n != 0, a warning is raised if the corresponding m/z values are not identical, since small differences in m/z values distort the similarity values more strongly as n increases. If m = 0 or n = 0, intensity values or m/z values, respectively, are not taken into account.
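As a rough illustration of how m and n shape the weights, the sketch below evaluates W = [peak intensity]^m * [m/z]^n for two made-up peaks. The helper peak_weight is hypothetical and not part of MsCoreUtils, the peak values are invented, and the helper skips any rescaling of the intensities.

```r
# Hypothetical helper (not part of MsCoreUtils): weight of a peak given the
# exponents m (intensity) and n (m/z). No intensity rescaling is done here.
peak_weight <- function(intensity, mz, m = 0.5, n = 0) {
    intensity^m * mz^n
}

# Two invented peaks: a weak one at m/z 100 and a strong one at m/z 400.
intensity <- c(1, 100)
mz <- c(100, 400)

# Default m = 0.5, n = 0: square root of the intensities, m/z ignored.
w_default <- peak_weight(intensity, mz)                 # 1, 10

# Larger m: the gap between weak and strong peaks widens.
w_m2 <- peak_weight(intensity, mz, m = 2)               # 1, 10000

# m = 0, n = 0.5: intensities ignored, only m/z contributes.
w_mz_only <- peak_weight(intensity, mz, m = 0, n = 0.5) # 10, 20
```

With m = 0.5 the strong peak weighs 10 times as much as the weak one; with m = 2 it weighs 10000 times as much, so the similarity is dominated by the most intense peaks.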
The normalized dot product is calculated according to:
NDP = (∑_i W_{S1,i} * W_{S2,i})^2 / (∑_i W_{S1,i}^2 * ∑_i W_{S2,i}^2)
with W = [peak intensity]^m * [m/z]^n, where the square in the numerator applies to the whole sum. For further information on the normalized dot product see, for example, Li et al. (2015). Prior to calculating W_{S1} or W_{S2}, all intensity values are divided by the maximum intensity value and multiplied by 100.
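The formula above can be sketched in a few lines of base R. This is a minimal illustration under stated assumptions, not the MsCoreUtils implementation: the function name ndp_sketch is hypothetical, the input matrices are invented, rows are assumed to be aligned already, and no missing values or m/z mismatch warnings are handled.

```r
# Hypothetical re-implementation of the NDP formula for illustration only.
# Assumes x and y are matrices with columns "mz" and "intensity", aligned
# row by row, without NA values.
ndp_sketch <- function(x, y, m = 0.5, n = 0) {
    # Scale intensities so that the maximum becomes 100.
    ix <- x[, "intensity"] / max(x[, "intensity"]) * 100
    iy <- y[, "intensity"] / max(y[, "intensity"]) * 100
    # W = [peak intensity]^m * [m/z]^n
    wx <- ix^m * x[, "mz"]^n
    wy <- iy^m * y[, "mz"]^n
    # Squared sum of products, divided by the product of the sums of squares.
    sum(wx * wy)^2 / (sum(wx^2) * sum(wy^2))
}

# Invented spectra: a has one peak, b shares that peak and has a second one.
a <- cbind(mz = c(100, 200), intensity = c(100, 0))
b <- cbind(mz = c(100, 200), intensity = c(100, 100))

ndp_identical <- ndp_sketch(a, a)  # 1
ndp_partial <- ndp_sketch(a, b)    # 0.5
```

For a against b the weights are (10, 0) and (10, 10), so the numerator is 100^2 = 10000 and the denominator is 100 * 200 = 20000, which gives 0.5.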
numeric(1): dotproduct returns a numeric similarity coefficient between 0 and 1.
Thomas Naake, thomasnaake@googlemail.com
Li et al. (2015): Navigating natural variation in herbivory-induced secondary metabolism in coyote tobacco populations using MS/MS structural analysis. PNAS, E4147–E4155, doi: 10.1073/pnas.1503106112.
x <- matrix(c(c(100.001, 100.002, NA, 300.01, 300.02, NA),
    c(2, 1.5, 0, 1.2, 0.9, 0)), ncol = 2)
y <- matrix(c(c(100.0, NA, 200.0, 300.002, 300.025, 300.0255),
    c(2, 0, 3, 1, 4, 0.4)), ncol = 2)
colnames(x) <- colnames(y) <- c("mz", "intensity")
dotproduct(x, y, m = 0.5, n = 0)