To validate our retention time (RT) prediction, this vignette compares the measured retention times with the hydrophobicity values predicted by the ssrc method (Krokhin et al. 2004), as implemented in the protViz package (Panse and Grossmann 2019).
The following code snippet performs the comparison on the F255744 data. The file contains amino acid sequences representing the designed flycodes.
library(NestLink)
# The data were previously loaded directly from the FGCZ server:
# load(url("http://fgcz-ms.uzh.ch/~cpanse/p1875/F255744.RData"))
# F255744 <- as.data.frame.mascot(F255744)
# The data set is now available through ExperimentHub.
library(ExperimentHub)
eh <- ExperimentHub()
## snapshotDate(): 2019-10-22
load(query(eh, c("NestLink", "F255744.RData"))[[1]])
## see ?NestLink and browseVignettes('NestLink') for documentation
## loading from cache
# Returns a list of linear fits (ssrc ~ RT), one per value in 'scores'.
.ssrc.mascot(F255744, scores = c(10, 20, 40, 50),
             pch = 16,
             col = rgb(0.1, 0.1, 0.1, alpha = 0.1))
## [[1]]
##
## Call:
## lm(formula = xx.ssrc ~ xx$RTINSECONDS)
##
## Residuals:
## Min 1Q Median 3Q Max
## -38.954 -2.248 0.015 2.228 71.167
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -5.580e+00 2.030e-01 -27.48 <2e-16 ***
## xx$RTINSECONDS 8.849e-03 7.434e-05 119.04 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.884 on 12295 degrees of freedom
## Multiple R-squared: 0.5354, Adjusted R-squared: 0.5354
## F-statistic: 1.417e+04 on 1 and 12295 DF, p-value: < 2.2e-16
##
##
## [[2]]
##
## Call:
## lm(formula = xx.ssrc ~ xx$RTINSECONDS)
##
## Residuals:
## Min 1Q Median 3Q Max
## -37.387 -2.040 -0.042 1.930 46.035
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -6.976e+00 1.621e-01 -43.03 <2e-16 ***
## xx$RTINSECONDS 9.447e-03 6.018e-05 156.99 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.12 on 9835 degrees of freedom
## Multiple R-squared: 0.7148, Adjusted R-squared: 0.7147
## F-statistic: 2.464e+04 on 1 and 9835 DF, p-value: < 2.2e-16
##
##
## [[3]]
##
## Call:
## lm(formula = xx.ssrc ~ xx$RTINSECONDS)
##
## Residuals:
## Min 1Q Median 3Q Max
## -15.260 -1.963 -0.114 1.735 45.342
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -7.690e+00 1.784e-01 -43.11 <2e-16 ***
## xx$RTINSECONDS 9.781e-03 6.724e-05 145.46 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.506 on 5574 degrees of freedom
## Multiple R-squared: 0.7915, Adjusted R-squared: 0.7915
## F-statistic: 2.116e+04 on 1 and 5574 DF, p-value: < 2.2e-16
##
##
## [[4]]
##
## Call:
## lm(formula = xx.ssrc ~ xx$RTINSECONDS)
##
## Residuals:
## Min 1Q Median 3Q Max
## -9.570 -2.019 -0.142 1.754 45.200
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -7.827e+00 2.173e-01 -36.02 <2e-16 ***
## xx$RTINSECONDS 9.848e-03 8.271e-05 119.06 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.579 on 3650 degrees of freedom
## Multiple R-squared: 0.7952, Adjusted R-squared: 0.7952
## F-statistic: 1.418e+04 on 1 and 3650 DF, p-value: < 2.2e-16
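In the output above, the multiple R-squared rises from 0.5354 for the first fit to 0.7952 for the fourth, while the number of observations drops. Reading these numbers off four separate summaries is tedious, so the following sketch shows one way to collect such fits into a single table. It is only an illustration under stated assumptions: the data are simulated (not F255744), the names `rt`, `hyd`, `score`, `cutoffs`, and `fit_summary` are hypothetical, and the rule "keep rows with score at or above the cut-off" is an assumption, since the filtering done inside `.ssrc.mascot` is not shown here.

```r
# Simulated stand-in data; the values are illustrative only, not F255744.
set.seed(1)
rt    <- runif(200, min = 1000, max = 5000)              # retention time in seconds
hyd   <- -6 + 0.0095 * rt + rnorm(200, sd = 4)           # mock hydrophobicity values
score <- sample(c(10, 20, 40, 50), 200, replace = TRUE)  # mock identification scores

cutoffs <- c(10, 20, 40, 50)

# One linear model per cut-off.
# Assumption: a row is kept when its score is at or above the cut-off.
fits <- lapply(cutoffs, function(s) {
  keep <- score >= s
  lm(hyd[keep] ~ rt[keep])
})

# Collect sample size, coefficients, and R-squared of every fit in one data frame.
fit_summary <- do.call(rbind, lapply(seq_along(fits), function(i) {
  f <- fits[[i]]
  data.frame(
    cutoff    = cutoffs[i],
    n         = length(residuals(f)),
    intercept = unname(coef(f)[1]),
    slope     = unname(coef(f)[2]),
    r.squared = summary(f)$r.squared
  )
}))

print(fit_summary)
```

If the list returned by `.ssrc.mascot` is stored in a variable, the same `lapply` over `summary(f)$r.squared` should apply to it directly, because each element is an ordinary `lm` object.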
Here is the output of the sessionInfo() command.
## R version 3.6.1 (2019-07-05)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.3 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.10-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.10-bioc/R/lib/libRlapack.so
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats4 parallel stats graphics grDevices utils datasets
## [8] methods base
##
## other attached packages:
## [1] scales_1.0.0 ggplot2_3.2.1
## [3] NestLink_1.2.0 ShortRead_1.44.0
## [5] GenomicAlignments_1.22.0 SummarizedExperiment_1.16.0
## [7] DelayedArray_0.12.0 matrixStats_0.55.0
## [9] Biobase_2.46.0 Rsamtools_2.2.0
## [11] GenomicRanges_1.38.0 GenomeInfoDb_1.22.0
## [13] BiocParallel_1.20.0 protViz_0.4.0
## [15] gplots_3.0.1.1 Biostrings_2.54.0
## [17] XVector_0.26.0 IRanges_2.20.0
## [19] S4Vectors_0.24.0 ExperimentHub_1.12.0
## [21] AnnotationHub_2.18.0 BiocFileCache_1.10.0
## [23] dbplyr_1.4.2 BiocGenerics_0.32.0
## [25] BiocStyle_2.14.0
##
## loaded via a namespace (and not attached):
## [1] httr_1.4.1 bit64_0.9-7
## [3] gtools_3.8.1 shiny_1.4.0
## [5] assertthat_0.2.1 interactiveDisplayBase_1.24.0
## [7] BiocManager_1.30.9 latticeExtra_0.6-28
## [9] blob_1.2.0 GenomeInfoDbData_1.2.2
## [11] yaml_2.2.0 BiocVersion_3.10.1
## [13] pillar_1.4.2 RSQLite_2.1.2
## [15] backports_1.1.5 lattice_0.20-38
## [17] glue_1.3.1 digest_0.6.22
## [19] RColorBrewer_1.1-2 promises_1.1.0
## [21] colorspace_1.4-1 htmltools_0.4.0
## [23] httpuv_1.5.2 Matrix_1.2-17
## [25] pkgconfig_2.0.3 bookdown_0.14
## [27] zlibbioc_1.32.0 purrr_0.3.3
## [29] xtable_1.8-4 gdata_2.18.0
## [31] later_1.0.0 tibble_2.1.3
## [33] withr_2.1.2 lazyeval_0.2.2
## [35] magrittr_1.5 crayon_1.3.4
## [37] mime_0.7 memoise_1.1.0
## [39] evaluate_0.14 hwriter_1.3.2
## [41] tools_3.6.1 stringr_1.4.0
## [43] munsell_0.5.0 AnnotationDbi_1.48.0
## [45] compiler_3.6.1 caTools_1.17.1.2
## [47] rlang_0.4.1 grid_3.6.1
## [49] RCurl_1.95-4.12 rappdirs_0.3.1
## [51] labeling_0.3 bitops_1.0-6
## [53] rmarkdown_1.16 gtable_0.3.0
## [55] codetools_0.2-16 DBI_1.0.0
## [57] curl_4.2 R6_2.4.0
## [59] knitr_1.25 dplyr_0.8.3
## [61] fastmap_1.0.1 bit_1.1-14
## [63] zeallot_0.1.0 KernSmooth_2.23-16
## [65] stringi_1.4.3 Rcpp_1.0.2
## [67] vctrs_0.2.0 tidyselect_0.2.5
## [69] xfun_0.10
Krokhin, O. V., R. Craig, V. Spicer, W. Ens, K. G. Standing, R. C. Beavis, and J. A. Wilkins. 2004. “An improved model for prediction of retention times of tryptic peptides in ion pair reversed-phase HPLC: its application to protein peptide mapping by off-line HPLC-MALDI MS.” Mol. Cell Proteomics 3 (9):908–19.
Panse, Christian, and Jonas Grossmann. 2019. protViz: Visualizing and Analyzing Mass Spectrometry Related Data in Proteomics. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org.