ChemmineOB 1.42.0
Note: the most recent version of this tutorial can be found here and a short overview slide show here.
ChemmineOB provides an R interface to a subset of the cheminformatics functionalities implemented by the OpenBabel C++ project (O’Boyle, Morley, and Hutchison 2008; O’Boyle et al. 2011). OpenBabel is an open source cheminformatics toolbox that includes utilities for structure format interconversions, descriptor calculations, compound similarity searching and more. ChemmineOB aims to make a subset of these utilities available from within R. For non-developers, ChemmineOB is primarily intended to be used from ChemmineR (Cao et al. 2008; Backman, Cao, and Girke 2011; Wang et al. 2013) as an add-on package rather than used directly.
To use the ChemmineOB package on Linux or Mac, OpenBabel 2.3.0 or greater needs to be installed on the system. On Linux systems, the OpenBabel header files are also required in order to compile ChemmineOB. The Windows distribution includes its own version of OpenBabel. The OpenBabel site (http://openbabel.org/wiki/Get_Open_Babel) provides excellent instructions for installing the OpenBabel software on Mac or Linux systems. The ChemmineR and ChemmineOB packages can be installed from within R with:
if (!requireNamespace("BiocManager", quietly=TRUE))
    install.packages("BiocManager")
BiocManager::install(c("ChemmineR", "ChemmineOB"))
library("ChemmineR")
library("ChemmineOB")
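After installation, a quick check that both packages can be found may be useful. The following is a minimal sketch using only base R functions (requireNamespace and packageVersion); it loads the package namespaces without attaching them and reports the version of each package that is present.

```r
# Check which of the two packages are installed, without attaching them.
pkgs <- c("ChemmineR", "ChemmineOB")
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
print(installed)

# Report the version of each package that was found.
for (p in pkgs[installed]) {
  message(p, " ", as.character(packageVersion(p)))
}
```

A FALSE entry in the printed vector suggests that the corresponding package did not install correctly.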
If the installation fails on Linux, you may need to manually set the locations of the OpenBabel libraries and header files. This is best done through configure flags. For example, at the command prompt run:
$ R CMD INSTALL --configure-args='--with-openbabel-include=... --with-openbabel-lib=...' <ChemmineOB package file>
where the ‘…’ are replaced by the relevant paths. See the README file for more details.
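The same configure flags can, in principle, be passed from within R through the configure.args argument of install.packages. The sketch below only builds the flag string; the two paths are hypothetical placeholders chosen for illustration, not values taken from the package documentation, and the install call itself is left commented out with the package file name still elided.

```r
# Hypothetical locations (assumptions); replace with the paths on your system.
ob_include <- "/usr/local/include/openbabel3"
ob_lib     <- "/usr/local/lib"

# Assemble the flags in the form expected by the configure script.
configure_args <- paste0(
  "--with-openbabel-include=", ob_include,
  " --with-openbabel-lib=", ob_lib
)
print(configure_args)

# Not run: install from a local source package file.
# install.packages("<ChemmineOB package file>", repos = NULL, type = "source",
#                  configure.args = configure_args)
```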
Some OpenBabel modules are not available through ChemmineOB on Windows. These currently include “MACCS” and “InChI”.
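Code that has to run on several platforms may want to guard against these missing modules. The following sketch shows one possible way to do so; ob_module_expected is a hypothetical helper written for this illustration and is not part of ChemmineOB.

```r
# Modules the text above lists as unavailable on Windows.
windows_missing <- c("MACCS", "InChI")

# Hypothetical helper (not part of ChemmineOB): returns TRUE when a module
# is not known to be missing on the given OS type.
ob_module_expected <- function(module, os_type = .Platform$OS.type) {
  !(os_type == "windows" & toupper(module) %in% toupper(windows_missing))
}

# On the current platform:
ob_module_expected("MACCS")
```

The os_type argument defaults to the value R reports for the running system, so it normally does not need to be set.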
Detailed instructions for using ChemmineOB are provided in the vignette of the ChemmineR package instead of this document. The main reason for consolidating the documentation in one central document rather than distributing it across several vignettes is that it helps minimize duplications and inconsistencies. It is also the more suitable format for providing a task-oriented description of functionalities for users. To obtain an overview of the OpenBabel utilities supported by ChemmineOB, we recommend consulting the OpenBabel Functions section of the ChemmineR vignette. To open the ChemmineR vignette from R, one can use the following command.
vignette("ChemmineR")
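To give a flavour of the kind of utility described there, the sketch below calls convertFormat from ChemmineOB directly to convert a SMILES string to SDF text. It assumes that convertFormat accepts the input format, the output format and the source string in that order; the compound and its name are chosen only for illustration. The call is skipped if ChemmineOB is not installed.

```r
if (requireNamespace("ChemmineOB", quietly = TRUE)) {
  # Aspirin as SMILES, followed by a tab-separated name and a newline.
  smiles <- "CC(=O)OC1=CC=CC=C1C(=O)O\taspirin\n"
  # Assumed argument order: from, to, source.
  sdf_text <- ChemmineOB::convertFormat("SMI", "SDF", smiles)
  cat(sdf_text)
} else {
  message("ChemmineOB is not installed; skipping the conversion example.")
}
```

In normal use the corresponding ChemmineR functions are the recommended entry point, as noted above.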
ChemmineOB now includes wrapper functions for all of OpenBabel, as generated by SWIG. We still maintain our own set of functions to provide better integration with R in general and ChemmineR specifically.
If you are familiar with the Open Babel API, using the SWIG wrapper should be similar, once you know a few conventions used. You can look at the R code in this package to see examples of these.
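To create an object, call the class name as a function. Where in C++ you would write:
OBConversion *x = new OBConversion(...)
in R you would have:
x = OBConversion(...)
To call a method, use a function named after the class and the method, and pass the object as the first argument. Where in C++ you would write:
x->AddOption(...)
we have:
OBConversion_AddOption(x,...)
A string value that is returned through a char* pointer argument is handled with a string pointer object created by the stringp function. The char* pointer can be accessed with the cast slot. The value can be retrieved from the value slot. For example:
result = stringp()
OBDescriptor_GetStringValue(... , result$cast())
stringValue = result$value()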
There are still many special cases, however. The SWIG documentation can help, as can browsing the generated R code in R/ChemmineOB.R.
sessionInfo()
R version 4.4.0 beta (2024-04-15 r86425)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 22.04.4 LTS

Matrix products: default
BLAS: /home/biocbuild/bbs-3.19-bioc/R/lib/libRblas.so
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

time zone: America/New_York
tzcode source: system (glibc)

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] ChemmineOB_1.42.0 BiocStyle_2.32.0

loaded via a namespace (and not attached):
[1] digest_0.6.35 R6_2.5.1 codetools_0.2-20 bookdown_0.39
[5] zlibbioc_1.50.0 fastmap_1.1.1 xfun_0.43 cachem_1.0.8
[9] knitr_1.46 htmltools_0.5.8.1 rmarkdown_2.26 lifecycle_1.0.4
[13] cli_3.6.2 sass_0.4.9 jquerylib_0.1.4 compiler_4.4.0
[17] tools_4.4.0 evaluate_0.23 bslib_0.7.0 yaml_2.3.8
[21] BiocManager_1.30.22 jsonlite_1.8.8 rlang_1.1.3
This software was developed with funding from the National Science Foundation: ABI-0957099, 2010-0520325 and IGERT-0504249.
Backman, T W, Y Cao, and T Girke. 2011. “ChemMine tools: an online service for analyzing and clustering small molecules.” Nucleic Acids Res 39 (Web Server issue): 486–91. https://doi.org/10.1093/nar/gkr320.
Cao, Y, A Charisi, L C Cheng, T Jiang, and T Girke. 2008. “ChemmineR: a compound mining framework for R.” Bioinformatics 24 (15): 1733–4. https://doi.org/10.1093/bioinformatics/btn307.
O’Boyle, Noel, Michael Banck, Craig James, Chris Morley, Tim Vandermeersch, and Geoffrey Hutchison. 2011. “Open Babel: An Open Chemical Toolbox.” Journal of Cheminformatics 3 (1): 33. https://doi.org/10.1186/1758-2946-3-33.
O’Boyle, Noel, Chris Morley, and Geoffrey Hutchison. 2008. “Pybel: A Python Wrapper for the Openbabel Cheminformatics Toolkit.” Chemistry Central Journal 2 (1): 5. https://doi.org/10.1186/1752-153X-2-5.
Wang, Y, T W Backman, K Horan, and T Girke. 2013. “fmcsR: Mismatch Tolerant Maximum Common Substructure Searching in R.” Bioinformatics, August. https://doi.org/10.1093/bioinformatics/btt475.