systemPipeR 2.8.0
This section gives a general description of this cheminformatics workflow and explains how to use it. It is usually removed from the actual analysis report.
This cheminformatics workflow template is based on the ChemmineR package, which should be installed from Bioconductor before running the workflow. This template is a workflow that does:
No command-line software is required for this workflow. All steps are written in R (LineWise steps).
Users should provide background information about the design of their cheminformatics project here.
This report describes the analysis of a cheminformatics project studying drug …
Typically, users specify here all information relevant to the analysis of their cheminformatics study. This includes detailed descriptions of files, experimental design, reference genome, gene annotations, etc.
systemPipeR workflows can be designed and built from start to finish with a single command, importing from an R Markdown file, or stepwise in interactive mode from the R console.
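The single-command route mentioned above can be sketched roughly as follows. This is a minimal sketch, not the template's own code: the file name SPcheminfo.Rmd is a hypothetical placeholder for your workflow file, and the import and run are skipped when that file is not present.

```r
library(systemPipeR)

# Assumption: the workflow steps are defined in an R Markdown file in the
# project directory. The file name below is hypothetical.
rmd_file <- "SPcheminfo.Rmd"

# Start from an empty project container. overwrite = TRUE replaces an
# existing project log directory from a previous session.
sal <- SPRproject(overwrite = TRUE)

if (file.exists(rmd_file)) {
    # Import every step defined in the R Markdown file ...
    sal <- importWF(sal, file_path = rmd_file)
    # ... and run them in dependency order.
    sal <- runWF(sal)
}
```

Importing is convenient for a finished template, while the interactive mode gives finer control when a workflow is still being developed.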
This tutorial demonstrates how to build the workflow in interactive mode, appending one step at a time. The workflow is constructed by connecting the steps via the appendStep method. Each SYSargsList instance contains the instructions needed for processing a set of input files with a specific command-line or R software, along with the paths to the corresponding outfiles generated by a particular tool/step.
To create a workflow within systemPipeR, we can start by defining an empty container and checking the directory structure:
library(systemPipeR)
sal <- SPRproject()
sal
This is an empty template that contains only one demo step. Refer to our website for how to add more steps. If you prefer a more enriched template, read this page for other pre-configured templates.
cat(crayon::blue$bold("To use this workflow, the following R packages are expected:\n"))
cat(c("'ChemmineR", "ggplot2", "tibble", "readr", "ggpubr", "gplots'\n"),
sep = "', '")
### pre-end
appendStep(sal) <- LineWise(code = {
library(systemPipeR)
library(ChemmineR)
}, step_name = "load_packages")
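To see how appended steps are connected, the container can be inspected after each call. The sketch below uses a throwaway project in a temporary directory and two toy steps; the step names first_step and second_step are made up for illustration and are not part of the template.

```r
library(systemPipeR)

# Throwaway project in a temporary directory, so that the current working
# directory is left untouched.
sal_demo <- SPRproject(projPath = tempdir(), overwrite = TRUE)

# Two toy R steps; the names are hypothetical.
appendStep(sal_demo) <- LineWise(code = {
    x <- 1 + 1
}, step_name = "first_step")

appendStep(sal_demo) <- LineWise(code = {
    y <- x * 2
}, step_name = "second_step", dependency = "first_step")

stepName(sal_demo)     # step names, in the order they were appended
dependency(sal_demo)   # the step(s) each step depends on
```

Nothing is executed at this point. The code inside LineWise is only stored and is evaluated later, when the workflow is run.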
Molecules can be loaded or downloaded. This example dataset has 100 molecules.
# Here, the dataset is downloaded. If you already have the
# data locally, change URL to local path.
appendStep(sal) <- LineWise(code = {
sdfset <- read.SDFset("http://faculty.ucr.edu/~tgirke/Documents/R_BioCond/Samples/sdfsample.sdf")
# Rename the molecules using the IDs in the header. If your
# molecules' headers have no IDs, or the IDs are not unique,
# remove the following line and use the default IDs.
cid(sdfset) <- makeUnique(sdfid(sdfset))
}, step_name = "load_data", dependency = "load_packages")
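When working offline, the step can be tried outside the workflow with the sdfsample data object that ships with ChemmineR. This is a sketch under the assumption that the bundled object is equivalent to the downloaded sample file; the calls are the same ones used in the step above.

```r
library(ChemmineR)

# Sample SDFset bundled with ChemmineR; no download is needed.
data(sdfsample)
sdfset <- sdfsample

length(sdfset)        # number of molecules in the set
head(sdfid(sdfset))   # IDs stored in the header block of each molecule

# makeUnique() modifies duplicated IDs so that they can be used as
# component IDs (cid) of the set.
cid(sdfset) <- makeUnique(sdfid(sdfset))
stopifnot(!any(duplicated(cid(sdfset))))
```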
appendStep(sal) <- LineWise(code = {
png("results/mols_plot.png", 700, 600)
# Only the first 4 molecules are plotted here. Choose the ones
# you want to plot.
ChemmineR::plot(sdfset[1:4])
dev.off()
}, step_name = "vis_mol", dependency = "load_data", run_step = "optional")
Compute some basic molecule information, such as the atom frequency matrix, molecular weight, and formula, and store it in a file.
appendStep(sal) <- LineWise(code = {
propma <- data.frame(MF = MF(sdfset), MW = MW(sdfset), atomcountMA(sdfset))
readr::write_csv(propma, "results/basic_mol_info.csv")
}, step_name = "basic_mol_info", dependency = "load_data", run_step = "optional")
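The layout of the resulting table can be checked on a small subset before writing it to disk. The sketch below assumes the sdfsample object bundled with ChemmineR as a stand-in for the downloaded data.

```r
library(ChemmineR)

data(sdfsample)
sdfset <- sdfsample[1:4]   # a small subset keeps the example fast

propma <- data.frame(
    MF = MF(sdfset),        # molecular formula
    MW = MW(sdfset),        # molecular weight
    atomcountMA(sdfset)     # one column per element, one row per molecule
)

str(propma)
# Columns 1 and 2 hold MF and MW; the element counts start at column 3.
colnames(propma)[-(1:2)]
```

This layout is why later steps can select the atom counts with propma[, 3:ncol(propma)].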
The information can be visualized, for example as a boxplot of atom frequencies.
appendStep(sal) <- LineWise(code = {
png("results/atom_req.png", 700, 700)
boxplot(propma[, 3:ncol(propma)], col = "#6cabfa", main = "Atom Frequency")
dev.off()
}, step_name = "mol_info_plot", dependency = "basic_mol_info",
run_step = "optional")
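If the numbers behind the boxplot are needed, for example for a table in the report, boxplot() can return them without drawing anything. This is a small sketch on the sdfsample object bundled with ChemmineR, which is assumed here as a stand-in for the downloaded data.

```r
library(ChemmineR)

data(sdfsample)
counts <- atomcountMA(sdfsample)   # matrix: molecules x elements

# plot = FALSE returns the summary statistics instead of drawing.
bp <- boxplot(counts, plot = FALSE)

# One column per element; the rows are lower whisker, lower hinge,
# median, upper hinge, and upper whisker.
stats <- bp$stats
colnames(stats) <- bp$names
stats
```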