selex.run {SELEX} | R Documentation |
Runs the complete analysis in a single call. The function:
1. Determines kmax on the crossValidationSample, using minCount as the minimum count.
2. Builds a Markov model on the trainingSample, constructed using mmMethod, and tests it on the crossValidationSample, using K-mers of length kmax to determine model fit.
3. Calculates information gain for the K-mer lengths in infoRange on the infoGainSample, using the Markov model order with the highest R^2 to predict previous round values.
selex.run(trainingSample, crossValidationSample, minCount=100, infoGainSample, infoRange=NULL, mmMethod="DIVISION", mmWithLeftFlank=FALSE)
trainingSample | A sample handle to the training dataset.
crossValidationSample | A sample handle to the cross-validation dataset.
minCount | The minimum count to be used.
infoGainSample | A sample handle to the dataset on which to perform the information gain analysis.
infoRange | The range of K-mer lengths for which the information gain should be calculated. If
mmMethod | A character string indicating the algorithm used to evaluate the Markov model conditional probabilities. Can be either
mmWithLeftFlank | Predict expected counts by considering the sequences in the left flank of the variable region.
Please see the individual functions or ‘References’ for more details.
Value: Not applicable.
Slattery, M., Riley, T.R., Liu, P., Abe, N., Gomez-Alcala, P., Dror, I., Zhou, T., Rohs, R., Honig, B., Bussemaker, H.J., and Mann, R.S. (2011) Cofactor binding evokes latent differences in DNA binding specificity between Hox proteins. Cell 147:1270–1282.
Riley, T.R., Slattery, M., Abe, N., Rastogi, C., Liu, D., Mann, R.S., and Bussemaker, H.J. (2014) SELEX-seq: a method for characterizing the complete repertoire of binding site preferences for transcription factor complexes. Methods Mol. Biol. 1196:255–278.
See also: selex.counts, selex.countSummary, selex.infogain, selex.infogainSummary, selex.mm, selex.mmSummary
# Initialize the SELEX package
#options(java.parameters="-Xmx1500M")
#library(SELEX)

# Configure the current session
workDir = file.path(".", "SELEX_workspace")
selex.config(workingDir=workDir, verbose=FALSE, maxThreadNumber=4)

# Extract sample data from package, including XML database
sampleFiles = selex.exampledata(workDir)

# Load all sample files using XML database
selex.loadAnnotation(sampleFiles[3])

# Create sample handles
r0 = selex.sample(seqName="R0.libraries", sampleName="R0.barcodeGC", round=0)
r2 = selex.sample(seqName='R2.libraries', sampleName='ExdHox.R2', round=2)

# Split the r0 sample into testing and training datasets
r0.split = selex.split(sample=r0)

# Run entire analysis
selex.run(trainingSample=r0.split$train, crossValidationSample=r0.split$test, infoGainSample=r2)

# Display results
selex.mmSummary()[,c(1,2,3,4,5,6)]
selex.infogainSummary()[,c(1,2,3,4,5)]
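The three steps that selex.run is described as performing can also be written out with the individual functions. The sketch below is only an approximation of that workflow, not the package's actual implementation. The wrapper name selex_run_stepwise is hypothetical, and the argument names passed to selex.kmax, selex.mm, and selex.infogain (threshold, order, Kmax, markovModel) are assumed from those functions' help pages and should be checked there. The handling of infoRange is left out because its default behaviour is not stated here.

```r
# Hypothetical helper (not part of SELEX): spells out the three documented
# steps of selex.run using the individual SELEX functions.
# Argument names of the SELEX calls are assumptions; check each help page.
selex_run_stepwise <- function(trainingSample, crossValidationSample,
                               infoGainSample, minCount = 100,
                               mmMethod = "DIVISION",
                               mmWithLeftFlank = FALSE) {
  # Step 1: determine kmax on the cross-validation sample,
  # with minCount as the count threshold
  kmax <- SELEX::selex.kmax(sample = crossValidationSample,
                            threshold = minCount)

  # Step 2: build the Markov model on the training sample and
  # test it on the cross-validation sample with kmax-length K-mers
  mm <- SELEX::selex.mm(sample = trainingSample, order = NA,
                        crossValidationSample = crossValidationSample,
                        Kmax = kmax, mmMethod = mmMethod,
                        mmWithLeftFlank = mmWithLeftFlank)

  # Step 3: calculate information gain on the later-round sample,
  # using the fitted Markov model to predict expected counts
  SELEX::selex.infogain(sample = infoGainSample, markovModel = mm)

  # Return the intermediate results for inspection
  invisible(list(kmax = kmax, markovModel = mm))
}
```

With the sample handles from the example, this would be called as selex_run_stepwise(r0.split$train, r0.split$test, r2), after which selex.mmSummary() and selex.infogainSummary() display the results as before.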