
Prerequisite

This tutorial assumes you have basic knowledge of Docker concepts.

Note: Right now we support CWL draft 2 with the SBG extension, but we will support CWL v1.0 soon.

App, Workflow and Tool

In our terminology, a workflow is composed of one or more tools, and both are simply apps to users. You can imagine raw input data going through a pipeline with many nodes, where each step performs a function on the data in the flow, and in the end you get what you want: fully processed data or a result (a plot, a report, an action).

Here are some key ideas

This may sound full of jargon and hard to follow, so here is an example. You have a CSV table full of missing values, and you want to process it in three steps:

  1. replace missing values
  2. filter out rows where the column “age” is smaller than 10
  3. output three items: a processed table as a CSV file, a plot, and a summary report in PDF.

You can describe each step as a single module or tool, then connect them one by one to form a flow. You can also put everything into one single “tool”, but the downside is that other users cannot reuse your step 1 for their own missing value problem. So it is both art and science to balance flexibility and efficiency.
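To make the three steps concrete, here is a minimal base R sketch in which each step is its own function, so each could later become its own tool. The data, the function names (replace_missing, filter_age, write_outputs) and the choice of mean imputation are all assumptions made for illustration, and the report is written as plain text instead of PDF to avoid extra packages.

```r
## Hypothetical data: a small table with missing values
df <- data.frame(age = c(5, NA, 30, 42), score = c(1.5, 2.0, NA, 4.0))

## Step 1: replace missing values (here, with the column mean)
replace_missing <- function(x) {
  for (nm in names(x)) {
    miss <- is.na(x[[nm]])
    x[[nm]][miss] <- mean(x[[nm]], na.rm = TRUE)
  }
  x
}

## Step 2: drop rows where "age" is smaller than min_age
filter_age <- function(x, min_age = 10) {
  x[x$age >= min_age, , drop = FALSE]
}

## Step 3: write the three outputs and return their paths
write_outputs <- function(x, dir = tempdir()) {
  csv <- file.path(dir, "processed.csv")
  plt <- file.path(dir, "plot.pdf")
  rpt <- file.path(dir, "summary.txt")
  write.csv(x, csv, row.names = FALSE)
  pdf(plt)
  hist(x$age, main = "age")
  dev.off()
  writeLines(capture.output(summary(x)), rpt)
  c(table = csv, plot = plt, report = rpt)
}

## Chain the steps into a flow
res <- write_outputs(filter_age(replace_missing(df)))
```

Because step 1 is a separate function, another user could reuse replace_missing() on its own, which is the flexibility argument made above.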

Why are we using CWL? Imagine a single file that represents a tool or workflow and can be executed anywhere in a reproducible manner, without installing anything, because everything it needs is packaged in a Docker image. That is going to change computational scientific research: how we do research and how we publish results. In this package we try to hide the CWL details as much as possible, so users can work with it like a typical R function.

Describe Tools in R

A Tool is the basic unit, the “lego brick” you usually start with. As a developer you also want to provide those “lego” pieces to users, so they can run them directly or build their own flows with them.

The main interface provided by the sevenbridges package is the Tool function. It is much more straightforward to describe a tool this way than to compose a raw CWL JSON file from scratch. A “Tool” object in R can be exported to JSON or imported from a CWL JSON file.

I highly recommend that users read the Tool Editor chapter of the Cancer Genomics Cloud documentation to understand how it works, and even try it on the platform with the GUI. This will help you understand our R interface better.

Import from JSON file

Sometimes people share a Tool in plain JSON text format. You can load it into R with the convert_app function, which recognizes the class of your JSON file (Tool or Workflow) automatically.

library(sevenbridges)
t1 = system.file("extdata/app", "tool_star.json", package = "sevenbridges")
## convert json file into a Tool object
t1 = convert_app(t1)
## try print it yourself
## t1

In this way, you can load it, revise it, use it with the API, or edit it and export it back to a JSON file. However, the most important goal of this tutorial is that you learn how to describe a tool directly in R.

Utilities for Tool object

We provide a couple of utilities to help you construct your own CWL tool quickly in R. For all available utilities, please check out help("Tool").

Some utilities are especially useful when you execute a task: you need to know each input's type and id and whether it is required, so that you can execute the task with the parameters it needs. Try playing with input_matrix or input_type as shown below.

## get input type information
head(t1$input_type())
              reads  readMatesLengthsIn       readMapNumber 
          "File..."              "enum"               "int" 
  limitOutSJoneRead limitOutSJcollapsed    outReadsUnmapped 
              "int"               "int"              "enum" 
## get output type information
head(t1$output_type())
              aligned_reads transcriptome_aligned_reads 
                     "File"                      "File" 
             reads_per_gene                   log_files 
                     "File"                   "File..." 
           splice_junctions          chimeric_junctions 
                     "File"                      "File" 
## return an input matrix with more information
head(t1$input_matrix())
                     id                label    type required
1                #reads        Read sequence File...     TRUE
95         #sjdbGTFfile Splice junction file File...    FALSE
102             #genome         Genome files    File     TRUE
2   #readMatesLengthsIn        Reads lengths    enum    FALSE
3        #readMapNumber         Reads to map     int    FALSE
4    #limitOutSJoneRead Junctions max number     int    FALSE
                  prefix
1                   <NA>
95                  <NA>
102                 <NA>
2   --readMatesLengthsIn
3        --readMapNumber
4    --limitOutSJoneRead
                                                   fileTypes
1   FASTA, FASTQ, FA, FQ, FASTQ.GZ, FQ.GZ, FASTQ.BZ2, FQ.BZ2
95                                             GTF, GFF, TXT
102                                                      TAR
2                                                       null
3                                                       null
4                                                       null
## return only a few fields
head(t1$input_matrix(c("id", "type", "required")))
                     id    type required
1                #reads File...     TRUE
95         #sjdbGTFfile File...    FALSE
102             #genome    File     TRUE
2   #readMatesLengthsIn    enum    FALSE
3        #readMapNumber     int    FALSE
4    #limitOutSJoneRead     int    FALSE
## return only required
t1$input_matrix(required = TRUE)
         id         label    type required prefix
1    #reads Read sequence File...     TRUE   <NA>
102 #genome  Genome files    File     TRUE   <NA>
                                                   fileTypes
1   FASTA, FASTQ, FA, FQ, FASTQ.GZ, FQ.GZ, FASTQ.BZ2, FQ.BZ2
102                                                      TAR
## return an output matrix with more information
t1$output_matrix()
                            id                     label    type fileTypes
1               #aligned_reads           Aligned SAM/BAM    File  SAM, BAM
2 #transcriptome_aligned_reads  Transcriptome alignments    File       BAM
3              #reads_per_gene            Reads per gene    File       TAB
4                   #log_files                 Log files File...       OUT
5            #splice_junctions          Splice junctions    File       TAB
6          #chimeric_junctions        Chimeric junctions    File  JUNCTION
7              #unmapped_reads            Unmapped reads File...     FASTQ
8         #intermediate_genome Intermediate genome files    File       TAR
9         #chimeric_alignments       Chimeric alignments    File       SAM
## return only a few fields
t1$output_matrix(c("id", "type"))
                            id    type
1               #aligned_reads    File
2 #transcriptome_aligned_reads    File
3              #reads_per_gene    File
4                   #log_files File...
5            #splice_junctions    File
6          #chimeric_junctions    File
7              #unmapped_reads File...
8         #intermediate_genome    File
9         #chimeric_alignments    File
## get required input id
t1$get_required()
    reads    genome 
"File..."    "File" 
## set new required inputs by ID, with or without the leading #
t1$set_required(c("#reads", "winFlankNbins"))
[1] TRUE TRUE
t1$get_required()
        reads winFlankNbins        genome 
    "File..."         "int"        "File" 
## turn off requirements for input node #reads
t1$set_required("reads", FALSE)
[1] FALSE
t1$get_required()
winFlankNbins        genome 
        "int"        "File" 
## get input id
head(t1$input_id())
                 #STAR                  #STAR                  #STAR 
              "#reads"  "#readMatesLengthsIn"       "#readMapNumber" 
                 #STAR                  #STAR                  #STAR 
  "#limitOutSJoneRead" "#limitOutSJcollapsed"    "#outReadsUnmapped" 
## get full input id with Tool name
head(t1$input_id(TRUE))
                    File...                        enum 
              "#STAR.reads"  "#STAR.readMatesLengthsIn" 
                        int                         int 
      "#STAR.readMapNumber"   "#STAR.limitOutSJoneRead" 
                        int                        enum 
"#STAR.limitOutSJcollapsed"    "#STAR.outReadsUnmapped" 
## get output id
head(t1$output_id())
                         #STAR                          #STAR 
              "#aligned_reads" "#transcriptome_aligned_reads" 
                         #STAR                          #STAR 
             "#reads_per_gene"                   "#log_files" 
                         #STAR                          #STAR 
           "#splice_junctions"          "#chimeric_junctions" 
## get full output id
head(t1$output_id(TRUE))
                               File                                File 
              "#STAR.aligned_reads" "#STAR.transcriptome_aligned_reads" 
                               File                             File... 
             "#STAR.reads_per_gene"                   "#STAR.log_files" 
                               File                                File 
           "#STAR.splice_junctions"          "#STAR.chimeric_junctions" 
## get input and output object
t1$get_input(id = "#winFlankNbins")
type:
- 'null'
- int
label: Flanking regions size
description: =log2(winFlank), where win Flank is the size of the left and right flanking
  regions for each window (int>0).
streamable: no
id: '#winFlankNbins'
inputBinding:
  position: 0
  prefix: --winFlankNbins
  separate: yes
  sbg:cmdInclude: yes
sbg:category: Windows, Anchors, Binning
sbg:toolDefaultValue: '4'
required: yes
t1$get_input(name = "ins")
[[1]]
type:
- 'null'
- int
label: Max bins between anchors
description: Max number of bins between two anchors that allows aggregation of anchors
  into one window (int>0).
streamable: no
id: '#winAnchorDistNbins'
inputBinding:
  position: 0
  prefix: --winAnchorDistNbins
  separate: yes
  sbg:cmdInclude: yes
sbg:category: Windows, Anchors, Binning
sbg:toolDefaultValue: '9'
required: no

[[2]]
type:
- 'null'
- int
label: Max insert junctions
description: Maximum number of junction to be inserted to the genome on the fly at
  the mapping stage, including those from annotations and those detected in the 1st
  step of the 2-pass run.
streamable: no
id: '#limitSjdbInsertNsj'
inputBinding:
  position: 0
  prefix: --limitSjdbInsertNsj
  separate: yes
  sbg:cmdInclude: yes
sbg:category: Limits
sbg:toolDefaultValue: '1000000'
required: no
t1$get_output(id = "#aligned_reads")
type:
- 'null'
- File
label: Aligned SAM/BAM
description: Aligned sequence in SAM/BAM format.
streamable: no
id: '#aligned_reads'
outputBinding:
  glob:
    engine: '#cwl-js-engine'
    script: |-
      {
        if ($job.inputs.outSortingType == 'SortedByCoordinate') {
          sort_name = '.sortedByCoord'
        }
        else {
          sort_name = ''
        }
        if ($job.inputs.outSAMtype == 'BAM') {
          sam_name = "*.Aligned".concat( sort_name, '.out.bam')
        }
        else {
          sam_name = "*.Aligned.out.sam"
        }
        return sam_name
      }
    class: Expression
sbg:fileTypes: SAM, BAM
t1$get_output(name = "gene")
type:
- 'null'
- File
label: Reads per gene
description: File with number of reads per gene. A read is counted if it overlaps
  (1nt or more) one and only one gene.
streamable: no
id: '#reads_per_gene'
outputBinding:
  glob: '*ReadsPerGene*'
sbg:fileTypes: TAB

Create your own tool in R

Introduction

Before we continue, this is what a full tool description looks like. You don’t always need to describe all these details; the following sections walk you from simple examples up to full examples like this one.

fl <- system.file("docker/rnaseqGene/rabix", "generator.R", package = "sevenbridges")
cat(readLines(fl), sep = '\n')
library(sevenbridges)

rbx <- Tool(id = "rnaseqGene", 
            label = "rnaseqgene",
            description = "A RNA-seq Differiencial Expression Flow and Report",
            hints = requirements(docker(pull = "tengfei/rnaseqgene"), cpu(1), mem(2000)), 
            baseCommand = "performDE.R", 
            inputs = list(
                input(
                    id = "bamfiles", label = "bam files",
                    description = "a list of bam files",
                    type = "File...",  ## or type = ItemArray("File")
                    prefix = "--bamfiles",
                    required = TRUE,
                    itemSeparator = ","
                ), 
                input(
                    id = "design", label = "design matrix",
                    type = "File",
                    required = TRUE,
                    prefix = "--design"
                ),
                input(
                    id = "gtffile", label =  "gene feature files",
                    type = "File",
                    stageInput = "copy",
                    required = TRUE,
                    prefix = "--gtffile"
                ),
                input(
                    id = "format", label =  "report foramt html or pdf",
                    type = enum("format", c("pdf", "html")),
                    prefix = "--format"
                )
            ),
            outputs = list(
                output(id = "report", label = "report", 
                       description = "A reproducible report created by Rmarkdown",
                       glob = Expression(engine = "#cwl-js-engine",
                                         script = "x = $job[['inputs']][['format']];
                                                  if(x == 'undefined' || x == null){
                                                   x = 'html';
                                                    };
                                                  'rnaseqGene.' +  x")),
                output(id = "heatmap", label = "heatmap", 
                       description = "A heatmap plot to show the Euclidean distance between samples",
                       glob = "heatmap.pdf"),
                output(id = "count", label = "count", 
                       description = "Reads counts matrix",
                       glob = "count.csv"),
                output(id = "de", label = "Differential expression table", 
                       description = "Differential expression table",
                       glob = "de.csv")
                ))

fl <- "inst/docker/rnaseqGene/rabix/rnaseqGene.json"
write(rbx$toJSON(pretty = TRUE), fl)

Now let’s break it down:

Some key arguments used in the Tool function:

  • baseCommand: Specifies the program to execute.
  • stdout: Captures the command’s standard output stream to a file written to the designated output directory. You don’t need this if you specify output files to collect.
  • inputs: inputs for your command line
  • outputs: outputs you want to collect
  • requirements and hints: in short, hints are not required for execution. We currently accept the following requirement items: cpu, mem, docker, fileDef. You can easily construct them via the requirements() constructor. This is how you describe the resources you need to execute the tool, so the system knows what type of instance suits your case best.

To specify inputs and outputs: usually your command line interface accepts extra arguments as input, for example file(s), string, enum, int, float, or boolean. To specify these in your tool, use the input function, then pass the results to the inputs argument as a list or as a single item. You can also construct them as a data.frame, with less flexibility. input() requires the arguments id and type. output() requires only the argument id, because type defaults to file.

There are some special types: ItemArray and enum. An ItemArray is an array of a single type. The most common case is an input that is a list of files, which you can write as type = ItemArray("File") or simply as type = "File..." to differentiate it from a single file input. When you add the “...” suffix, R knows it is an ItemArray.

We also provide an enum type. When you specify an enum, pass the required name and symbols like this: type = enum("format", c("pdf", "html")). In the UI on the platform you will then be presented with a drop-down menu when you execute the task.
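The enum restricts the choices offered to the user, but your script can also check the value it receives. The sketch below shows one way to do that in base R with match.arg; the helper name check_format is hypothetical and is not part of the sevenbridges package.

```r
## Hypothetical helper: validate a "format" value inside your own script.
## match.arg() returns the first choice when the argument is missing and
## raises an error for values outside the allowed set.
check_format <- function(format = c("html", "pdf")) {
  match.arg(format)
}

fmt_default <- check_format()       # falls back to the first choice
fmt_pdf     <- check_format("pdf")  # an allowed value is returned as-is
```

Note that match.arg() also accepts unambiguous partial matches, which may or may not be what you want in a command line script.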

Now let’s work through from the simplest case to the most flexible case.

Using existing docker image and command

If you already have a Docker image in mind that provides the functionality you need, you can just use it. The baseCommand is the command line you want to execute in that container. stdout specifies the file in which to capture the standard output, so it can be collected on the platform.

In this simple example, I know the Docker image “rocker/r-base” provides R, so I can call the function runif directly on the command line with Rscript -e. I want the output captured from stdout into a file, and I ask the file system to collect the files matching “*.txt”. Please pay attention to this: your tool may produce many intermediate files in the current folder, and if you don’t declare which outputs you need, they will all be ignored. So make sure you collect those files via the outputs parameter.

library(sevenbridges)
rbx <- Tool(id = "runif", 
            label = "runif",
            hints = requirements(docker(pull = "rocker/r-base")), 
            baseCommand = "Rscript -e 'runif(100)'", 
            stdout = "output.txt",
            outputs = output(id = "random", glob = "*.txt"))

rbx
sbg:id: runif
id: '#runif'
inputs: []
outputs:
- type:
  - 'null'
  - File
  label: ''
  description: ''
  streamable: no
  default: ''
  id: '#random'
  outputBinding:
    glob: '*.txt'
requirements: []
hints:
- class: DockerRequirement
  dockerPull: rocker/r-base
label: runif
class: CommandLineTool
baseCommand:
- Rscript -e 'runif(100)'
arguments: []
stdout: output.txt
rbx$toJSON()
{"sbg:id":"runif","id":"#runif","inputs":[],"outputs":[{"type":["null","File"],"label":"","description":"","streamable":false,"default":"","id":"#random","outputBinding":{"glob":"*.txt"}}],"requirements":[],"hints":[{"class":"DockerRequirement","dockerPull":"rocker/r-base"}],"label":"runif","class":"CommandLineTool","baseCommand":["Rscript -e 'runif(100)'"],"arguments":[],"stdout":"output.txt"} 

By default the tool object is shown as YAML, but you can convert it to JSON and copy it into the graphical tool editor on the Seven Bridges platform by importing the JSON.

rbx$toJSON()
{"sbg:id":"runif","id":"#runif","inputs":[],"outputs":[{"type":["null","File"],"label":"","description":"","streamable":false,"default":"","id":"#random","outputBinding":{"glob":"*.txt"}}],"requirements":[],"hints":[{"class":"DockerRequirement","dockerPull":"rocker/r-base"}],"label":"runif","class":"CommandLineTool","baseCommand":["Rscript -e 'runif(100)'"],"arguments":[],"stdout":"output.txt"} 
rbx$toJSON(pretty = TRUE)
{
  "sbg:id": "runif",
  "id": "#runif",
  "inputs": [],
  "outputs": [
    {
      "type": ["null", "File"],
      "label": "",
      "description": "",
      "streamable": false,
      "default": "",
      "id": "#random",
      "outputBinding": {
        "glob": "*.txt"
      }
    }
  ],
  "requirements": [],
  "hints": [
    {
      "class": "DockerRequirement",
      "dockerPull": "rocker/r-base"
    }
  ],
  "label": "runif",
  "class": "CommandLineTool",
  "baseCommand": [
    "Rscript -e 'runif(100)'"
  ],
  "arguments": [],
  "stdout": "output.txt"
} 
rbx$toYAML()
[1] "sbg:id: runif\nid: '#runif'\ninputs: []\noutputs:\n- type:\n  - 'null'\n  - File\n  label: ''\n  description: ''\n  streamable: no\n  default: ''\n  id: '#random'\n  outputBinding:\n    glob: '*.txt'\nrequirements: []\nhints:\n- class: DockerRequirement\n  dockerPull: rocker/r-base\nlabel: runif\nclass: CommandLineTool\nbaseCommand:\n- Rscript -e 'runif(100)'\narguments: []\nstdout: output.txt\n"
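To get a feel for which files a glob such as "*.txt" would pick up, you can try the pattern locally. This is only a base R sketch using glob2rx from the utils package; the file names are made up, and the platform’s own glob matching may differ in details.

```r
## Hypothetical file names a tool might leave in its working directory
files <- c("output.txt", "notes.txt", "report.html", "tmp.log")

## glob2rx() converts a shell-style wildcard into a regular expression
pattern <- glob2rx("*.txt")

## keep only the file names that match the pattern
collected <- files[grepl(pattern, files)]
collected
```

Here both text files match, which is a reminder that a broad pattern like "*.txt" collects every matching file, not only the one named in stdout.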

Add customized script to existing docker image

Now you may want to run your own R script, but you still don’t want to create a new command line tool and a new Docker image. You just want to run your script with new input files in an existing container. It’s time to introduce fileDef. You can either write the script directly as a string or import an R file as the content, and then provide it via requirements.

## Make a new file
fd <- fileDef(name = "runif.R",
              content = "set.seed(1)
                   runif(100)")

## or read the script from a file with readr
.srcfile <- system.file("docker/sevenbridges/src/runif.R", package = "sevenbridges")
library(readr)
fd <- fileDef(name = "runif.R",
              content = read_file(.srcfile))

## add script to your tool
rbx <- Tool(id = "runif", 
            label = "runif",
            hints = requirements(docker(pull = "rocker/r-base")),
            requirements = requirements(fd),
            baseCommand = "Rscript runif.R",
            stdout = "output.txt",
            outputs = output(id = "random", glob = "*.txt"))   
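If you would rather not depend on readr, the script text can also be read with base R. This is a small sketch under the assumption that the content only needs to be a single character string; the temporary file stands in for your own script.

```r
## Write a small script to a temporary file (stand-in for your own script)
path <- tempfile(fileext = ".R")
writeLines(c("set.seed(1)", "runif(100)"), path)

## Read the whole file into one newline-separated string
script_text <- paste(readLines(path), collapse = "\n")
```

Unlike read_file, this drops the trailing newline of the file, which should not matter for an R script.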

How about multiple scripts?

## read one script from a file and write the other one inline
.srcfile <- system.file("docker/sevenbridges/src/runif.R", package = "sevenbridges")
library(readr)
fd1 <- fileDef(name = "runif.R",
              content = read_file(.srcfile))
fd2 <- fileDef(name = "runif2.R",
              content = "set.seed(1)
                   runif(100)")

rbx <- Tool(id = "runif_twoscript", 
            label = "runif_twoscript",
            hints = requirements(docker(pull = "rocker/r-base")),
            requirements = requirements(fd1, fd2),
            baseCommand = "Rscript runif.R",
            stdout = "output.txt",
            outputs = output(id = "random", glob = "*.txt"))   

Create formal interface for your command line

In all the examples above, many parameters are hard-coded in the script, so you have no flexibility to control, for example, how many numbers to generate. Most often, your command line tools expose some input arguments to users. You need a better way to describe a command line with its inputs and outputs.

Now we bring the example to the next level. Suppose I have prepared a Docker image called “tengfei/runif” on Docker Hub, and this container has an executable command called “runif.R”. You don’t have to know what’s inside; you only have to know that when you run the command in that container, it looks like this:

runif.R --n=100 --max=100 --min=1 --seed=123 

This command outputs two files directly, so you don’t need standard output to capture the random numbers.

  • output.txt
  • report.html

So the goal here is to describe this command, expose all its input parameters, and collect both files.

To define input, you can specify

  • id: unique identifier for this input node.
  • description: description, also visible in the UI.
  • type: required; specifies the input type, for example file, integer, or character.
  • label: human readable label for this input node.
  • prefix: the prefix on the command line for this input parameter.
  • default: default value for this input.
  • required: whether this input parameter is required. If it is, you have to provide a value for the parameter when you execute the tool.
  • cmdInclude: whether the input is included in the command line.

Outputs are similar. In particular, when you want to collect files, you can use glob for pattern matching.

## pass a input list
in.lst <- list(input(id = "number",
                     description = "number of observations",
                     type = "integer",
                     label = "number",
                     prefix = "--n",
                     default = 1,
                     required = TRUE, 
                     cmdInclude = TRUE),
               input(id = "min",
                     description = "lower limits of the distribution",
                     type = "float",
                     label = "min",
                     prefix = "--min",
                     default = 0),
               input(id = "max",
                     description = "upper limits of the distribution",
                     type = "float",
                     label = "max",
                     prefix = "--max",
                     default = 1),
               input(id = "seed",
                     description = "seed with set.seed",
                     type = "float",
                     label = "seed",
                     prefix = "--seed",
                     default = 1))


## the same method for outputs
out.lst <- list(output(id = "random",
                       type = "file",
                       label = "output", 
                       description = "random number file",
                       glob = "*.txt"),
                output(id = "report",
                       type = "file",
                       label = "report", 
                       glob = "*.html"))


rbx <- Tool(id = "runif",
            label = "Random number generator",
            hints = requirements(docker(pull = "tengfei/runif")),
            baseCommand = "runif.R",
            inputs = in.lst, ## or in.df
            outputs = out.lst)

Alternatively, you can use a data.frame for inputs and outputs, as in the following example, but it’s less flexible.

in.df <- data.frame(id = c("number", "min", "max", "seed"),
                    description = c("number of observation", 
                                    "lower limits of the distribution",
                                    "upper limits of the distribution",
                                    "seed with set.seed"),
                    type = c("integer", "float", "float", "float"),
                    label = c("number" ,"min", "max", "seed"), 
                    prefix = c("--n", "--min", "--max", "--seed"),
                    default = c(1, 0, 10, 123), 
                    required = c(TRUE, FALSE, FALSE, FALSE))

out.df <- data.frame(id = c("random", "report"),
                     type = c("file", "file"),
                     glob = c("*.txt", "*.html"))

rbx <- Tool(id = "runif",
            label = "Random number generator",
            hints = requirements(docker(pull = "tengfei/runif"), 
                                 cpu(1), mem(2000)),
            baseCommand = "runif.R",
            inputs = in.df, ## or in.lst
            outputs = out.df)
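One way to picture the data.frame form is that each row plays the role of one input() call, with the columns as its arguments. The base R sketch below only illustrates that correspondence; it is not necessarily how the package converts a data.frame internally, and the two-row table is a shortened, hypothetical version of in.df.

```r
## A shortened, hypothetical version of the input table
in.df <- data.frame(id = c("number", "min"),
                    type = c("integer", "float"),
                    prefix = c("--n", "--min"),
                    required = c(TRUE, FALSE),
                    stringsAsFactors = FALSE)

## Turn each row into a named list: one list per input
rows <- lapply(seq_len(nrow(in.df)), function(i) as.list(in.df[i, ]))
str(rows[[1]])
```

Since every row shares the same columns, fields that apply to only one input are awkward to express, which is one plausible reading of why the list form is more flexible.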

Quick command line interface with commandArgs (position and named args)

Now you may be wondering: I have a Docker container with R, but I don’t have any existing command line that I can use directly. Can I provide a script with a quick, formal command line interface to make an app for an existing container? The answer is yes. When you add a script to your tool, you can use a few tricks to do so. One popular trick you may already have heard of is commandArgs. A more formal one is called “docopt”, which I will show you later.

Suppose you have an R script “runif2spin.R” that takes three arguments by position:

  1. numbers
  2. min
  3. max

My base command will be something like

Rscript runif2spin.R 10 30 50

This is how you do it in your R script:

fl <- system.file("docker/sevenbridges/src", "runif2spin.R", package = "sevenbridges")
cat(readLines(fl), sep = '\n')
#'---
#'title: "Uniform randome number generator example"
#'output:
#'    html_document:
#'    toc: true
#'number_sections: true
#'highlight: haddock
#'---
    
#'## summary report
#'
#'This is a randome number generator

#+
args <- commandArgs(TRUE)

r <- runif(n = as.integer(args[1]),
           min = as.numeric(args[2]),
           max = as.numeric(args[3]))
head(r)
summary(r)
hist(r)

Ignore the comment part for now; I will introduce spin/stitch later.

Then describe the tool like this, adding the script as you learned in the previous sections.

library(readr)
fd <- fileDef(name = "runif.R",
              content = read_file(fl))

rbx <- Tool(id = "runif", 
            label = "runif",
            hints = requirements(docker(pull = "rocker/r-base"), 
                                 cpu(1), mem(2000)),
            requirements = requirements(fd),
            baseCommand = "Rscript runif.R",
            stdout = "output.txt",
            inputs = list(input(id = "number",
                                type = "integer",
                                position = 1),
                          input(id = "min",
                                type = "float",
                                position = 2),
                          input(id = "max",
                                type = "float",
                                position = 3)),
            outputs = output(id = "random", glob = "output.txt"))   
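The position values decide the order in which the inputs appear after the base command. The base R sketch below mimics that ordering for illustration only; the list of inputs and their values are hypothetical, and the actual command line is assembled by the executor, not by this code.

```r
## Hypothetical inputs, deliberately listed out of order
inputs <- list(list(id = "max",    position = 3, value = 50),
               list(id = "number", position = 1, value = 10),
               list(id = "min",    position = 2, value = 30))

## Sort by position, then append the values to the base command
pos  <- vapply(inputs, function(x) x$position, numeric(1))
vals <- vapply(inputs[order(pos)], function(x) as.character(x$value),
               character(1))
cmd  <- paste(c("Rscript runif.R", vals), collapse = " ")
cmd
```

The result matches the positional command shown earlier, with number first, then min, then max.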

How about named arguments? I still recommend using the “docopt” package, but here is a simple way. Suppose you want the command line to look like this:

Rscript runif_args.R --n=10 --min=30 --max=50

Here is how you do it in the R script:

fl <- system.file("docker/sevenbridges/src", "runif_args.R", package = "sevenbridges")
cat(readLines(fl), sep = '\n')
#'---
#'title: "Uniform randome number generator example"
#'output:
#'    html_document:
#'    toc: true
#'number_sections: true
#'highlight: haddock
#'---

#'## summary report
#'
#'This is a randome number generator

#+
args <- commandArgs(TRUE)

## quick hack to split named arguments
splitArgs <- function(x){
    res <- do.call(rbind, lapply(x, function(i){
        res <- strsplit(i, "=")[[1]] 
        nm <- gsub("-+", "",res[1])
        c(nm, res[2])
    }))
    .r <- res[,2]
    names(.r) <- res[,1]
    .r
}
args <- splitArgs(args)

#+
r <- runif(n = as.integer(args["n"]),
           min = as.numeric(args["min"]),
           max = as.numeric(args["max"]))
summary(r)
hist(r)
write.csv(r, file = "out.csv")
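To see what the splitArgs hack returns, you can call it on a hand-written argument vector instead of commandArgs. The function body below is copied from the script above; the example arguments are assumptions.

```r
## splitArgs as defined in the script above
splitArgs <- function(x){
    res <- do.call(rbind, lapply(x, function(i){
        res <- strsplit(i, "=")[[1]]
        nm <- gsub("-+", "", res[1])
        c(nm, res[2])
    }))
    .r <- res[,2]
    names(.r) <- res[,1]
    .r
}

## Simulate what commandArgs(TRUE) would return for
## Rscript runif_args.R --n=10 --min=30 --max=50
args <- splitArgs(c("--n=10", "--min=30", "--max=50"))
args

## Values are character strings and must be converted before use
n <- as.integer(args["n"])
```

Two limitations of this hack are worth knowing: gsub("-+", "", ...) removes every hyphen in the name, not only the leading ones, and a value that itself contains "=" is cut at the first "=".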

Then describe the tool like this. Note that I use separate = FALSE and add = to the prefix as a hack.

library(readr)
fd <- fileDef(name = "runif.R",
              content = read_file(fl))

rbx <- Tool(id = "runif", 
            label = "runif",
            hints = requirements(docker(pull = "rocker/r-base"), 
                                 cpu(1), mem(2000)),
            requirements = requirements(fd),
            baseCommand = "Rscript runif.R",
            stdout = "output.txt",
            inputs = list(input(id = "number",
                                type = "integer",
                                separate = FALSE,
                                prefix = "--n="),
                          input(id = "min",
                                type = "float",
                                separate = FALSE,
                                prefix = "--min="),
                          input(id = "max",
                                type = "float",
                                separate = FALSE,
                                prefix = "--max=")),
            outputs = output(id = "random", glob = "output.txt"))   
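The effect of separate can be sketched in a few lines of base R. With separate = TRUE the prefix and the value are two command line tokens; with separate = FALSE they are joined into one, which is why adding = to the prefix yields --n=10. The helper render_arg is hypothetical and only imitates this behaviour for illustration.

```r
## Hypothetical helper imitating how a prefix and a value are combined
render_arg <- function(prefix, value, separate = TRUE) {
  if (separate) c(prefix, as.character(value)) else paste0(prefix, value)
}

two_tokens <- render_arg("--n", 10)          # prefix and value kept apart
one_token  <- render_arg("--n=", 10, FALSE)  # prefix and value joined

## Assemble the full command for the three inputs described above
cmd <- paste(c("Rscript runif.R",
               render_arg("--n=", 10, FALSE),
               render_arg("--min=", 30, FALSE),
               render_arg("--max=", 50, FALSE)),
             collapse = " ")
cmd
```

The joined form is what the splitArgs function in the script expects, since it splits each argument on "=".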

docopt: a better and formal way to make command line interface

Generate reports

Quick report: Spin and Stitch

You can use spin/stitch from knitr to generate a report directly from an R script written in a special format. For example, let’s reuse the script from above:

fl <- system.file("docker/sevenbridges/src", "runif_args.R", package = "sevenbridges")
cat(readLines(fl), sep = '\n')
#'---
#'title: "Uniform randome number generator example"
#'output:
#'    html_document:
#'    toc: true
#'number_sections: true
#'highlight: haddock
#'---

#'## summary report
#'
#'This is a randome number generator

#+
args <- commandArgs(TRUE)

## quick hack to split named arguments
splitArgs <- function(x){
    res <- do.call(rbind, lapply(x, function(i){
        res <- strsplit(i, "=")[[1]] 
        nm <- gsub("-+", "",res[1])
        c(nm, res[2])
    }))
    .r <- res[,2]
    names(.r) <- res[,1]
    .r
}
args <- splitArgs(args)

#+
r <- runif(n = as.integer(args["n"]),
           min = as.numeric(args["min"]),
           max = as.numeric(args["max"]))
summary(r)
hist(r)
write.csv(r, file = "out.csv")

Your command is something like this:

Rscript -e "rmarkdown::render(knitr::spin('runif_args.R', FALSE))" --args --n=100 --min=30 --max=50

I then describe my tool as follows, using the docker image rocker/hadleyverse, which contains the knitr and rmarkdown packages.

library(readr)
fd <- fileDef(name = "runif.R",
              content = read_file(fl))

rbx <- Tool(id = "runif", 
            label = "runif",
            hints = requirements(docker(pull = "rocker/hadleyverse"), 
                                 cpu(1), mem(2000)),
            requirements = requirements(fd),
            baseCommand = "Rscript -e \"rmarkdown::render(knitr::spin('runif.R', FALSE))\" --args",
            stdout = "output.txt",
            inputs = list(input(id = "number",
                                type = "integer",
                                separate = FALSE,
                                prefix = "--n="),
                          input(id = "min",
                                type = "float",
                                separate = FALSE,
                                prefix = "--min="),
                          input(id = "max",
                                type = "float",
                                separate = FALSE,
                                prefix = "--max=")),
            outputs = list(output(id = "stdout", type = "file", glob = "output.txt"),
                           output(id = "random", type = "file", glob = "*.csv"),
                           output(id = "report", type = "file", glob = "*.html")))

You will get a report at the end.

Misc

Inherit metadata and additional metadata

If you want your output files to inherit metadata from a particular input file, use inheritMetadataFrom in your output() call and pass the input file id. To add additional metadata, pass a list to the metadata argument of output(). For example, I want my output report to inherit all metadata from my “bam_file” input node (which does not exist in this example) and to carry two additional metadata fields.

out.lst <- list(output(id = "random",
                       type = "file",
                       label = "output", 
                       description = "random number file",
                       glob = "*.txt"),
                output(id = "report",
                       type = "file",
                       label = "report", 
                       glob = "*.html",
                       inheritMetadataFrom = "bam_file",
                       metadata = list(author = "tengfei",
                                       sample = "random")))
out.lst
[[1]]
type:
- 'null'
- File
label: output
description: random number file
streamable: no
default: ''
id: '#random'
outputBinding:
  glob: '*.txt'


[[2]]
type:
- 'null'
- File
label: report
description: ''
streamable: no
default: ''
id: '#report'
outputBinding:
  glob: '*.html'
  sbg:inheritMetadataFrom: '#bam_file'
  sbg:metadata:
    author: tengfei
    sample: random
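
As a rough mental model of what the two settings on the report output amount to, the following base R sketch merges inherited and additional metadata. Everything here is illustrative: input_metadata is a made-up metadata list for the hypothetical “bam_file” input, and the rule that explicitly supplied fields override inherited ones is an assumption, not documented platform behaviour.

```r
# Hypothetical metadata carried by the input file (illustrative values only).
input_metadata <- list(sample = "s1", library = "lib-01")

# Additional fields, mirroring metadata = list(...) in the output() call.
extra_metadata <- list(author = "tengfei", sample = "random")

# Assumed precedence: explicitly supplied fields override inherited ones.
output_metadata <- modifyList(input_metadata, extra_metadata)
str(output_metadata)
```

Under that assumption the report would keep the inherited library field, gain an author field, and have its sample field overwritten.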

Example with file/files as input node

fl <- system.file("docker/rnaseqGene/rabix", "generator.R", package = "sevenbridges")
cat(readLines(fl), sep = '\n')
library(sevenbridges)

rbx <- Tool(id = "rnaseqGene", 
            label = "rnaseqgene",
            description = "An RNA-seq Differential Expression Flow and Report",
            hints = requirements(docker(pull = "tengfei/rnaseqgene"), cpu(1), mem(2000)), 
            baseCommand = "performDE.R", 
            inputs = list(
                input(
                    id = "bamfiles", label = "bam files",
                    description = "a list of bam files",
                    type = "File...",  ## or type = ItemArray("File")
                    prefix = "--bamfiles",
                    required = TRUE,
                    itemSeparator = ","
                ), 
                input(
                    id = "design", label = "design matrix",
                    type = "File",
                    required = TRUE,
                    prefix = "--design"
                ),
                input(
                    id = "gtffile", label =  "gene feature files",
                    type = "File",
                    stageInput = "copy",
                    required = TRUE,
                    prefix = "--gtffile"
                ),
                input(
                    id = "format", label =  "report format html or pdf",
                    type = enum("format", c("pdf", "html")),
                    prefix = "--format"
                )
            ),
            outputs = list(
                output(id = "report", label = "report", 
                       description = "A reproducible report created by Rmarkdown",
                       glob = Expression(engine = "#cwl-js-engine",
                                         script = "x = $job[['inputs']][['format']];
                                                  if(x == 'undefined' || x == null){
                                                   x = 'html';
                                                    };
                                                  'rnaseqGene.' +  x")),
                output(id = "heatmap", label = "heatmap", 
                       description = "A heatmap plot to show the Euclidean distance between samples",
                       glob = "heatmap.pdf"),
                output(id = "count", label = "count", 
                       description = "Reads counts matrix",
                       glob = "count.csv"),
                output(id = "de", label = "Differential expression table", 
                       description = "Differential expression table",
                       glob = "de.csv")
                ))

fl <- "inst/docker/rnaseqGene/rabix/rnaseqGene.json"
write(rbx$toJSON(pretty = TRUE), fl)

Note the stageInput example in the script above: you can set it to “copy” or “link”.
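
The glob Expression on the “report” output in the script above chooses the file name from the optional “format” input and falls back to html when it is not set. The same logic can be sketched in plain R; report_glob is a hypothetical helper written only for illustration and is not part of the sevenbridges package.

```r
# Hypothetical helper mirroring the JavaScript glob expression:
# use the requested format, or default to "html" when none is given.
report_glob <- function(format = NULL) {
    if (is.null(format)) {
        format <- "html"
    }
    paste0("rnaseqGene.", format)
}

report_glob()        # format input not set
report_glob("pdf")   # format input set to "pdf"
```

Writing the rule out this way makes it easy to check which file name the output node will look for with each setting of the “format” input.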

Input node batch mode

Batch by File

f1 = system.file("extdata/app", "flow_star.json", package = "sevenbridges")
f1 = convert_app(f1)
f1$set_batch("sjdbGTFfile", type = "ITEM")
sbg:validationErrors: []
sbg:sbgMaintained: no
sbg:latestRevision: 2
sbg:toolAuthor: Seven Bridges Genomics
sbg:createdOn: 1463601910
sbg:categories:
- Alignment
- RNA
sbg:contributors:
- tengfei
sbg:project: tengfei/quickstart
sbg:createdBy: tengfei
sbg:toolkitVersion: 2.4.2a
sbg:id: tengfei/quickstart/rna-seq-alignment-star-demo/2
sbg:license: Apache License 2.0
sbg:revision: 2
sbg:modifiedOn: 1463601974
sbg:modifiedBy: tengfei
sbg:revisionsInfo:
- sbg:modifiedBy: tengfei
  sbg:modifiedOn: 1463601910
  sbg:revision: 0
- sbg:modifiedBy: tengfei
  sbg:modifiedOn: 1463601952
  sbg:revision: 1
- sbg:modifiedBy: tengfei
  sbg:modifiedOn: 1463601974
  sbg:revision: 2
sbg:toolkit: STAR
id: '#tengfei/quickstart/rna-seq-alignment-star-demo/2'
inputs:
- type:
  - 'null'
  - items: File
    type: array
  label: sjdbGTFfile
  streamable: no
  id: '#sjdbGTFfile'
  sbg:x: 160.4999759
  sbg:y: 195.0833106
  required: no
- type:
  - items: File
    type: array
  label: fastq
  streamable: no
  id: '#fastq'
  sbg:x: 164.2499914
  sbg:y: 323.7499502
  sbg:includeInPorts: yes
  required: yes
- type:
  - File
  label: genomeFastaFiles
  streamable: no
  id: '#genomeFastaFiles'
  sbg:x: 167.7499601
  sbg:y: 469.9999106
  required: yes
- type:
  - 'null'
  - string
  label: Exons' parents name
  description: Tag name to be used as exons’ transcript-parents.
  streamable: no
  id: '#sjdbGTFtagExonParentTranscript'
  sbg:category: Splice junctions db parameters
  sbg:x: 200.0
  sbg:y: 350.0
  sbg:toolDefaultValue: transcript_id
  required: no
- type:
  - 'null'
  - string
  label: Gene name
  description: Tag name to be used as exons’ gene-parents.
  streamable: no
  id: '#sjdbGTFtagExonParentGene'
  sbg:category: Splice junctions db parameters
  sbg:x: 200.0
  sbg:y: 400.0
  sbg:toolDefaultValue: gene_id
  required: no
- type:
  - 'null'
  - int
  label: Max loci anchors
  description: Max number of loci anchors are allowed to map to (int>0).
  streamable: no
  id: '#winAnchorMultimapNmax'
  sbg:category: Windows, Anchors, Binning
  sbg:x: 200.0
  sbg:y: 450.0
  sbg:toolDefaultValue: '50'
  required: no
- type:
  - 'null'
  - int
  label: Max bins between anchors
  description: Max number of bins between two anchors that allows aggregation of anchors
    into one window (int>0).
  streamable: no
  id: '#winAnchorDistNbins'
  sbg:category: Windows, Anchors, Binning
  sbg:x: 200.0
  sbg:y: 500.0
  sbg:toolDefaultValue: '9'
  required: no
outputs:
- type:
  - 'null'
  - items: File
    type: array
  label: unmapped_reads
  streamable: no
  id: '#unmapped_reads'
  source: '#STAR.unmapped_reads'
  sbg:x: 766.2497863
  sbg:y: 159.5833091
  sbg:includeInPorts: yes
  required: no
- type:
  - 'null'
  - File
  label: transcriptome_aligned_reads
  streamable: no
  id: '#transcriptome_aligned_reads'
  source: '#STAR.transcriptome_aligned_reads'
  sbg:x: 1118.9998003
  sbg:y: 86.5833216
  sbg:includeInPorts: yes
  required: no
- type:
  - 'null'
  - File
  label: splice_junctions
  streamable: no
  id: '#splice_junctions'
  source: '#STAR.splice_junctions'
  sbg:x: 1282.3330177
  sbg:y: 167.499976
  sbg:includeInPorts: yes
  required: no
- type:
  - 'null'
  - File
  label: reads_per_gene
  streamable: no
  id: '#reads_per_gene'
  source: '#STAR.reads_per_gene'
  sbg:x: 1394.4163557
  sbg:y: 245.749964
  sbg:includeInPorts: yes
  required: no
- type:
  - 'null'
  - items: File
    type: array
  label: log_files
  streamable: no
  id: '#log_files'
  source: '#STAR.log_files'
  sbg:x: 1505.0830269
  sbg:y: 322.9999518
  sbg:includeInPorts: yes
  required: no
- type:
  - 'null'
  - File
  label: chimeric_junctions
  streamable: no
  id: '#chimeric_junctions'
  source: '#STAR.chimeric_junctions'
  sbg:x: 1278.7498062
  sbg:y: 446.7499567
  sbg:includeInPorts: yes
  required: no
- type:
  - 'null'
  - File
  label: intermediate_genome
  streamable: no
  id: '#intermediate_genome'
  source: '#STAR.intermediate_genome'
  sbg:x: 1408.9164783
  sbg:y: 386.0832876
  sbg:includeInPorts: yes
  required: no
- type:
  - 'null'
  - File
  label: chimeric_alignments
  streamable: no
  id: '#chimeric_alignments'
  source: '#STAR.chimeric_alignments'
  sbg:x: 1147.5831348
  sbg:y: 503.2499285
  sbg:includeInPorts: yes
  required: no
- type:
  - 'null'
  - File
  label: sorted_bam
  streamable: no
  id: '#sorted_bam'
  source: '#Picard_SortSam.sorted_bam'
  sbg:x: 934.2498228
  sbg:y: 557.2498436
  sbg:includeInPorts: yes
  required: no
- type:
  - 'null'
  - File
  label: result
  streamable: no
  id: '#result'
  source: '#SBG_FASTQ_Quality_Detector.result'
  sbg:x: 1431.6666548
  sbg:y: 644.9999898
  sbg:includeInPorts: yes
  required: no
requirements:
- class: CreateFileRequirement
  fileDef: []
hints:
- class: sbg:AWSInstanceType
  value: c3.8xlarge
label: RNA-seq Alignment - STAR
description: "Alignment to a reference genome and transcriptome presents the first
  step of RNA-Seq analysis. This pipeline uses STAR, an ultrafast RNA-seq aligner
  capable of mapping full length RNA sequences and detecting de novo canonical junctions,
  non-canonical splices, and chimeric (fusion) transcripts. It is optimized for mammalian
  sequence reads, but fine tuning of its parameters enables customization to satisfy
  unique needs.\n\nSTAR accepts one file per sample (or two files for paired-end data).
  \ \nSplice junction annotations can optionally be collected from splice junction
  databases. Set the \"Overhang length\" parameter to a value larger than zero in
  order to use splice junction databases. For constant read length, this value should
  (ideally) be equal to mate length decreased by 1; for long reads with non-constant
  length, this value should be 100 (pipeline default). \nFastQC Analysis on FASTQ
  files reveals read length distribution. STAR can detect chimeric transcripts, but
  parameter \"Min segment length\" in \"Chimeric Alignments\" category must be adjusted
  to a desired minimum chimeric segment length. Aligned reads are reported in BAM
  format and can be viewed in a genome browser (such as IGV). A file containing detected
  splice junctions is also produced.\n\nUnmapped reads are reported in FASTQ format
  and can be included in an output BAM file. The \"Output unmapped reads\" and \"Write
  unmapped in SAM\" parameters enable unmapped output type selection."
class: Workflow
steps:
- id: '#STAR_Genome_Generate'
  inputs:
  - id: '#STAR_Genome_Generate.sjdbScore'
  - id: '#STAR_Genome_Generate.sjdbOverhang'
  - id: '#STAR_Genome_Generate.sjdbGTFtagExonParentTranscript'
    source: '#sjdbGTFtagExonParentTranscript'
  - id: '#STAR_Genome_Generate.sjdbGTFtagExonParentGene'
    source: '#sjdbGTFtagExonParentGene'
  - id: '#STAR_Genome_Generate.sjdbGTFfile'
    source: '#sjdbGTFfile'
  - id: '#STAR_Genome_Generate.sjdbGTFfeatureExon'
  - id: '#STAR_Genome_Generate.sjdbGTFchrPrefix'
  - id: '#STAR_Genome_Generate.genomeSAsparseD'
  - id: '#STAR_Genome_Generate.genomeSAindexNbases'
  - id: '#STAR_Genome_Generate.genomeFastaFiles'
    source: '#genomeFastaFiles'
  - id: '#STAR_Genome_Generate.genomeChrBinNbits'
  outputs:
  - id: '#STAR_Genome_Generate.genome'
  hints: []
  run:
    sbg:validationErrors: []
    sbg:sbgMaintained: no
    sbg:latestRevision: 1
    sbg:job:
      allocatedResources:
        mem: 60000
        cpu: 15
      inputs:
        sjdbScore: 0
        sjdbGTFfeatureExon: sjdbGTFfeatureExon
        sjdbOverhang: 0
        sjdbGTFtagExonParentTranscript: sjdbGTFtagExonParentTranscript
        genomeChrBinNbits: genomeChrBinNbits
        genomeSAsparseD: 0
        sjdbGTFfile:
        - size: 0
          secondaryFiles: []
          class: File
          path: /demo/test-files/chr20.gtf
        sjdbGTFtagExonParentGene: sjdbGTFtagExonParentGene
        genomeFastaFiles:
          size: 0
          secondaryFiles: []
          class: File
          path: /sbgenomics/test-data/chr20.fa
        sjdbGTFchrPrefix: sjdbGTFchrPrefix
        genomeSAindexNbases: 0
    sbg:toolAuthor: Alexander Dobin/CSHL
    sbg:createdOn: 1450911469
    sbg:categories:
    - Alignment
    sbg:contributors:
    - bix-demo
    sbg:links:
    - id: https://github.com/alexdobin/STAR
      label: Homepage
    - id: https://github.com/alexdobin/STAR/releases
      label: Releases
    - id: https://github.com/alexdobin/STAR/blob/master/doc/STARmanual.pdf
      label: Manual
    - id: https://groups.google.com/forum/#!forum/rna-star
      label: Support
    - id: http://www.ncbi.nlm.nih.gov/pubmed/23104886
      label: Publication
    sbg:project: bix-demo/star-2-4-2a-demo
    sbg:createdBy: bix-demo
    sbg:toolkitVersion: 2.4.2a
    sbg:id: sevenbridges/public-apps/star-genome-generate/1
    sbg:license: GNU General Public License v3.0 only
    sbg:revision: 1
    sbg:cmdPreview: mkdir genomeDir && /opt/STAR --runMode genomeGenerate --genomeDir
      ./genomeDir --runThreadN 15 --genomeFastaFiles /sbgenomics/test-data/chr20.fa
      --genomeChrBinNbits genomeChrBinNbits --genomeSAindexNbases 0 --genomeSAsparseD
      0 --sjdbGTFfeatureExon sjdbGTFfeatureExon --sjdbGTFtagExonParentTranscript sjdbGTFtagExonParentTranscript
      --sjdbGTFtagExonParentGene sjdbGTFtagExonParentGene --sjdbOverhang 0 --sjdbScore
      0 --sjdbGTFchrPrefix sjdbGTFchrPrefix  --sjdbGTFfile /demo/test-files/chr20.gtf  &&
      tar -vcf genome.tar ./genomeDir /sbgenomics/test-data/chr20.fa
    sbg:modifiedOn: 1450911470
    sbg:modifiedBy: bix-demo
    sbg:revisionsInfo:
    - sbg:modifiedBy: bix-demo
      sbg:modifiedOn: 1450911469
      sbg:revision: 0
    - sbg:modifiedBy: bix-demo
      sbg:modifiedOn: 1450911470
      sbg:revision: 1
    sbg:toolkit: STAR
    id: sevenbridges/public-apps/star-genome-generate/1
    inputs:
    - type:
      - 'null'
      - int
      label: Extra alignment score
      description: Extra alignment score for alignments that cross database junctions.
      streamable: no
      id: '#sjdbScore'
      inputBinding:
        position: 0
        prefix: --sjdbScore
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Splice junctions db parameters
      sbg:includeInPorts: yes
      sbg:toolDefaultValue: '2'
      required: no
    - type:
      - 'null'
      - int
      label: '"Overhang" length'
      description: Length of the donor/acceptor sequence on each side of the junctions,
        ideally = (mate_length - 1) (int >= 0), if int = 0, splice junction database
        is not used.
      streamable: no
      id: '#sjdbOverhang'
      inputBinding:
        position: 0
        prefix: --sjdbOverhang
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Splice junctions db parameters
      sbg:includeInPorts: yes
      sbg:toolDefaultValue: '100'
      required: no
    - type:
      - 'null'
      - string
      label: Exons' parents name
      description: Tag name to be used as exons’ transcript-parents.
      streamable: no
      id: '#sjdbGTFtagExonParentTranscript'
      inputBinding:
        position: 0
        prefix: --sjdbGTFtagExonParentTranscript
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Splice junctions db parameters
      sbg:toolDefaultValue: transcript_id
      required: no
    - type:
      - 'null'
      - string
      label: Gene name
      description: Tag name to be used as exons’ gene-parents.
      streamable: no
      id: '#sjdbGTFtagExonParentGene'
      inputBinding:
        position: 0
        prefix: --sjdbGTFtagExonParentGene
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Splice junctions db parameters
      sbg:toolDefaultValue: gene_id
      required: no
    - type:
      - 'null'
      - items: File
        type: array
      label: Splice junction file
      description: Gene model annotations and/or known transcripts.
      streamable: no
      id: '#sjdbGTFfile'
      sbg:category: Basic
      sbg:fileTypes: GTF, GFF, TXT
      required: no
    - type:
      - 'null'
      - string
      label: Set exons feature
      description: Feature type in GTF file to be used as exons for building transcripts.
      streamable: no
      id: '#sjdbGTFfeatureExon'
      inputBinding:
        position: 0
        prefix: --sjdbGTFfeatureExon
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Splice junctions db parameters
      sbg:toolDefaultValue: exon
      required: no
    - type:
      - 'null'
      - string
      label: Chromosome names
      description: Prefix for chromosome names in a GTF file (e.g. 'chr' for using
        ENSMEBL annotations with UCSC geneomes).
      streamable: no
      id: '#sjdbGTFchrPrefix'
      inputBinding:
        position: 0
        prefix: --sjdbGTFchrPrefix
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Splice junctions db parameters
      sbg:toolDefaultValue: '-'
      required: no
    - type:
      - 'null'
      - int
      label: Suffux array sparsity
      description: 'Distance between indices: use bigger numbers to decrease needed
        RAM at the cost of mapping speed reduction (int>0).'
      streamable: no
      id: '#genomeSAsparseD'
      inputBinding:
        position: 0
        prefix: --genomeSAsparseD
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Genome generation parameters
      sbg:toolDefaultValue: '1'
      required: no
    - type:
      - 'null'
      - int
      label: Pre-indexing string length
      description: Length (bases) of the SA pre-indexing string. Typically between
        10 and 15. Longer strings will use much more memory, but allow faster searches.
        For small genomes, this number needs to be scaled down, with a typical value
        of min(14, log2(GenomeLength)/2 - 1). For example, for 1 megaBase genome,
        this is equal to 9, for 100 kiloBase genome, this is equal to 7.
      streamable: no
      id: '#genomeSAindexNbases'
      inputBinding:
        position: 0
        prefix: --genomeSAindexNbases
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Genome generation parameters
      sbg:toolDefaultValue: '14'
      required: no
    - type:
      - File
      label: Genome fasta files
      description: Reference sequence to which to align the reads.
      streamable: no
      id: '#genomeFastaFiles'
      inputBinding:
        position: 0
        prefix: --genomeFastaFiles
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Basic
      sbg:fileTypes: FASTA, FA
      required: yes
    - type:
      - 'null'
      - string
      label: Bins size
      description: 'Set log2(chrBin), where chrBin is the size (bits) of the bins
        for genome storage: each chromosome will occupy an integer number of bins.
        If you are using a genome with a large (>5,000) number of chrosomes/scaffolds,
        you may need to reduce this number to reduce RAM consumption. The following
        scaling is recomended: genomeChrBinNbits = min(18, log2(GenomeLength/NumberOfReferences)).
        For example, for 3 gigaBase genome with 100,000 chromosomes/scaffolds, this
        is equal to 15.'
      streamable: no
      id: '#genomeChrBinNbits'
      inputBinding:
        position: 0
        prefix: --genomeChrBinNbits
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Genome generation parameters
      sbg:toolDefaultValue: '18'
      required: no
    outputs:
    - type:
      - 'null'
      - File
      label: Genome Files
      description: Genome files comprise binary genome sequence, suffix arrays, text
        chromosome names/lengths, splice junctions coordinates, and transcripts/genes
        information.
      streamable: no
      id: '#genome'
      outputBinding:
        glob: '*.tar'
      sbg:fileTypes: TAR
    requirements:
    - class: ExpressionEngineRequirement
      id: '#cwl-js-engine'
      requirements:
      - class: DockerRequirement
        dockerPull: rabix/js-engine
    hints:
    - class: DockerRequirement
      dockerPull: images.sbgenomics.com/ana_d/star:2.4.2a
      dockerImageId: a4b0ad2c3cae
    - class: sbg:CPURequirement
      value: 15
    - class: sbg:MemRequirement
      value: 60000
    label: STAR Genome Generate
    description: STAR Genome Generate is a tool that generates genome index files.
      One set of files should be generated per each genome/annotation combination.
      Once produced, these files could be used as long as genome/annotation combination
      stays the same. Also, STAR Genome Generate which produced these files and STAR
      aligner using them must be the same toolkit version.
    class: CommandLineTool
    arguments:
    - position: 99
      separate: yes
      valueFrom: '&& tar -vcf genome.tar ./genomeDir'
    - position: 0
      separate: yes
      valueFrom:
        engine: '#cwl-js-engine'
        script: "{\t\n  var sjFormat = \"False\"\n  var gtfgffFormat = \"False\"\n
          \ var list = $job.inputs.sjdbGTFfile\n  var paths_list = []\n  var joined_paths
          = \"\"\n  \n  if (list) {\n    list.forEach(function(f){return paths_list.push(f.path)})\n
          \   joined_paths = paths_list.join(\" \")\n\n\n    paths_list.forEach(function(f){\n
          \     ext = f.replace(/^.*\\./, '')\n      if (ext == \"gff\" || ext ==
          \"gtf\") {\n        gtfgffFormat = \"True\"\n        return gtfgffFormat\n
          \     }\n      if (ext == \"txt\") {\n        sjFormat = \"True\"\n        return
          sjFormat\n      }\n    })\n\n    if ($job.inputs.sjdbGTFfile && $job.inputs.sjdbInsertSave
          != \"None\") {\n      if (sjFormat == \"True\") {\n        return \"--sjdbFileChrStartEnd
          \".concat(joined_paths)\n      }\n      else if (gtfgffFormat == \"True\")
          {\n        return \"--sjdbGTFfile \".concat(joined_paths)\n      }\n    }\n
          \ }\n}"
        class: Expression
    stdin: ''
    stdout: ''
    successCodes: []
    temporaryFailCodes: []
    x: 384.0832266
    'y': 446.4998957
  sbg:x: 100.0
  sbg:y: 200.0
- id: '#SBG_FASTQ_Quality_Detector'
  inputs:
  - id: '#SBG_FASTQ_Quality_Detector.fastq'
    source: '#fastq'
  outputs:
  - id: '#SBG_FASTQ_Quality_Detector.result'
  hints: []
  run:
    sbg:validationErrors: []
    sbg:sbgMaintained: no
    sbg:latestRevision: 3
    sbg:job:
      allocatedResources:
        mem: 1000
        cpu: 1
      inputs:
        fastq:
          size: 0
          secondaryFiles: []
          class: File
          path: /path/to/fastq.ext
    sbg:toolAuthor: Seven Bridges Genomics
    sbg:createdOn: 1450911312
    sbg:categories:
    - FASTQ-Processing
    sbg:contributors:
    - bix-demo
    sbg:project: bix-demo/sbgtools-demo
    sbg:createdBy: bix-demo
    sbg:id: sevenbridges/public-apps/sbg-fastq-quality-detector/3
    sbg:license: Apache License 2.0
    sbg:revision: 3
    sbg:cmdPreview: python /opt/sbg_fastq_quality_scale_detector.py --fastq /path/to/fastq.ext
      /path/to/fastq.ext
    sbg:modifiedOn: 1450911314
    sbg:modifiedBy: bix-demo
    sbg:revisionsInfo:
    - sbg:modifiedBy: bix-demo
      sbg:modifiedOn: 1450911312
      sbg:revision: 0
    - sbg:modifiedBy: bix-demo
      sbg:modifiedOn: 1450911314
      sbg:revision: 3
    - sbg:modifiedBy: bix-demo
      sbg:modifiedOn: 1450911313
      sbg:revision: 1
    - sbg:modifiedBy: bix-demo
      sbg:modifiedOn: 1450911313
      sbg:revision: 2
    sbg:toolkit: SBGTools
    id: sevenbridges/public-apps/sbg-fastq-quality-detector/3
    inputs:
    - type:
      - File
      label: Fastq
      description: FASTQ file.
      streamable: no
      id: '#fastq'
      inputBinding:
        position: 0
        prefix: --fastq
        separate: yes
        sbg:cmdInclude: yes
      required: yes
    outputs:
    - type:
      - 'null'
      - File
      label: Result
      description: Source FASTQ file with updated metadata.
      streamable: no
      id: '#result'
      outputBinding:
        glob: '*.fastq'
      sbg:fileTypes: FASTQ
    requirements:
    - class: CreateFileRequirement
      fileDef: []
    hints:
    - class: DockerRequirement
      dockerPull: images.sbgenomics.com/tziotas/sbg_fastq_quality_scale_detector:1.0
      dockerImageId: ''
    - class: sbg:CPURequirement
      value: 1
    - class: sbg:MemRequirement
      value: 1000
    label: SBG FASTQ Quality Detector
    description: FASTQ Quality Scale Detector detects which quality encoding scheme
      was used in your reads and automatically enters the proper value in the "Quality
      Scale" metadata field.
    class: CommandLineTool
    arguments: []
    stdin: ''
    stdout: ''
    successCodes: []
    temporaryFailCodes: []
    x: 375.3333179
    'y': 323.5833156
  sbg:x: 300.0
  sbg:y: 200.0
- id: '#Picard_SortSam'
  inputs:
  - id: '#Picard_SortSam.validation_stringency'
    default: SILENT
  - id: '#Picard_SortSam.sort_order'
    default: Coordinate
  - id: '#Picard_SortSam.quiet'
  - id: '#Picard_SortSam.output_type'
  - id: '#Picard_SortSam.memory_per_job'
  - id: '#Picard_SortSam.max_records_in_ram'
  - id: '#Picard_SortSam.input_bam'
    source: '#STAR.aligned_reads'
  - id: '#Picard_SortSam.create_index'
    default: 'True'
  - id: '#Picard_SortSam.compression_level'
  outputs:
  - id: '#Picard_SortSam.sorted_bam'
  hints: []
  run:
    sbg:validationErrors: []
    sbg:sbgMaintained: no
    sbg:latestRevision: 2
    sbg:job:
      allocatedResources:
        mem: 2048
        cpu: 1
      inputs:
        sort_order: Coordinate
        input_bam:
          path: /root/dir/example.tested.bam
        memory_per_job: 2048
        output_type: ~
        create_index: ~
    sbg:toolAuthor: Broad Institute
    sbg:createdOn: 1450911168
    sbg:categories:
    - SAM/BAM-Processing
    sbg:contributors:
    - bix-demo
    sbg:links:
    - id: http://broadinstitute.github.io/picard/index.html
      label: Homepage
    - id: https://github.com/broadinstitute/picard/releases/tag/1.138
      label: Source Code
    - id: http://broadinstitute.github.io/picard/
      label: Wiki
    - id: https://github.com/broadinstitute/picard/zipball/master
      label: Download
    - id: http://broadinstitute.github.io/picard/
      label: Publication
    sbg:project: bix-demo/picard-1-140-demo
    sbg:createdBy: bix-demo
    sbg:toolkitVersion: '1.140'
    sbg:id: sevenbridges/public-apps/picard-sortsam-1-140/2
    sbg:license: MIT License, Apache 2.0 Licence
    sbg:revision: 2
    sbg:cmdPreview: java -Xmx2048M -jar /opt/picard-tools-1.140/picard.jar SortSam
      OUTPUT=example.tested.sorted.bam INPUT=/root/dir/example.tested.bam SORT_ORDER=coordinate   INPUT=/root/dir/example.tested.bam
      SORT_ORDER=coordinate  /root/dir/example.tested.bam
    sbg:modifiedOn: 1450911170
    sbg:modifiedBy: bix-demo
    sbg:revisionsInfo:
    - sbg:modifiedBy: bix-demo
      sbg:modifiedOn: 1450911168
      sbg:revision: 0
    - sbg:modifiedBy: bix-demo
      sbg:modifiedOn: 1450911169
      sbg:revision: 1
    - sbg:modifiedBy: bix-demo
      sbg:modifiedOn: 1450911170
      sbg:revision: 2
    sbg:toolkit: Picard
    id: sevenbridges/public-apps/picard-sortsam-1-140/2
    inputs:
    - type:
      - 'null'
      - name: validation_stringency
        symbols:
        - STRICT
        - LENIENT
        - SILENT
        type: enum
      label: Validation stringency
      description: 'Validation stringency for all SAM files read by this program.
        Setting stringency to SILENT can improve performance when processing a BAM
        file in which variable-length data (read, qualities, tags) do not otherwise
        need to be decoded. This option can be set to ''null'' to clear the default
        value. Possible values: {STRICT, LENIENT, SILENT}.'
      streamable: no
      id: '#validation_stringency'
      inputBinding:
        position: 0
        prefix: VALIDATION_STRINGENCY=
        separate: no
        valueFrom:
          engine: '#cwl-js-engine'
          script: |-
            {
              if ($job.inputs.validation_stringency)
              {
                return $job.inputs.validation_stringency
              }
              else
              {
                return "SILENT"
              }
            }
          class: Expression
        sbg:cmdInclude: yes
      sbg:category: Other input types
      sbg:toolDefaultValue: SILENT
      required: no
    - type:
      - name: sort_order
        symbols:
        - Unsorted
        - Queryname
        - Coordinate
        type: enum
      label: Sort order
      description: 'Sort order of the output file. Possible values: {unsorted, queryname,
        coordinate}.'
      streamable: no
      id: '#sort_order'
      inputBinding:
        position: 3
        prefix: SORT_ORDER=
        separate: no
        valueFrom:
          engine: '#cwl-js-engine'
          script: |-
            {
              p = $job.inputs.sort_order.toLowerCase()
              return p
            }
          class: Expression
        sbg:cmdInclude: yes
      sbg:category: Other input types
      sbg:toolDefaultValue: Coordinate
      sbg:altPrefix: SO
      required: yes
    - type:
      - 'null'
      - name: quiet
        symbols:
        - 'True'
        - 'False'
        type: enum
      label: Quiet
      description: 'This parameter indicates whether to suppress job-summary info
        on System.err. This option can be set to ''null'' to clear the default value.
        Possible values: {true, false}.'
      streamable: no
      id: '#quiet'
      inputBinding:
        position: 0
        prefix: QUIET=
        separate: no
        sbg:cmdInclude: yes
      sbg:category: Other input types
      sbg:toolDefaultValue: 'False'
      required: no
    - type:
      - 'null'
      - name: output_type
        symbols:
        - BAM
        - SAM
        - SAME AS INPUT
        type: enum
      label: Output format
      description: Since Picard tools can output both SAM and BAM files, user can
        choose the format of the output file.
      streamable: no
      id: '#output_type'
      sbg:category: Other input types
      sbg:toolDefaultValue: SAME AS INPUT
      required: no
    - type:
      - 'null'
      - int
      label: Memory per job
      description: Amount of RAM to be used per job. Defaults to 2048 MB for
        single threaded jobs.
      streamable: no
      id: '#memory_per_job'
      sbg:toolDefaultValue: '2048'
      required: no
    - type:
      - 'null'
      - int
      label: Max records in RAM
      description: When writing SAM files that need to be sorted, this parameter will
        specify the number of records stored in RAM before spilling to disk. Increasing
        this number reduces the number of file handles needed to sort a SAM file,
        and increases the amount of RAM needed. This option can be set to 'null' to
        clear the default value.
      streamable: no
      id: '#max_records_in_ram'
      inputBinding:
        position: 0
        prefix: MAX_RECORDS_IN_RAM=
        separate: no
        sbg:cmdInclude: yes
      sbg:category: Other input types
      sbg:toolDefaultValue: '500000'
      required: no
    - type:
      - File
      label: Input BAM
      description: The BAM or SAM file to sort.
      streamable: no
      id: '#input_bam'
      inputBinding:
        position: 1
        prefix: INPUT=
        separate: no
        sbg:cmdInclude: yes
      sbg:category: File inputs
      sbg:fileTypes: BAM, SAM
      sbg:altPrefix: I
      required: yes
    - type:
      - 'null'
      - name: create_index
        symbols:
        - 'True'
        - 'False'
        type: enum
      label: Create index
      description: 'This parameter indicates whether to create a BAM index when writing
        a coordinate-sorted BAM file. This option can be set to ''null'' to clear
        the default value. Possible values: {true, false}.'
      streamable: no
      id: '#create_index'
      inputBinding:
        position: 5
        prefix: CREATE_INDEX=
        separate: no
        sbg:cmdInclude: yes
      sbg:category: Other input types
      sbg:toolDefaultValue: 'False'
      required: no
    - type:
      - 'null'
      - int
      label: Compression level
      description: Compression level for all compressed files created (e.g. BAM and
        GELI). This option can be set to 'null' to clear the default value.
      streamable: no
      id: '#compression_level'
      inputBinding:
        position: 0
        prefix: COMPRESSION_LEVEL=
        separate: no
        sbg:cmdInclude: yes
      sbg:category: Other input types
      sbg:toolDefaultValue: '5'
      required: no
    outputs:
    - type:
      - 'null'
      - File
      label: Sorted BAM/SAM
      description: Sorted BAM or SAM file.
      streamable: no
      id: '#sorted_bam'
      outputBinding:
        glob: '*.sorted.?am'
      sbg:fileTypes: BAM, SAM
    requirements:
    - class: ExpressionEngineRequirement
      id: '#cwl-js-engine'
      requirements:
      - class: DockerRequirement
        dockerPull: rabix/js-engine
      engineCommand: cwl-engine.js
    hints:
    - class: DockerRequirement
      dockerPull: images.sbgenomics.com/mladenlsbg/picard:1.140
      dockerImageId: eab0e70b6629
    - class: sbg:CPURequirement
      value: 1
    - class: sbg:MemRequirement
      value:
        engine: '#cwl-js-engine'
        script: "{\n  if($job.inputs.memory_per_job){\n  \treturn $job.inputs.memory_per_job\n
          \ }\n  \treturn 2048\n}"
        class: Expression
    label: Picard SortSam
    description: Picard SortSam sorts the input SAM or BAM. Input and output formats
      are determined by the file extension.
    class: CommandLineTool
    arguments:
    - position: 0
      prefix: OUTPUT=
      separate: no
      valueFrom:
        engine: '#cwl-js-engine'
        script: "{\n  filename = $job.inputs.input_bam.path\n  ext = $job.inputs.output_type\n\nif
          (ext === \"BAM\")\n{\n    return filename.split('.').slice(0, -1).concat(\"sorted.bam\").join(\".\").replace(/^.*[\\\\\\/]/,
          '')\n    }\n\nelse if (ext === \"SAM\")\n{\n    return filename.split('.').slice(0,
          -1).concat(\"sorted.sam\").join('.').replace(/^.*[\\\\\\/]/, '')\n}\n\nelse
          \n{\n\treturn filename.split('.').slice(0, -1).concat(\"sorted.\"+filename.split('.').slice(-1)[0]).join(\".\").replace(/^.*[\\\\\\/]/,
          '')\n}\n}"
        class: Expression
    - position: 1000
      separate: yes
      valueFrom:
        engine: '#cwl-js-engine'
        script: "{\n  filename = $job.inputs.input_bam.path\n  \n  /* figuring out
          output file type */\n  ext = $job.inputs.output_type\n  if (ext === \"BAM\")\n
          \ {\n    out_extension = \"BAM\"\n  }\n  else if (ext === \"SAM\")\n  {\n
          \   out_extension = \"SAM\"\n  }\n  else \n  {\n\tout_extension = filename.split('.').slice(-1)[0].toUpperCase()\n
          \ }  \n  \n  /* if exist moving .bai in bam.bai */\n  if ($job.inputs.create_index
          === 'True' && $job.inputs.sort_order === 'Coordinate' && out_extension ==
          \"BAM\")\n  {\n    \n    old_name = filename.split('.').slice(0, -1).concat('sorted.bai').join('.').replace(/^.*[\\\\\\/]/,
          '')\n    new_name = filename.split('.').slice(0, -1).concat('sorted.bam.bai').join('.').replace(/^.*[\\\\\\/]/,
          '')\n    return \"; mv \" + \" \" + old_name + \" \" + new_name\n  }\n\n}"
        class: Expression
    stdin: ''
    stdout: ''
    successCodes: []
    temporaryFailCodes: []
    x: 773.0831807
    'y': 470.9165939
  sbg:x: 500.0
  sbg:y: 200.0
- id: '#STAR'
  inputs:
  - id: '#STAR.winFlankNbins'
  - id: '#STAR.winBinNbits'
  - id: '#STAR.winAnchorMultimapNmax'
    source: '#winAnchorMultimapNmax'
  - id: '#STAR.winAnchorDistNbins'
    source: '#winAnchorDistNbins'
  - id: '#STAR.twopassMode'
  - id: '#STAR.twopass1readsN'
  - id: '#STAR.sjdbScore'
  - id: '#STAR.sjdbOverhang'
    default: 100
  - id: '#STAR.sjdbInsertSave'
  - id: '#STAR.sjdbGTFtagExonParentTranscript'
  - id: '#STAR.sjdbGTFtagExonParentGene'
  - id: '#STAR.sjdbGTFfile'
    source: '#sjdbGTFfile'
  - id: '#STAR.sjdbGTFfeatureExon'
  - id: '#STAR.sjdbGTFchrPrefix'
  - id: '#STAR.seedSearchStartLmaxOverLread'
  - id: '#STAR.seedSearchStartLmax'
  - id: '#STAR.seedSearchLmax'
  - id: '#STAR.seedPerWindowNmax'
  - id: '#STAR.seedPerReadNmax'
  - id: '#STAR.seedNoneLociPerWindow'
  - id: '#STAR.seedMultimapNmax'
  - id: '#STAR.scoreStitchSJshift'
  - id: '#STAR.scoreInsOpen'
  - id: '#STAR.scoreInsBase'
  - id: '#STAR.scoreGenomicLengthLog2scale'
  - id: '#STAR.scoreGapNoncan'
  - id: '#STAR.scoreGapGCAG'
  - id: '#STAR.scoreGapATAC'
  - id: '#STAR.scoreGap'
  - id: '#STAR.scoreDelOpen'
  - id: '#STAR.scoreDelBase'
  - id: '#STAR.rg_seq_center'
  - id: '#STAR.rg_sample_id'
  - id: '#STAR.rg_platform_unit_id'
  - id: '#STAR.rg_platform'
  - id: '#STAR.rg_mfl'
  - id: '#STAR.rg_library_id'
  - id: '#STAR.reads'
    source: '#SBG_FASTQ_Quality_Detector.result'
  - id: '#STAR.readMatesLengthsIn'
  - id: '#STAR.readMapNumber'
  - id: '#STAR.quantTranscriptomeBan'
  - id: '#STAR.quantMode'
    default: TranscriptomeSAM
  - id: '#STAR.outSortingType'
    default: SortedByCoordinate
  - id: '#STAR.outSJfilterReads'
  - id: '#STAR.outSJfilterOverhangMin'
  - id: '#STAR.outSJfilterIntronMaxVsReadN'
  - id: '#STAR.outSJfilterDistToOtherSJmin'
  - id: '#STAR.outSJfilterCountUniqueMin'
  - id: '#STAR.outSJfilterCountTotalMin'
  - id: '#STAR.outSAMunmapped'
  - id: '#STAR.outSAMtype'
    default: BAM
  - id: '#STAR.outSAMstrandField'
  - id: '#STAR.outSAMreadID'
  - id: '#STAR.outSAMprimaryFlag'
  - id: '#STAR.outSAMorder'
  - id: '#STAR.outSAMmode'
  - id: '#STAR.outSAMmapqUnique'
  - id: '#STAR.outSAMheaderPG'
  - id: '#STAR.outSAMheaderHD'
  - id: '#STAR.outSAMflagOR'
  - id: '#STAR.outSAMflagAND'
  - id: '#STAR.outSAMattributes'
  - id: '#STAR.outReadsUnmapped'
    default: Fastx
  - id: '#STAR.outQSconversionAdd'
  - id: '#STAR.outFilterType'
  - id: '#STAR.outFilterScoreMinOverLread'
  - id: '#STAR.outFilterScoreMin'
  - id: '#STAR.outFilterMultimapScoreRange'
  - id: '#STAR.outFilterMultimapNmax'
  - id: '#STAR.outFilterMismatchNoverReadLmax'
  - id: '#STAR.outFilterMismatchNoverLmax'
  - id: '#STAR.outFilterMismatchNmax'
  - id: '#STAR.outFilterMatchNminOverLread'
  - id: '#STAR.outFilterMatchNmin'
  - id: '#STAR.outFilterIntronMotifs'
  - id: '#STAR.limitSjdbInsertNsj'
  - id: '#STAR.limitOutSJoneRead'
  - id: '#STAR.limitOutSJcollapsed'
  - id: '#STAR.limitBAMsortRAM'
  - id: '#STAR.genomeDirName'
  - id: '#STAR.genome'
    source: '#STAR_Genome_Generate.genome'
  - id: '#STAR.clip5pNbases'
  - id: '#STAR.clip3pNbases'
  - id: '#STAR.clip3pAfterAdapterNbases'
  - id: '#STAR.clip3pAdapterSeq'
  - id: '#STAR.clip3pAdapterMMp'
  - id: '#STAR.chimSegmentMin'
  - id: '#STAR.chimScoreSeparation'
  - id: '#STAR.chimScoreMin'
  - id: '#STAR.chimScoreJunctionNonGTAG'
  - id: '#STAR.chimScoreDropMax'
  - id: '#STAR.chimOutType'
  - id: '#STAR.chimJunctionOverhangMin'
  - id: '#STAR.alignWindowsPerReadNmax'
  - id: '#STAR.alignTranscriptsPerWindowNmax'
  - id: '#STAR.alignTranscriptsPerReadNmax'
  - id: '#STAR.alignSplicedMateMapLminOverLmate'
  - id: '#STAR.alignSplicedMateMapLmin'
  - id: '#STAR.alignSoftClipAtReferenceEnds'
  - id: '#STAR.alignSJoverhangMin'
  - id: '#STAR.alignSJDBoverhangMin'
  - id: '#STAR.alignMatesGapMax'
  - id: '#STAR.alignIntronMin'
  - id: '#STAR.alignIntronMax'
  - id: '#STAR.alignEndsType'
  outputs:
  - id: '#STAR.unmapped_reads'
  - id: '#STAR.transcriptome_aligned_reads'
  - id: '#STAR.splice_junctions'
  - id: '#STAR.reads_per_gene'
  - id: '#STAR.log_files'
  - id: '#STAR.intermediate_genome'
  - id: '#STAR.chimeric_junctions'
  - id: '#STAR.chimeric_alignments'
  - id: '#STAR.aligned_reads'
  hints: []
  run:
    sbg:validationErrors: []
    sbg:sbgMaintained: no
    sbg:latestRevision: 4
    sbg:job:
      allocatedResources:
        mem: 60000
        cpu: 15
      inputs:
        alignWindowsPerReadNmax: 0
        outSAMheaderPG: outSAMheaderPG
        GENOME_DIR_NAME: ''
        outFilterMatchNminOverLread: 0
        rg_platform_unit_id: rg_platform_unit
        alignTranscriptsPerReadNmax: 0
        readMapNumber: 0
        alignSplicedMateMapLminOverLmate: 0
        alignMatesGapMax: 0
        outFilterMultimapNmax: 0
        clip5pNbases:
        - 0
        outSAMstrandField: None
        readMatesLengthsIn: NotEqual
        outSAMattributes: Standard
        seedMultimapNmax: 0
        rg_mfl: rg_mfl
        chimSegmentMin: 0
        winAnchorDistNbins: 0
        outSortingType: SortedByCoordinate
        outFilterMultimapScoreRange: 0
        sjdbInsertSave: Basic
        clip3pAfterAdapterNbases:
        - 0
        scoreDelBase: 0
        outFilterMatchNmin: 0
        twopass1readsN: 0
        outSAMunmapped: None
        genome:
          size: 0
          secondaryFiles: []
          class: File
          path: genome.ext
        sjdbGTFtagExonParentTranscript: ''
        limitBAMsortRAM: 0
        alignEndsType: Local
        seedNoneLociPerWindow: 0
        rg_sample_id: rg_sample
        sjdbGTFtagExonParentGene: ''
        chimScoreMin: 0
        outSJfilterIntronMaxVsReadN:
        - 0
        twopassMode: Basic
        alignSplicedMateMapLmin: 0
        outSJfilterReads: All
        outSAMprimaryFlag: OneBestScore
        outSJfilterCountTotalMin:
        - 3
        - 1
        - 1
        - 1
        outSAMorder: Paired
        outSAMflagAND: 0
        chimScoreSeparation: 0
        alignSJoverhangMin: 0
        outFilterScoreMin: 0
        seedSearchStartLmax: 0
        scoreGapGCAG: 0
        scoreGenomicLengthLog2scale: 0
        outFilterIntronMotifs: None
        outFilterMismatchNmax: 0
        reads:
        - size: 0
          secondaryFiles: []
          class: File
          metadata:
            format: fastq
            paired_end: '1'
            seq_center: illumina
          path: /test-data/mate_1.fastq.bz2
        scoreGap: 0
        outSJfilterOverhangMin:
        - 30
        - 12
        - 12
        - 12
        outSAMflagOR: 0
        outSAMmode: Full
        rg_library_id: ''
        chimScoreJunctionNonGTAG: 0
        scoreInsOpen: 0
        clip3pAdapterSeq:
        - clip3pAdapterSeq
        chimScoreDropMax: 0
        outFilterType: Normal
        scoreGapATAC: 0
        rg_platform: Ion Torrent PGM
        clip3pAdapterMMp:
        - 0
        sjdbGTFfeatureExon: ''
        outQSconversionAdd: 0
        quantMode: TranscriptomeSAM
        alignIntronMin: 0
        scoreInsBase: 0
        scoreGapNoncan: 0
        seedSearchLmax: 0
        outSJfilterDistToOtherSJmin:
        - 0
        outFilterScoreMinOverLread: 0
        alignSJDBoverhangMin: 0
        limitOutSJcollapsed: 0
        winAnchorMultimapNmax: 0
        outFilterMismatchNoverLmax: 0
        rg_seq_center: ''
        outSAMheaderHD: outSAMheaderHD
        chimOutType: Within
        quantTranscriptomeBan: IndelSoftclipSingleend
        limitOutSJoneRead: 0
        alignTranscriptsPerWindowNmax: 0
        sjdbOverhang: ~
        outReadsUnmapped: Fastx
        scoreStitchSJshift: 0
        seedPerWindowNmax: 0
        outSJfilterCountUniqueMin:
        - 3
        - 1
        - 1
        - 1
        scoreDelOpen: 0
        sjdbGTFfile:
        - path: /demo/test-data/chr20.gtf
        clip3pNbases:
        - 0
        - 3
        winBinNbits: 0
        sjdbScore: ~
        seedSearchStartLmaxOverLread: 0
        alignIntronMax: 0
        seedPerReadNmax: 0
        outFilterMismatchNoverReadLmax: 0
        winFlankNbins: 0
        sjdbGTFchrPrefix: chrPrefix
        alignSoftClipAtReferenceEnds: 'Yes'
        outSAMreadID: Standard
        outSAMtype: BAM
        chimJunctionOverhangMin: 0
        limitSjdbInsertNsj: 0
        outSAMmapqUnique: 0
    sbg:toolAuthor: Alexander Dobin/CSHL
    sbg:createdOn: 1450911471
    sbg:categories:
    - Alignment
    sbg:contributors:
    - ana_d
    - bix-demo
    - uros_sipetic
    sbg:links:
    - id: https://github.com/alexdobin/STAR
      label: Homepage
    - id: https://github.com/alexdobin/STAR/releases
      label: Releases
    - id: https://github.com/alexdobin/STAR/blob/master/doc/STARmanual.pdf
      label: Manual
    - id: https://groups.google.com/forum/#!forum/rna-star
      label: Support
    - id: http://www.ncbi.nlm.nih.gov/pubmed/23104886
      label: Publication
    sbg:project: bix-demo/star-2-4-2a-demo
    sbg:createdBy: bix-demo
    sbg:toolkitVersion: 2.4.2a
    sbg:id: sevenbridges/public-apps/star/4
    sbg:license: GNU General Public License v3.0 only
    sbg:revision: 4
    sbg:cmdPreview: tar -xvf genome.ext && /opt/STAR --runThreadN 15  --readFilesCommand
      bzcat  --sjdbGTFfile /demo/test-data/chr20.gtf  --sjdbGTFchrPrefix chrPrefix
      --sjdbInsertSave Basic  --twopass1readsN 0  --chimOutType WithinBAM  --outSAMattrRGline
      ID:1 CN:illumina PI:rg_mfl PL:Ion_Torrent_PGM PU:rg_platform_unit SM:rg_sample  --quantMode
      TranscriptomeSAM --outFileNamePrefix ./mate_1.fastq.bz2.  --readFilesIn /test-data/mate_1.fastq.bz2  &&
      tar -vcf mate_1.fastq.bz2._STARgenome.tar ./mate_1.fastq.bz2._STARgenome  &&
      mv mate_1.fastq.bz2.Unmapped.out.mate1 mate_1.fastq.bz2.Unmapped.out.mate1.fastq
    sbg:modifiedOn: 1462889222
    sbg:modifiedBy: ana_d
    sbg:revisionsInfo:
    - sbg:modifiedBy: bix-demo
      sbg:modifiedOn: 1450911471
      sbg:revision: 0
    - sbg:modifiedBy: bix-demo
      sbg:modifiedOn: 1450911473
      sbg:revision: 1
    - sbg:modifiedBy: bix-demo
      sbg:modifiedOn: 1450911475
      sbg:revision: 2
    - sbg:modifiedBy: uros_sipetic
      sbg:modifiedOn: 1462878528
      sbg:revision: 3
    - sbg:modifiedBy: ana_d
      sbg:modifiedOn: 1462889222
      sbg:revision: 4
    sbg:toolkit: STAR
    id: sevenbridges/public-apps/star/4
    inputs:
    - type:
      - 'null'
      - int
      label: Flanking regions size
      description: =log2(winFlank), where winFlank is the size of the left and right
        flanking regions for each window (int>0).
      streamable: no
      id: '#winFlankNbins'
      inputBinding:
        position: 0
        prefix: --winFlankNbins
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Windows, Anchors, Binning
      sbg:includeInPorts: yes
      sbg:toolDefaultValue: '4'
      required: no
    - type:
      - 'null'
      - int
      label: Bin size
      description: =log2(winBin), where winBin is the size of the bin for the windows/clustering,
        each window will occupy an integer number of bins (int>0).
      streamable: no
      id: '#winBinNbits'
      inputBinding:
        position: 0
        prefix: --winBinNbits
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Windows, Anchors, Binning
      sbg:includeInPorts: yes
      sbg:toolDefaultValue: '16'
      required: no
    - type:
      - 'null'
      - int
      label: Max loci anchors
      description: Max number of loci anchors are allowed to map to (int>0).
      streamable: no
      id: '#winAnchorMultimapNmax'
      inputBinding:
        position: 0
        prefix: --winAnchorMultimapNmax
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Windows, Anchors, Binning
      sbg:toolDefaultValue: '50'
      required: no
    - type:
      - 'null'
      - int
      label: Max bins between anchors
      description: Max number of bins between two anchors that allows aggregation
        of anchors into one window (int>0).
      streamable: no
      id: '#winAnchorDistNbins'
      inputBinding:
        position: 0
        prefix: --winAnchorDistNbins
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Windows, Anchors, Binning
      sbg:toolDefaultValue: '9'
      required: no
    - type:
      - 'null'
      - name: twopassMode
        symbols:
        - None
        - Basic
        type: enum
      label: Two-pass mode
      description: '2-pass mapping mode. None: 1-pass mapping; Basic: basic 2-pass
        mapping, with all 1st pass junctions inserted into the genome indices on the
        fly.'
      streamable: no
      id: '#twopassMode'
      inputBinding:
        position: 0
        prefix: --twopassMode
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: 2-pass mapping
      sbg:toolDefaultValue: None
      required: no
    - type:
      - 'null'
      - int
      label: Reads to process in 1st step
      description: 'Number of reads to process for the 1st step. 0: 1-step only, no
        2nd pass; use very large number to map all reads in the first step (int>0).'
      streamable: no
      id: '#twopass1readsN'
      sbg:category: 2-pass mapping
      sbg:toolDefaultValue: '-1'
      required: no
    - type:
      - 'null'
      - int
      label: Extra alignment score
      description: Extra alignment score for alignments that cross database junctions.
      streamable: no
      id: '#sjdbScore'
      sbg:category: Splice junctions database
      sbg:toolDefaultValue: '2'
      required: no
    - type:
      - 'null'
      - int
      label: '"Overhang" length'
      description: Length of the donor/acceptor sequence on each side of the junctions,
        ideally = (mate_length - 1) (int >= 0). If int = 0, the splice junction database
        is not used.
      streamable: no
      id: '#sjdbOverhang'
      sbg:category: Splice junctions database
      sbg:toolDefaultValue: '100'
      required: no
    - type:
      - 'null'
      - name: sjdbInsertSave
        symbols:
        - Basic
        - All
        - None
        type: enum
      label: Save junction files
      description: 'Which files to save when sjdb junctions are inserted on the fly
        at the mapping step. None: not saving files at all; Basic: only small junction/transcript
        files; All: all files including big Genome, SA and SAindex. These files are
        output as an archive.'
      streamable: no
      id: '#sjdbInsertSave'
      sbg:category: Splice junctions database
      sbg:toolDefaultValue: None
      required: no
    - type:
      - 'null'
      - string
      label: Exons' parents name
      description: Tag name to be used as exons’ transcript-parents.
      streamable: no
      id: '#sjdbGTFtagExonParentTranscript'
      sbg:category: Splice junctions database
      sbg:toolDefaultValue: transcript_id
      required: no
    - type:
      - 'null'
      - string
      label: Gene name
      description: Tag name to be used as exons’ gene-parents.
      streamable: no
      id: '#sjdbGTFtagExonParentGene'
      sbg:category: Splice junctions database
      sbg:toolDefaultValue: gene_id
      required: no
    - type:
      - 'null'
      - items: File
        type: array
      label: Splice junction file
      description: Gene model annotations and/or known transcripts. No need to include
        this input, except in case of using "on the fly" annotations.
      streamable: no
      id: '#sjdbGTFfile'
      sbg:category: Basic
      sbg:fileTypes: GTF, GFF, TXT
      required: no
    - type:
      - 'null'
      - string
      label: Set exons feature
      description: Feature type in GTF file to be used as exons for building transcripts.
      streamable: no
      id: '#sjdbGTFfeatureExon'
      sbg:category: Splice junctions database
      sbg:toolDefaultValue: exon
      required: no
    - type:
      - 'null'
      - string
      label: Chromosome names
      description: Prefix for chromosome names in a GTF file (e.g. 'chr' for using
        ENSEMBL annotations with UCSC genomes).
      streamable: no
      id: '#sjdbGTFchrPrefix'
      sbg:category: Splice junctions database
      sbg:toolDefaultValue: '-'
      required: no
    - type:
      - 'null'
      - float
      label: Search start point normalized
      description: seedSearchStartLmax normalized to read length (sum of mates' lengths
        for paired-end reads).
      streamable: no
      id: '#seedSearchStartLmaxOverLread'
      inputBinding:
        position: 0
        prefix: --seedSearchStartLmaxOverLread
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Alignments and Seeding
      sbg:toolDefaultValue: '1.0'
      required: no
    - type:
      - 'null'
      - int
      label: Search start point
      description: Defines the search start point through the read - the read is split
        into pieces no longer than this value (int>0).
      streamable: no
      id: '#seedSearchStartLmax'
      inputBinding:
        position: 0
        prefix: --seedSearchStartLmax
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Alignments and Seeding
      sbg:toolDefaultValue: '50'
      required: no
    - type:
      - 'null'
      - int
      label: Max seed length
      description: Defines the maximum length of the seeds, if =0 max seed length
        is infinite (int>=0).
      streamable: no
      id: '#seedSearchLmax'
      inputBinding:
        position: 0
        prefix: --seedSearchLmax
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Alignments and Seeding
      sbg:toolDefaultValue: '0'
      required: no
    - type:
      - 'null'
      - int
      label: Max seeds per window
      description: Max number of seeds per window (int>=0).
      streamable: no
      id: '#seedPerWindowNmax'
      inputBinding:
        position: 0
        prefix: --seedPerWindowNmax
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Alignments and Seeding
      sbg:toolDefaultValue: '50'
      required: no
    - type:
      - 'null'
      - int
      label: Max seeds per read
      description: Max number of seeds per read (int>=0).
      streamable: no
      id: '#seedPerReadNmax'
      inputBinding:
        position: 0
        prefix: --seedPerReadNmax
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Alignments and Seeding
      sbg:toolDefaultValue: '1000'
      required: no
    - type:
      - 'null'
      - int
      label: Max one-seed loci per window
      description: Max number of one seed loci per window (int>=0).
      streamable: no
      id: '#seedNoneLociPerWindow'
      inputBinding:
        position: 0
        prefix: --seedNoneLociPerWindow
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Alignments and Seeding
      sbg:toolDefaultValue: '10'
      required: no
    - type:
      - 'null'
      - int
      label: Filter pieces for stitching
      description: Only pieces that map fewer than this value are utilized in the
        stitching procedure (int>=0).
      streamable: no
      id: '#seedMultimapNmax'
      inputBinding:
        position: 0
        prefix: --seedMultimapNmax
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Alignments and Seeding
      sbg:toolDefaultValue: '10000'
      required: no
    - type:
      - 'null'
      - int
      label: Max score reduction
      description: Maximum score reduction while searching for SJ boundaries in the
        stitching step.
      streamable: no
      id: '#scoreStitchSJshift'
      inputBinding:
        position: 0
        prefix: --scoreStitchSJshift
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Scoring
      sbg:toolDefaultValue: '1'
      required: no
    - type:
      - 'null'
      - int
      label: Insertion Open Penalty
      description: Insertion open penalty.
      streamable: no
      id: '#scoreInsOpen'
      inputBinding:
        position: 0
        prefix: --scoreInsOpen
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Scoring
      sbg:toolDefaultValue: '-2'
      required: no
    - type:
      - 'null'
      - int
      label: Insertion extension penalty
      description: Insertion extension penalty per base (in addition to --scoreInsOpen).
      streamable: no
      id: '#scoreInsBase'
      inputBinding:
        position: 0
        prefix: --scoreInsBase
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Scoring
      sbg:toolDefaultValue: '-2'
      required: no
    - type:
      - 'null'
      - float
      label: Log scaled score
      description: 'Extra score logarithmically scaled with genomic length of the
        alignment: <int>*log2(genomicLength).'
      streamable: no
      id: '#scoreGenomicLengthLog2scale'
      inputBinding:
        position: 0
        prefix: --scoreGenomicLengthLog2scale
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Scoring
      sbg:toolDefaultValue: '-0.25'
      required: no
    - type:
      - 'null'
      - int
      label: Non-canonical gap open
      description: Non-canonical gap open penalty (in addition to --scoreGap).
      streamable: no
      id: '#scoreGapNoncan'
      inputBinding:
        position: 0
        prefix: --scoreGapNoncan
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Scoring
      sbg:toolDefaultValue: '-8'
      required: no
    - type:
      - 'null'
      - int
      label: GC/AG and CT/GC gap open
      description: GC/AG and CT/GC gap open penalty (in addition to --scoreGap).
      streamable: no
      id: '#scoreGapGCAG'
      inputBinding:
        position: 0
        prefix: --scoreGapGCAG
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Scoring
      sbg:toolDefaultValue: '-4'
      required: no
    - type:
      - 'null'
      - int
      label: AT/AC and GT/AT gap open
      description: AT/AC and GT/AT gap open penalty (in addition to --scoreGap).
      streamable: no
      id: '#scoreGapATAC'
      inputBinding:
        position: 0
        prefix: --scoreGapATAC
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Scoring
      sbg:toolDefaultValue: '-8'
      required: no
    - type:
      - 'null'
      - int
      label: Gap open penalty
      description: Gap open penalty.
      streamable: no
      id: '#scoreGap'
      inputBinding:
        position: 0
        prefix: --scoreGap
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Scoring
      sbg:toolDefaultValue: '0'
      required: no
    - type:
      - 'null'
      - int
      label: Deletion open penalty
      description: Deletion open penalty.
      streamable: no
      id: '#scoreDelOpen'
      inputBinding:
        position: 0
        prefix: --scoreDelOpen
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Scoring
      sbg:toolDefaultValue: '-2'
      required: no
    - type:
      - 'null'
      - int
      label: Deletion extension penalty
      description: Deletion extension penalty per base (in addition to --scoreDelOpen).
      streamable: no
      id: '#scoreDelBase'
      inputBinding:
        position: 0
        prefix: --scoreDelBase
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Scoring
      sbg:toolDefaultValue: '-2'
      required: no
    - type:
      - 'null'
      - string
      label: Sequencing center
      description: Specify the sequencing center for RG line.
      streamable: no
      id: '#rg_seq_center'
      sbg:category: Read group
      sbg:toolDefaultValue: Inferred from metadata
      required: no
    - type:
      - 'null'
      - string
      label: Sample ID
      description: Specify the sample ID for RG line.
      streamable: no
      id: '#rg_sample_id'
      sbg:category: Read group
      sbg:toolDefaultValue: Inferred from metadata
      required: no
    - type:
      - 'null'
      - string
      label: Platform unit ID
      description: Specify the platform unit ID for RG line.
      streamable: no
      id: '#rg_platform_unit_id'
      sbg:category: Read group
      sbg:toolDefaultValue: Inferred from metadata
      required: no
    - type:
      - 'null'
      - name: rg_platform
        symbols:
        - LS 454
        - Helicos
        - Illumina
        - ABI SOLiD
        - Ion Torrent PGM
        - PacBio
        type: enum
      label: Platform
      description: Specify the version of the technology that was used for sequencing
        or assaying.
      streamable: no
      id: '#rg_platform'
      sbg:category: Read group
      sbg:toolDefaultValue: Inferred from metadata
      required: no
    - type:
      - 'null'
      - string
      label: Median fragment length
      description: Specify the median fragment length for RG line.
      streamable: no
      id: '#rg_mfl'
      sbg:category: Read group
      sbg:toolDefaultValue: Inferred from metadata
      required: no
    - type:
      - 'null'
      - string
      label: Library ID
      description: Specify the library ID for RG line.
      streamable: no
      id: '#rg_library_id'
      sbg:category: Read group
      sbg:toolDefaultValue: Inferred from metadata
      required: no
    - type:
      - items: File
        type: array
      label: Read sequence
      description: Read sequence.
      streamable: no
      id: '#reads'
      inputBinding:
        position: 10
        separate: yes
        itemSeparator: ' '
        valueFrom:
          engine: '#cwl-js-engine'
          script: "{\t\n  var list = [].concat($job.inputs.reads)\n  \n  var resp
            = []\n  \n  if (list.length == 1){\n    resp.push(list[0].path)\n    \n
            \ }else if (list.length == 2){    \n    \n    left = \"\"\n    right =
            \"\"\n      \n    for (index = 0; index < list.length; ++index) {\n      \n
            \     if (list[index].metadata != null){\n        if (list[index].metadata.paired_end
            == 1){\n          left = list[index].path\n        }else if (list[index].metadata.paired_end
            == 2){\n          right = list[index].path\n        }\n      }\n    }\n
            \   \n    if (left != \"\" && right != \"\"){      \n      resp.push(left)\n
            \     resp.push(right)\n    }\n  }\n  else if (list.length > 2){\n    left
            = []\n    right = []\n      \n    for (index = 0; index < list.length;
            ++index) {\n      \n      if (list[index].metadata != null){\n        if
            (list[index].metadata.paired_end == 1){\n          left.push(list[index].path)\n
            \       }else if (list[index].metadata.paired_end == 2){\n          right.push(list[index].path)\n
            \       }\n      }\n    }\n    left_join = left.join()\n    right_join
            = right.join()\n    if (left.length > 0 && right.length > 0){      \n      resp.push(left_join)\n
            \     resp.push(right_join)\n    }\t\n  }\n  \n  if(resp.length > 0){
            \   \n    return \"--readFilesIn \".concat(resp.join(\" \"))\n  }\n}"
          class: Expression
        sbg:cmdInclude: yes
      sbg:category: Basic
      sbg:fileTypes: FASTA, FASTQ, FA, FQ, FASTQ.GZ, FQ.GZ, FASTQ.BZ2, FQ.BZ2
      required: yes
    - type:
      - 'null'
      - name: readMatesLengthsIn
        symbols:
        - NotEqual
        - Equal
        type: enum
      label: Reads lengths
      description: Equal/Not equal - lengths of names, sequences, qualities for both
        mates are the same/not the same. "Not equal" is safe in all situations.
      streamable: no
      id: '#readMatesLengthsIn'
      inputBinding:
        position: 0
        prefix: --readMatesLengthsIn
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Read parameters
      sbg:toolDefaultValue: NotEqual
      required: no
    - type:
      - 'null'
      - int
      label: Reads to map
      description: Number of reads to map from the beginning of the file.
      streamable: no
      id: '#readMapNumber'
      inputBinding:
        position: 0
        prefix: --readMapNumber
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Read parameters
      sbg:toolDefaultValue: '-1'
      required: no
    - type:
      - 'null'
      - name: quantTranscriptomeBan
        symbols:
        - IndelSoftclipSingleend
        - Singleend
        type: enum
      label: Prohibit alignment type
      description: 'Prohibit various alignment types. IndelSoftclipSingleend: prohibit
        indels, soft clipping and single-end alignments - compatible with RSEM; Singleend:
        prohibit single-end alignments.'
      streamable: no
      id: '#quantTranscriptomeBan'
      inputBinding:
        position: 0
        prefix: --quantTranscriptomeBan
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Quantification of Annotations
      sbg:toolDefaultValue: IndelSoftclipSingleend
      required: no
    - type:
      - 'null'
      - name: quantMode
        symbols:
        - TranscriptomeSAM
        - GeneCounts
        type: enum
      label: Quantification mode
      description: Types of quantification requested. 'TranscriptomeSAM' option outputs
        SAM/BAM alignments to transcriptome into a separate file. With 'GeneCounts'
        option, STAR will count number of reads per gene while mapping.
      streamable: no
      id: '#quantMode'
      sbg:category: Quantification of Annotations
      sbg:toolDefaultValue: '-'
      required: no
    - type:
      - 'null'
      - name: outSortingType
        symbols:
        - Unsorted
        - SortedByCoordinate
        - Unsorted SortedByCoordinate
        type: enum
      label: Output sorting type
      description: Type of output sorting.
      streamable: no
      id: '#outSortingType'
      sbg:category: Output
      sbg:toolDefaultValue: SortedByCoordinate
      required: no
    - type:
      - 'null'
      - name: outSJfilterReads
        symbols:
        - All
        - Unique
        type: enum
      label: Collapsed junctions reads
      description: 'Which reads to consider for collapsed splice junctions output.
        All: all reads, unique- and multi-mappers; Unique: uniquely mapping reads
        only.'
      streamable: no
      id: '#outSJfilterReads'
      inputBinding:
        position: 0
        prefix: --outSJfilterReads
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: 'Output filtering: splice junctions'
      sbg:toolDefaultValue: All
      required: no
    - type:
      - 'null'
      - items: int
        type: array
      label: Min overhang SJ
      description: Minimum overhang length for splice junctions on both sides for
        each of the motifs. To set no output for desired motif, assign -1 to the corresponding
        field. Does not apply to annotated junctions.
      streamable: no
      id: '#outSJfilterOverhangMin'
      inputBinding:
        position: 0
        prefix: --outSJfilterOverhangMin
        separate: yes
        itemSeparator: ' '
        sbg:cmdInclude: yes
      sbg:category: 'Output filtering: splice junctions'
      sbg:toolDefaultValue: 30 12 12 12
      required: no
    - type:
      - 'null'
      - items: int
        type: array
      label: Max gap allowed
      description: 'Maximum gap allowed for junctions supported by 1,2,3...N reads
        (int >= 0) i.e. by default junctions supported by 1 read can have gaps <=50000b,
        by 2 reads: <=100000b, by 3 reads: <=200000. By 4 or more reads: any gap <=alignIntronMax.
        Does not apply to annotated junctions.'
      streamable: no
      id: '#outSJfilterIntronMaxVsReadN'
      inputBinding:
        position: 0
        prefix: --outSJfilterIntronMaxVsReadN
        separate: yes
        itemSeparator: ' '
        sbg:cmdInclude: yes
      sbg:category: 'Output filtering: splice junctions'
      sbg:toolDefaultValue: 50000 100000 200000
      required: no
    - type:
      - 'null'
      - items: int
        type: array
      label: Min distance to other donor/acceptor
      description: Minimum allowed distance to other junctions' donor/acceptor for
        each of the motifs (int >= 0). Does not apply to annotated junctions.
      streamable: no
      id: '#outSJfilterDistToOtherSJmin'
      inputBinding:
        position: 0
        prefix: --outSJfilterDistToOtherSJmin
        separate: yes
        itemSeparator: ' '
        sbg:cmdInclude: yes
      sbg:category: 'Output filtering: splice junctions'
      sbg:toolDefaultValue: 10 0 5 10
      required: no
    - type:
      - 'null'
      - items: int
        type: array
      label: Min unique count
      description: Minimum uniquely mapping read count per junction for each of the
        motifs. To set no output for desired motif, assign -1 to the corresponding
        field. Junctions are output if one of --outSJfilterCountUniqueMin OR --outSJfilterCountTotalMin
        conditions are satisfied. Does not apply to annotated junctions.
      streamable: no
      id: '#outSJfilterCountUniqueMin'
      inputBinding:
        position: 0
        prefix: --outSJfilterCountUniqueMin
        separate: yes
        itemSeparator: ' '
        sbg:cmdInclude: yes
      sbg:category: 'Output filtering: splice junctions'
      sbg:toolDefaultValue: 3 1 1 1
      required: no
    - type:
      - 'null'
      - items: int
        type: array
      label: Min total count
      description: Minimum total (multi-mapping+unique) read count per junction for
        each of the motifs. To set no output for desired motif, assign -1 to the corresponding
        field. Junctions are output if one of --outSJfilterCountUniqueMin OR --outSJfilterCountTotalMin
        conditions are satisfied. Does not apply to annotated junctions.
      streamable: no
      id: '#outSJfilterCountTotalMin'
      inputBinding:
        position: 0
        prefix: --outSJfilterCountTotalMin
        separate: yes
        itemSeparator: ' '
        sbg:cmdInclude: yes
      sbg:category: 'Output filtering: splice junctions'
      sbg:toolDefaultValue: 3 1 1 1
      required: no
    - type:
      - 'null'
      - name: outSAMunmapped
        symbols:
        - None
        - Within
        type: enum
      label: Write unmapped in SAM
      description: 'Output of unmapped reads in the SAM format. None: no output; Within:
        output unmapped reads within the main SAM file (i.e. Aligned.out.sam).'
      streamable: no
      id: '#outSAMunmapped'
      inputBinding:
        position: 0
        prefix: --outSAMunmapped
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output
      sbg:toolDefaultValue: None
      required: no
    - type:
      - 'null'
      - name: outSAMtype
        symbols:
        - SAM
        - BAM
        type: enum
      label: Output format
      description: Format of output alignments.
      streamable: no
      id: '#outSAMtype'
      inputBinding:
        position: 0
        separate: yes
        valueFrom:
          engine: '#cwl-js-engine'
          script: |-
            {
              SAM_type = $job.inputs.outSAMtype
              SORT_type = $job.inputs.outSortingType
              if (SAM_type && SORT_type) {
                return "--outSAMtype ".concat(SAM_type, " ", SORT_type)
              }
            }
          class: Expression
        sbg:cmdInclude: yes
      sbg:category: Output
      sbg:toolDefaultValue: SAM
      required: no
    - type:
      - 'null'
      - name: outSAMstrandField
        symbols:
        - None
        - intronMotif
        type: enum
      label: Strand field flag
      description: 'Cufflinks-like strand field flag. None: not used; intronMotif:
        strand derived from the intron motif. Reads with inconsistent and/or non-canonical
        introns are filtered out.'
      streamable: no
      id: '#outSAMstrandField'
      inputBinding:
        position: 0
        prefix: --outSAMstrandField
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output
      sbg:toolDefaultValue: None
      required: no
    - type:
      - 'null'
      - name: outSAMreadID
        symbols:
        - Standard
        - Number
        type: enum
      label: Read ID
      description: 'Read ID record type. Standard: first word (until space) from the
        FASTx read ID line, removing /1,/2 from the end; Number: read number (index)
        in the FASTx file.'
      streamable: no
      id: '#outSAMreadID'
      inputBinding:
        position: 0
        prefix: --outSAMreadID
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output
      sbg:toolDefaultValue: Standard
      required: no
    - type:
      - 'null'
      - name: outSAMprimaryFlag
        symbols:
        - OneBestScore
        - AllBestScore
        type: enum
      label: Primary alignments
      description: 'Which alignments are considered primary - all others will be marked
        with 0x100 bit in the FLAG. OneBestScore: only one alignment with the best
        score is primary; AllBestScore: all alignments with the best score are primary.'
      streamable: no
      id: '#outSAMprimaryFlag'
      inputBinding:
        position: 0
        prefix: --outSAMprimaryFlag
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output
      sbg:toolDefaultValue: OneBestScore
      required: no
    - type:
      - 'null'
      - name: outSAMorder
        symbols:
        - Paired
        - PairedKeepInputOrder
        type: enum
      label: Sorting in SAM
      description: 'Type of sorting for the SAM output. Paired: one mate after the
        other for all paired alignments; PairedKeepInputOrder: one mate after the
        other for all paired alignments, the order is kept the same as in the input
        FASTQ files.'
      streamable: no
      id: '#outSAMorder'
      inputBinding:
        position: 0
        prefix: --outSAMorder
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output
      sbg:toolDefaultValue: Paired
      required: no
    - type:
      - 'null'
      - name: outSAMmode
        symbols:
        - Full
        - NoQS
        type: enum
      label: SAM mode
      description: 'Mode of SAM output. Full: full SAM output; NoQS: full SAM but
        without quality scores.'
      streamable: no
      id: '#outSAMmode'
      inputBinding:
        position: 0
        prefix: --outSAMmode
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output
      sbg:toolDefaultValue: Full
      required: no
    - type:
      - 'null'
      - int
      label: MAPQ value
      description: MAPQ value for unique mappers (0 to 255).
      streamable: no
      id: '#outSAMmapqUnique'
      inputBinding:
        position: 0
        prefix: --outSAMmapqUnique
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output
      sbg:toolDefaultValue: '255'
      required: no
    - type:
      - 'null'
      - string
      label: SAM header @PG
      description: Extra @PG (software) line of the SAM header (in addition to STAR).
      streamable: no
      id: '#outSAMheaderPG'
      inputBinding:
        position: 0
        prefix: --outSAMheaderPG
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output
      sbg:toolDefaultValue: '-'
      required: no
    - type:
      - 'null'
      - string
      label: SAM header @HD
      description: '@HD (header) line of the SAM header.'
      streamable: no
      id: '#outSAMheaderHD'
      inputBinding:
        position: 0
        prefix: --outSAMheaderHD
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output
      sbg:toolDefaultValue: '-'
      required: no
    - type:
      - 'null'
      - int
      label: OR SAM flag
      description: Set specific bits of the SAM FLAG.
      streamable: no
      id: '#outSAMflagOR'
      inputBinding:
        position: 0
        prefix: --outSAMflagOR
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output
      sbg:toolDefaultValue: '0'
      required: no
    - type:
      - 'null'
      - int
      label: AND SAM flag
      description: Unset specific bits of the SAM FLAG (the FLAG is bitwise AND-ed with this value).
      streamable: no
      id: '#outSAMflagAND'
      inputBinding:
        position: 0
        prefix: --outSAMflagAND
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output
      sbg:toolDefaultValue: '65535'
      required: no
    - type:
      - 'null'
      - name: outSAMattributes
        symbols:
        - Standard
        - NH
        - All
        - None
        type: enum
      label: SAM attributes
      description: 'Desired SAM attributes, in the order desired for the output SAM.
        NH: any combination in any order; Standard: NH HI AS nM; All: NH HI AS nM
        NM MD jM jI; None: no attributes.'
      streamable: no
      id: '#outSAMattributes'
      inputBinding:
        position: 0
        prefix: --outSAMattributes
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output
      sbg:toolDefaultValue: Standard
      required: no
    - type:
      - 'null'
      - name: outReadsUnmapped
        symbols:
        - None
        - Fastx
        type: enum
      label: Output unmapped reads
      description: 'Output of unmapped reads (besides SAM). None: no output; Fastx:
        output in separate fasta/fastq files, Unmapped.out.mate1/2.'
      streamable: no
      id: '#outReadsUnmapped'
      inputBinding:
        position: 0
        prefix: --outReadsUnmapped
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output
      sbg:toolDefaultValue: None
      required: no
    - type:
      - 'null'
      - int
      label: Quality conversion
      description: Add this number to the quality score (e.g. to convert from Illumina
        to Sanger, use -31).
      streamable: no
      id: '#outQSconversionAdd'
      inputBinding:
        position: 0
        prefix: --outQSconversionAdd
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output
      sbg:toolDefaultValue: '0'
      required: no
    - type:
      - 'null'
      - name: outFilterType
        symbols:
        - Normal
        - BySJout
        type: enum
      label: Filtering type
      description: 'Type of filtering. Normal: standard filtering using only current
        alignment; BySJout: keep only those reads that contain junctions that passed
        filtering into SJ.out.tab.'
      streamable: no
      id: '#outFilterType'
      inputBinding:
        position: 0
        prefix: --outFilterType
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output filtering
      sbg:toolDefaultValue: Normal
      required: no
    - type:
      - 'null'
      - float
      label: Min score normalized
      description: '''Minimum score'' normalized to read length (sum of mates'' lengths
        for paired-end reads).'
      streamable: no
      id: '#outFilterScoreMinOverLread'
      inputBinding:
        position: 0
        prefix: --outFilterScoreMinOverLread
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output filtering
      sbg:toolDefaultValue: '0.66'
      required: no
    - type:
      - 'null'
      - int
      label: Min score
      description: Alignment will be output only if its score is higher than this
        value.
      streamable: no
      id: '#outFilterScoreMin'
      inputBinding:
        position: 0
        prefix: --outFilterScoreMin
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output filtering
      sbg:toolDefaultValue: '0'
      required: no
    - type:
      - 'null'
      - int
      label: Multimapping score range
      description: The score range below the maximum score for multimapping alignments.
      streamable: no
      id: '#outFilterMultimapScoreRange'
      inputBinding:
        position: 0
        prefix: --outFilterMultimapScoreRange
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output filtering
      sbg:toolDefaultValue: '1'
      required: no
    - type:
      - 'null'
      - int
      label: Max number of mappings
      description: Read alignments will be output only if the read maps fewer than
        this value, otherwise no alignments will be output.
      streamable: no
      id: '#outFilterMultimapNmax'
      inputBinding:
        position: 0
        prefix: --outFilterMultimapNmax
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output filtering
      sbg:toolDefaultValue: '10'
      required: no
    - type:
      - 'null'
      - float
      label: Mismatches to *read* length
      description: Alignment will be output only if its ratio of mismatches to *read*
        length is less than this value.
      streamable: no
      id: '#outFilterMismatchNoverReadLmax'
      inputBinding:
        position: 0
        prefix: --outFilterMismatchNoverReadLmax
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output filtering
      sbg:toolDefaultValue: '1'
      required: no
    - type:
      - 'null'
      - float
      label: Mismatches to *mapped* length
      description: Alignment will be output only if its ratio of mismatches to *mapped*
        length is less than this value.
      streamable: no
      id: '#outFilterMismatchNoverLmax'
      inputBinding:
        position: 0
        prefix: --outFilterMismatchNoverLmax
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output filtering
      sbg:toolDefaultValue: '0.3'
      required: no
    - type:
      - 'null'
      - int
      label: Max number of mismatches
      description: Alignment will be output only if it has fewer mismatches than this
        value.
      streamable: no
      id: '#outFilterMismatchNmax'
      inputBinding:
        position: 0
        prefix: --outFilterMismatchNmax
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output filtering
      sbg:toolDefaultValue: '10'
      required: no
    - type:
      - 'null'
      - float
      label: Min matched bases normalized
      description: '''Minimum matched bases'' normalized to read length (sum of mates''
        lengths for paired-end reads).'
      streamable: no
      id: '#outFilterMatchNminOverLread'
      inputBinding:
        position: 0
        prefix: --outFilterMatchNminOverLread
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output filtering
      sbg:toolDefaultValue: '0.66'
      required: no
    - type:
      - 'null'
      - int
      label: Min matched bases
      description: Alignment will be output only if the number of matched bases is
        higher than this value.
      streamable: no
      id: '#outFilterMatchNmin'
      inputBinding:
        position: 0
        prefix: --outFilterMatchNmin
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output filtering
      sbg:toolDefaultValue: '0'
      required: no
    - type:
      - 'null'
      - name: outFilterIntronMotifs
        symbols:
        - None
        - RemoveNoncanonical
        - RemoveNoncanonicalUnannotated
        type: enum
      label: Motifs filtering
      description: 'Filter alignments using their motifs. None: no filtering; RemoveNoncanonical:
        filter out alignments that contain non-canonical junctions; RemoveNoncanonicalUnannotated:
        filter out alignments that contain non-canonical unannotated junctions when
        using annotated splice junctions database. The annotated non-canonical junctions
        will be kept.'
      streamable: no
      id: '#outFilterIntronMotifs'
      inputBinding:
        position: 0
        prefix: --outFilterIntronMotifs
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output filtering
      sbg:toolDefaultValue: None
      required: no
    - type:
      - 'null'
      - int
      label: Max insert junctions
      description: Maximum number of junctions to be inserted into the genome on the
        fly at the mapping stage, including those from annotations and those detected
        in the 1st step of the 2-pass run.
      streamable: no
      id: '#limitSjdbInsertNsj'
      inputBinding:
        position: 0
        prefix: --limitSjdbInsertNsj
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Limits
      sbg:toolDefaultValue: '1000000'
      required: no
    - type:
      - 'null'
      - int
      label: Junctions max number
      description: Max number of junctions for one read (including all multi-mappers).
      streamable: no
      id: '#limitOutSJoneRead'
      inputBinding:
        position: 0
        prefix: --limitOutSJoneRead
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Limits
      sbg:toolDefaultValue: '1000'
      required: no
    - type:
      - 'null'
      - int
      label: Collapsed junctions max number
      description: Max number of collapsed junctions.
      streamable: no
      id: '#limitOutSJcollapsed'
      inputBinding:
        position: 0
        prefix: --limitOutSJcollapsed
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Limits
      sbg:toolDefaultValue: '1000000'
      required: no
    - type:
      - 'null'
      - int
      label: Limit BAM sorting memory
      description: Maximum available RAM for sorting BAM. If set to 0, it will be
        set to the genome index size.
      streamable: no
      id: '#limitBAMsortRAM'
      inputBinding:
        position: 0
        prefix: --limitBAMsortRAM
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Limits
      sbg:toolDefaultValue: '0'
      required: no
    - type:
      - 'null'
      - string
      label: Genome dir name
      description: Name of the directory which contains genome files (when genome.tar
        is uncompressed).
      streamable: no
      id: '#genomeDirName'
      inputBinding:
        position: 0
        prefix: --genomeDir
        separate: yes
        valueFrom:
          engine: '#cwl-js-engine'
          script: $job.inputs.genomeDirName || "genomeDir"
          class: Expression
        sbg:cmdInclude: yes
      sbg:category: Basic
      sbg:toolDefaultValue: genomeDir
      required: no
    - type:
      - File
      label: Genome files
      description: Genome files created using STAR Genome Generate.
      streamable: no
      id: '#genome'
      sbg:category: Basic
      sbg:fileTypes: TAR
      required: yes
    - type:
      - 'null'
      - items: int
        type: array
      label: Clip 5p bases
      description: Number of bases to clip from 5p of each mate. In case only one
        value is given, it will be assumed the same for both mates.
      streamable: no
      id: '#clip5pNbases'
      inputBinding:
        position: 0
        prefix: --clip5pNbases
        separate: yes
        itemSeparator: ' '
        sbg:cmdInclude: yes
      sbg:category: Read parameters
      sbg:toolDefaultValue: '0'
      required: no
    - type:
      - 'null'
      - items: int
        type: array
      label: Clip 3p bases
      description: Number of bases to clip from 3p of each mate. In case only one
        value is given, it will be assumed the same for both mates.
      streamable: no
      id: '#clip3pNbases'
      inputBinding:
        position: 0
        prefix: --clip3pNbases
        separate: yes
        itemSeparator: ' '
        sbg:cmdInclude: yes
      sbg:category: Read parameters
      sbg:toolDefaultValue: '0'
      required: no
    - type:
      - 'null'
      - items: int
        type: array
      label: Clip 3p after adapter seq.
      description: Number of bases to clip from 3p of each mate after the adapter
        clipping. In case only one value is given, it will be assumed the same for
        both mates.
      streamable: no
      id: '#clip3pAfterAdapterNbases'
      inputBinding:
        position: 0
        prefix: --clip3pAfterAdapterNbases
        separate: yes
        itemSeparator: ' '
        sbg:cmdInclude: yes
      sbg:category: Read parameters
      sbg:toolDefaultValue: '0'
      required: no
    - type:
      - 'null'
      - items: string
        type: array
      label: Clip 3p adapter sequence
      description: Adapter sequence to clip from 3p of each mate. In case only one
        value is given, it will be assumed the same for both mates.
      streamable: no
      id: '#clip3pAdapterSeq'
      inputBinding:
        position: 0
        prefix: --clip3pAdapterSeq
        separate: yes
        itemSeparator: ' '
        sbg:cmdInclude: yes
      sbg:category: Read parameters
      sbg:toolDefaultValue: '-'
      required: no
    - type:
      - 'null'
      - items: float
        type: array
      label: Max mismatches proportions
      description: Max proportion of mismatches for 3p adapter clipping for each mate.
        In case only one value is given, it will be assumed the same for both mates.
      streamable: no
      id: '#clip3pAdapterMMp'
      inputBinding:
        position: 0
        prefix: --clip3pAdapterMMp
        separate: yes
        itemSeparator: ' '
        sbg:cmdInclude: yes
      sbg:category: Read parameters
      sbg:toolDefaultValue: '0.1'
      required: no
    - type:
      - 'null'
      - int
      label: Min segment length
      description: Minimum length of a chimeric segment (int>=0). If set to 0, there
        is no chimeric output.
      streamable: no
      id: '#chimSegmentMin'
      inputBinding:
        position: 0
        prefix: --chimSegmentMin
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Chimeric Alignments
      sbg:toolDefaultValue: '15'
      required: no
    - type:
      - 'null'
      - int
      label: Min separation score
      description: Minimum difference (separation) between the best chimeric score
        and the next one (int>=0).
      streamable: no
      id: '#chimScoreSeparation'
      inputBinding:
        position: 0
        prefix: --chimScoreSeparation
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Chimeric Alignments
      sbg:toolDefaultValue: '10'
      required: no
    - type:
      - 'null'
      - int
      label: Min total score
      description: Minimum total (summed) score of the chimeric segments (int>=0).
      streamable: no
      id: '#chimScoreMin'
      inputBinding:
        position: 0
        prefix: --chimScoreMin
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Chimeric Alignments
      sbg:toolDefaultValue: '0'
      required: no
    - type:
      - 'null'
      - int
      label: Non-GT/AG penalty
      description: Penalty for a non-GT/AG chimeric junction.
      streamable: no
      id: '#chimScoreJunctionNonGTAG'
      inputBinding:
        position: 0
        prefix: --chimScoreJunctionNonGTAG
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Chimeric Alignments
      sbg:toolDefaultValue: '-1'
      required: no
    - type:
      - 'null'
      - int
      label: Max drop score
      description: Max drop (difference) of chimeric score (the sum of scores of all
        chimeric segments) from the read length (int>=0).
      streamable: no
      id: '#chimScoreDropMax'
      inputBinding:
        position: 0
        prefix: --chimScoreDropMax
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Chimeric Alignments
      sbg:toolDefaultValue: '20'
      required: no
    - type:
      - 'null'
      - name: chimOutType
        symbols:
        - SeparateSAMold
        - Within
        type: enum
      label: Chimeric output type
      description: 'Type of chimeric output. SeparateSAMold: output old SAM into separate
        Chimeric.out.sam file; Within: output into main aligned SAM/BAM files.'
      streamable: no
      id: '#chimOutType'
      sbg:category: Chimeric Alignments
      sbg:toolDefaultValue: SeparateSAMold
      required: no
    - type:
      - 'null'
      - int
      label: Min junction overhang
      description: Minimum overhang for a chimeric junction (int>=0).
      streamable: no
      id: '#chimJunctionOverhangMin'
      inputBinding:
        position: 0
        prefix: --chimJunctionOverhangMin
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Chimeric Alignments
      sbg:toolDefaultValue: '20'
      required: no
    - type:
      - 'null'
      - float
      label: Max windows per read
      description: Max number of windows per read (int>0).
      streamable: no
      id: '#alignWindowsPerReadNmax'
      inputBinding:
        position: 0
        prefix: --alignWindowsPerReadNmax
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Alignments and Seeding
      sbg:toolDefaultValue: '10000'
      required: no
    - type:
      - 'null'
      - int
      label: Max transcripts per window
      description: Max number of transcripts per window (int>0).
      streamable: no
      id: '#alignTranscriptsPerWindowNmax'
      inputBinding:
        position: 0
        prefix: --alignTranscriptsPerWindowNmax
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Alignments and Seeding
      sbg:toolDefaultValue: '100'
      required: no
    - type:
      - 'null'
      - int
      label: Max transcripts per read
      description: Max number of different alignments per read to consider (int>0).
      streamable: no
      id: '#alignTranscriptsPerReadNmax'
      inputBinding:
        position: 0
        prefix: --alignTranscriptsPerReadNmax
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Alignments and Seeding
      sbg:toolDefaultValue: '10000'
      required: no
    - type:
      - 'null'
      - float
      label: Min mapped length normalized
      description: alignSplicedMateMapLmin normalized to mate length (float>0).
      streamable: no
      id: '#alignSplicedMateMapLminOverLmate'
      inputBinding:
        position: 0
        prefix: --alignSplicedMateMapLminOverLmate
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Alignments and Seeding
      sbg:toolDefaultValue: '0.66'
      required: no
    - type:
      - 'null'
      - int
      label: Min mapped length
      description: Minimum mapped length for a read mate that is spliced (int>0).
      streamable: no
      id: '#alignSplicedMateMapLmin'
      inputBinding:
        position: 0
        prefix: --alignSplicedMateMapLmin
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Alignments and Seeding
      sbg:toolDefaultValue: '0'
      required: no
    - type:
      - 'null'
      - name: alignSoftClipAtReferenceEnds
        symbols:
        - 'Yes'
        - 'No'
        type: enum
      label: Soft clipping
      description: 'Option which allows soft clipping of alignments at the reference
        (chromosome) ends. Can be disabled for compatibility with Cufflinks/Cuffmerge.
        Yes: Enables soft clipping; No: Disables soft clipping.'
      streamable: no
      id: '#alignSoftClipAtReferenceEnds'
      inputBinding:
        position: 0
        prefix: --alignSoftClipAtReferenceEnds
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Alignments and Seeding
      sbg:toolDefaultValue: 'Yes'
      required: no
    - type:
      - 'null'
      - int
      label: Min overhang
      description: Minimum overhang (i.e. block size) for spliced alignments (int>0).
      streamable: no
      id: '#alignSJoverhangMin'
      inputBinding:
        position: 0
        prefix: --alignSJoverhangMin
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Alignments and Seeding
      sbg:toolDefaultValue: '5'
      required: no
    - type:
      - 'null'
      - int
      label: 'Min overhang: annotated'
      description: Minimum overhang (i.e. block size) for annotated (sjdb) spliced
        alignments (int>0).
      streamable: no
      id: '#alignSJDBoverhangMin'
      inputBinding:
        position: 0
        prefix: --alignSJDBoverhangMin
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Alignments and Seeding
      sbg:toolDefaultValue: '3'
      required: no
    - type:
      - 'null'
      - int
      label: Max mates gap
      description: Maximum gap between two mates, if 0, max intron gap will be determined
        by (2^winBinNbits)*winAnchorDistNbins.
      streamable: no
      id: '#alignMatesGapMax'
      inputBinding:
        position: 0
        prefix: --alignMatesGapMax
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Alignments and Seeding
      sbg:toolDefaultValue: '0'
      required: no
    - type:
      - 'null'
      - int
      label: Min intron size
      description: 'Minimum intron size: genomic gap is considered intron if its length
        >= alignIntronMin, otherwise it is considered Deletion (int>=0).'
      streamable: no
      id: '#alignIntronMin'
      inputBinding:
        position: 0
        prefix: --alignIntronMin
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Alignments and Seeding
      sbg:toolDefaultValue: '21'
      required: no
    - type:
      - 'null'
      - int
      label: Max intron size
      description: Maximum intron size, if 0, max intron size will be determined by
        (2^winBinNbits)*winAnchorDistNbins.
      streamable: no
      id: '#alignIntronMax'
      inputBinding:
        position: 0
        prefix: --alignIntronMax
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Alignments and Seeding
      sbg:toolDefaultValue: '0'
      required: no
    - type:
      - 'null'
      - name: alignEndsType
        symbols:
        - Local
        - EndToEnd
        type: enum
      label: Alignment type
      description: 'Type of read ends alignment. Local: standard local alignment with
        soft-clipping allowed. EndToEnd: force end to end read alignment, do not soft-clip.'
      streamable: no
      id: '#alignEndsType'
      inputBinding:
        position: 0
        prefix: --alignEndsType
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Alignments and Seeding
      sbg:toolDefaultValue: Local
      required: no
    outputs:
    - type:
      - 'null'
      - items: File
        type: array
      label: Unmapped reads
      description: Output of unmapped reads.
      streamable: no
      id: '#unmapped_reads'
      outputBinding:
        glob: '*Unmapped.out*'
      sbg:fileTypes: FASTQ
    - type:
      - 'null'
      - File
      label: Transcriptome alignments
      description: Alignments translated into transcript coordinates.
      streamable: no
      id: '#transcriptome_aligned_reads'
      outputBinding:
        glob: '*Transcriptome*'
      sbg:fileTypes: BAM
    - type:
      - 'null'
      - File
      label: Splice junctions
      description: High confidence collapsed splice junctions in tab-delimited format.
        Only junctions supported by uniquely mapping reads are reported.
      streamable: no
      id: '#splice_junctions'
      outputBinding:
        glob: '*SJ.out.tab'
      sbg:fileTypes: TAB
    - type:
      - 'null'
      - File
      label: Reads per gene
      description: File with number of reads per gene. A read is counted if it overlaps
        (1nt or more) one and only one gene.
      streamable: no
      id: '#reads_per_gene'
      outputBinding:
        glob: '*ReadsPerGene*'
      sbg:fileTypes: TAB
    - type:
      - 'null'
      - items: File
        type: array
      label: Log files
      description: Log files produced during alignment.
      streamable: no
      id: '#log_files'
      outputBinding:
        glob: '*Log*.out'
      sbg:fileTypes: OUT
    - type:
      - 'null'
      - File
      label: Intermediate genome files
      description: Archive with genome files produced when annotations are included
        on the fly (in the mapping step).
      streamable: no
      id: '#intermediate_genome'
      outputBinding:
        glob: '*_STARgenome.tar'
      sbg:fileTypes: TAR
    - type:
      - 'null'
      - File
      label: Chimeric junctions
      description: If chimSegmentMin in 'Chimeric Alignments' section is set to 0,
        'Chimeric Junctions' won't be output.
      streamable: no
      id: '#chimeric_junctions'
      outputBinding:
        glob: '*Chimeric.out.junction'
      sbg:fileTypes: JUNCTION
    - type:
      - 'null'
      - File
      label: Chimeric alignments
      description: Aligned Chimeric sequences SAM - if chimSegmentMin = 0, no Chimeric
        Alignment SAM and Chimeric Junctions outputs.
      streamable: no
      id: '#chimeric_alignments'
      outputBinding:
        glob: '*.Chimeric.out.sam'
      sbg:fileTypes: SAM
    - type:
      - 'null'
      - File
      label: Aligned SAM/BAM
      description: Aligned sequence in SAM/BAM format.
      streamable: no
      id: '#aligned_reads'
      outputBinding:
        glob:
          engine: '#cwl-js-engine'
          script: |-
            {
              if ($job.inputs.outSortingType == 'SortedByCoordinate') {
                sort_name = '.sortedByCoord'
              }
              else {
                sort_name = ''
              }
              if ($job.inputs.outSAMtype == 'BAM') {
                sam_name = "*.Aligned".concat( sort_name, '.out.bam')
              }
              else {
                sam_name = "*.Aligned.out.sam"
              }
              return sam_name
            }
          class: Expression
      sbg:fileTypes: SAM, BAM
    requirements:
    - class: ExpressionEngineRequirement
      id: '#cwl-js-engine'
      requirements:
      - class: DockerRequirement
        dockerPull: rabix/js-engine
    hints:
    - class: DockerRequirement
      dockerPull: images.sbgenomics.com/ana_d/star:2.4.2a
      dockerImageId: a4b0ad2c3cae
    - class: sbg:MemRequirement
      value: 60000
    - class: sbg:CPURequirement
      value: 15
    label: STAR
    description: STAR is an ultrafast universal RNA-seq aligner. It has very high
      mapping speed, accurate alignment of contiguous and spliced reads, detection
      of polyA-tails, non-canonical splices and chimeric (fusion) junctions. It works
      with reads starting from lengths ~15 bases up to ~300 bases. In case of having
      longer reads, use of STAR Long is recommended.
    class: CommandLineTool
    arguments:
    - position: 0
      separate: yes
      valueFrom:
        engine: '#cwl-js-engine'
        script: |-
          {
            file = [].concat($job.inputs.reads)[0].path
            extension = /(?:\.([^.]+))?$/.exec(file)[1]
            if (extension == "gz") {
              return "--readFilesCommand zcat"
            } else if (extension == "bz2") {
              return "--readFilesCommand bzcat"
            }
          }
        class: Expression
    - position: 0
      separate: yes
      valueFrom:
        engine: '#cwl-js-engine'
        script: "{\t\n  var sjFormat = \"False\"\n  var gtfgffFormat = \"False\"\n
          \ var list = $job.inputs.sjdbGTFfile\n  var paths_list = []\n  var joined_paths
          = \"\"\n  \n  if (list) {\n    list.forEach(function(f){return paths_list.push(f.path)})\n
          \   joined_paths = paths_list.join(\" \")\n\n\n    paths_list.forEach(function(f){\n
          \     ext = f.replace(/^.*\\./, '')\n      if (ext == \"gff\" || ext ==
          \"gtf\") {\n        gtfgffFormat = \"True\"\n        return gtfgffFormat\n
          \     }\n      if (ext == \"txt\") {\n        sjFormat = \"True\"\n        return
          sjFormat\n      }\n    })\n\n    if ($job.inputs.sjdbGTFfile && $job.inputs.sjdbInsertSave
          != \"None\") {\n      if (sjFormat == \"True\") {\n        return \"--sjdbFileChrStartEnd
          \".concat(joined_paths)\n      }\n      else if (gtfgffFormat == \"True\")
          {\n        return \"--sjdbGTFfile \".concat(joined_paths)\n      }\n    }\n
          \ }\n}"
        class: Expression
    - position: 0
      separate: yes
      valueFrom:
        engine: '#cwl-js-engine'
        script: "{\n  a = b = c = d = e = f = g = []\n  if ($job.inputs.sjdbGTFchrPrefix)
          {\n    a = [\"--sjdbGTFchrPrefix\", $job.inputs.sjdbGTFchrPrefix]\n  }\n
          \ if ($job.inputs.sjdbGTFfeatureExon) {\n    b = [\"--sjdbGTFfeatureExon\",
          $job.inputs.sjdbGTFfeatureExon]\n  }\n  if ($job.inputs.sjdbGTFtagExonParentTranscript)
          {\n    c = [\"--sjdbGTFtagExonParentTranscript\", $job.inputs.sjdbGTFtagExonParentTranscript]\n
          \ }\n  if ($job.inputs.sjdbGTFtagExonParentGene) {\n    d = [\"--sjdbGTFtagExonParentGene\",
          $job.inputs.sjdbGTFtagExonParentGene]\n  }\n  if ($job.inputs.sjdbOverhang)
          {\n    e = [\"--sjdbOverhang\", $job.inputs.sjdbOverhang]\n  }\n  if ($job.inputs.sjdbScore)
          {\n    f = [\"--sjdbScore\", $job.inputs.sjdbScore]\n  }\n  if ($job.inputs.sjdbInsertSave)
          {\n    g = [\"--sjdbInsertSave\", $job.inputs.sjdbInsertSave]\n  }\n  \n
          \ \n  \n  if ($job.inputs.sjdbInsertSave != \"None\" && $job.inputs.sjdbGTFfile)
          {\n    new_list = a.concat(b, c, d, e, f, g)\n    return new_list.join(\"
          \")\n  }\n}"
        class: Expression
    - position: 0
      separate: yes
      valueFrom:
        engine: '#cwl-js-engine'
        script: |-
          {
            if ($job.inputs.twopassMode == "Basic") {
              return "--twopass1readsN ".concat($job.inputs.twopass1readsN)
            }
          }
        class: Expression
    - position: 0
      separate: yes
      valueFrom:
        engine: '#cwl-js-engine'
        script: |-
          {
            if ($job.inputs.chimOutType == "Within") {
              return "--chimOutType ".concat("Within", $job.inputs.outSAMtype)
            }
            else {
              return "--chimOutType SeparateSAMold"
            }
          }
        class: Expression
    - position: 0
      separate: yes
      valueFrom:
        engine: '#cwl-js-engine'
        script: "{\n  var param_list = []\n  \n  function add_param(key, value){\n
          \   if (value == \"\") {\n      return\n    }\n    else {\n      return
          param_list.push(key.concat(\":\", value))\n    }\n  }\n  \n  add_param('ID',
          \"1\")\n  if ($job.inputs.rg_seq_center) {\n    add_param('CN', $job.inputs.rg_seq_center)\n
          \ } else if ([].concat($job.inputs.reads)[0].metadata.seq_center) {\n    add_param('CN',
          [].concat($job.inputs.reads)[0].metadata.seq_center)\n  }\n  if ($job.inputs.rg_library_id)
          {\n    add_param('LB', $job.inputs.rg_library_id)\n  } else if ([].concat($job.inputs.reads)[0].metadata.library_id)
          {\n    add_param('LB', [].concat($job.inputs.reads)[0].metadata.library_id)\n
          \ }\n  if ($job.inputs.rg_mfl) {\n    add_param('PI', $job.inputs.rg_mfl)\n
          \ } else if ([].concat($job.inputs.reads)[0].metadata.median_fragment_length)
          {\n    add_param('PI', [].concat($job.inputs.reads)[0].metadata.median_fragment_length)\n
          \ }\n  if ($job.inputs.rg_platform) {\n    add_param('PL', $job.inputs.rg_platform.replace(/
          /g,\"_\"))\n  } else if ([].concat($job.inputs.reads)[0].metadata.platform)
          {\n    add_param('PL', [].concat($job.inputs.reads)[0].metadata.platform.replace(/
          /g,\"_\"))\n  }\n  if ($job.inputs.rg_platform_unit_id) {\n    add_param('PU',
          $job.inputs.rg_platform_unit_id)\n  } else if ([].concat($job.inputs.reads)[0].metadata.platform_unit_id)
          {\n    add_param('PU', [].concat($job.inputs.reads)[0].metadata.platform_unit_id)\n
          \ }\n  if ($job.inputs.rg_sample_id) {\n    add_param('SM', $job.inputs.rg_sample_id)\n
          \ } else if ([].concat($job.inputs.reads)[0].metadata.sample_id) {\n    add_param('SM',
          [].concat($job.inputs.reads)[0].metadata.sample_id)\n  }\n  return \"--outSAMattrRGline
          \".concat(param_list.join(\" \"))\n}"
        class: Expression
    - position: 0
      separate: yes
      valueFrom:
        engine: '#cwl-js-engine'
        script: |-
          {
            if ($job.inputs.sjdbGTFfile && $job.inputs.quantMode) {
              return "--quantMode ".concat($job.inputs.quantMode)
            }
          }
        class: Expression
    - position: 100
      separate: yes
      valueFrom:
        engine: '#cwl-js-engine'
        script: "{\n  function sharedStart(array){\n  var A= array.concat().sort(),
          \n      a1= A[0], a2= A[A.length-1], L= a1.length, i= 0;\n  while(i<L &&
          a1.charAt(i)=== a2.charAt(i)) i++;\n  return a1.substring(0, i);\n  }\n
          \ path_list = []\n  arr = [].concat($job.inputs.reads)\n  arr.forEach(function(f){return
          path_list.push(f.path.replace(/\\\\/g,'/').replace( /.*\\//, '' ))})\n  common_prefix
          = sharedStart(path_list)\n  intermediate = common_prefix.replace( /\\-$|\\_$|\\.$/,
          '' ).concat(\"._STARgenome\")\n  source = \"./\".concat(intermediate)\n
          \ destination = intermediate.concat(\".tar\")\n  if ($job.inputs.sjdbGTFfile
          && $job.inputs.sjdbInsertSave && $job.inputs.sjdbInsertSave != \"None\")
          {\n    return \"&& tar -vcf \".concat(destination, \" \", source)\n  }\n}"
        class: Expression
    - position: 0
      prefix: --outFileNamePrefix
      separate: yes
      valueFrom:
        engine: '#cwl-js-engine'
        script: "{\n  function sharedStart(array){\n  var A= array.concat().sort(),
          \n      a1= A[0], a2= A[A.length-1], L= a1.length, i= 0;\n  while(i<L &&
          a1.charAt(i)=== a2.charAt(i)) i++;\n  return a1.substring(0, i);\n  }\n
          \ path_list = []\n  arr = [].concat($job.inputs.reads)\n  arr.forEach(function(f){return
          path_list.push(f.path.replace(/\\\\/g,'/').replace( /.*\\//, '' ))})\n  common_prefix
          = sharedStart(path_list)\n  return \"./\".concat(common_prefix.replace(
          /\\-$|\\_$|\\.$/, '' ), \".\")\n}"
        class: Expression
    - position: 101
      separate: yes
      valueFrom:
        engine: '#cwl-js-engine'
        script: "{\n  function sharedStart(array){\n  var A= array.concat().sort(),
          \n      a1= A[0], a2= A[A.length-1], L= a1.length, i= 0;\n  while(i<L &&
          a1.charAt(i)=== a2.charAt(i)) i++;\n  return a1.substring(0, i);\n  }\n
          \ path_list = []\n  arr = [].concat($job.inputs.reads)\n  arr.forEach(function(f){return
          path_list.push(f.path.replace(/\\\\/g,'/').replace( /.*\\//, '' ))})\n  common_prefix
          = sharedStart(path_list)\n  mate1 = common_prefix.replace( /\\-$|\\_$|\\.$/,
          '' ).concat(\".Unmapped.out.mate1\")\n  mate2 = common_prefix.replace( /\\-$|\\_$|\\.$/,
          '' ).concat(\".Unmapped.out.mate2\")\n  mate1fq = mate1.concat(\".fastq\")\n
          \ mate2fq = mate2.concat(\".fastq\")\n  if ($job.inputs.outReadsUnmapped
          == \"Fastx\" && arr.length > 1) {\n    return \"&& mv \".concat(mate1, \"
          \", mate1fq, \" && mv \", mate2, \" \", mate2fq)\n  }\n  else if ($job.inputs.outReadsUnmapped
          == \"Fastx\" && arr.length == 1) {\n    return \"&& mv \".concat(mate1,
          \" \", mate1fq)\n  }\n}"
        class: Expression
    stdin: ''
    stdout: ''
    successCodes: []
    temporaryFailCodes: []
    x: 624.0
    'y': 323
  sbg:x: 700.0
  sbg:y: 200.0
sbg:canvas_zoom: 0.6
sbg:canvas_y: -16
sbg:canvas_x: -41
sbg:batchInput: '#sjdbGTFfile'
sbg:batchBy:
  type: item

You can also batch by other criteria, such as metadata. The following example batches by sample_id and library_id.
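
To make the idea of batching by metadata concrete, here is a minimal base R sketch that does not use the sevenbridges package. It assumes a hypothetical table of files (the file names and metadata values are made up for illustration) and groups them by the combination of sample_id and library_id, so that each group would correspond to one batch. It only illustrates the grouping idea and is not meant to describe how the platform implements batching.

```r
# Hypothetical file listing; names and metadata values are invented for illustration.
files <- data.frame(
  name       = c("a_1.fastq", "a_2.fastq", "b_1.fastq", "b_2.fastq", "c_1.fastq"),
  sample_id  = c("S1", "S1", "S2", "S2", "S2"),
  library_id = c("L1", "L1", "L1", "L1", "L2"),
  stringsAsFactors = FALSE
)

# Build one key per file from the two metadata fields used as criteria.
batch_key <- paste(files$sample_id, files$library_id, sep = "/")

# Files that share the same key fall into the same group (one group per batch).
batches <- split(files$name, batch_key)

print(batches)
```

In this sketch, files that agree on both metadata fields end up in the same group, while a file that differs in either field forms its own group.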

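# Locate the example workflow JSON shipped with the package
f1 = system.file("extdata/app", "flow_star.json", package = "sevenbridges")
# Convert the JSON file into an app object
f1 = convert_app(f1)
# Batch the "sjdbGTFfile" input by two metadata fields
f1$set_batch("sjdbGTFfile", c("metadata.sample_id", "metadata.library_id"))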
criteria provided, convert type from ITEM to CRITERIA
sbg:validationErrors: []
sbg:sbgMaintained: no
sbg:latestRevision: 2
sbg:toolAuthor: Seven Bridges Genomics
sbg:createdOn: 1463601910
sbg:categories:
- Alignment
- RNA
sbg:contributors:
- tengfei
sbg:project: tengfei/quickstart
sbg:createdBy: tengfei
sbg:toolkitVersion: 2.4.2a
sbg:id: tengfei/quickstart/rna-seq-alignment-star-demo/2
sbg:license: Apache License 2.0
sbg:revision: 2
sbg:modifiedOn: 1463601974
sbg:modifiedBy: tengfei
sbg:revisionsInfo:
- sbg:modifiedBy: tengfei
  sbg:modifiedOn: 1463601910
  sbg:revision: 0
- sbg:modifiedBy: tengfei
  sbg:modifiedOn: 1463601952
  sbg:revision: 1
- sbg:modifiedBy: tengfei
  sbg:modifiedOn: 1463601974
  sbg:revision: 2
sbg:toolkit: STAR
id: '#tengfei/quickstart/rna-seq-alignment-star-demo/2'
inputs:
- type:
  - 'null'
  - items: File
    type: array
  label: sjdbGTFfile
  streamable: no
  id: '#sjdbGTFfile'
  sbg:x: 160.4999759
  sbg:y: 195.0833106
  required: no
  batchType: metadata.library_id
- type:
  - items: File
    type: array
  label: fastq
  streamable: no
  id: '#fastq'
  sbg:x: 164.2499914
  sbg:y: 323.7499502
  sbg:includeInPorts: yes
  required: yes
- type:
  - File
  label: genomeFastaFiles
  streamable: no
  id: '#genomeFastaFiles'
  sbg:x: 167.7499601
  sbg:y: 469.9999106
  required: yes
- type:
  - 'null'
  - string
  label: Exons' parents name
  description: Tag name to be used as exons’ transcript-parents.
  streamable: no
  id: '#sjdbGTFtagExonParentTranscript'
  sbg:category: Splice junctions db parameters
  sbg:x: 200.0
  sbg:y: 350.0
  sbg:toolDefaultValue: transcript_id
  required: no
- type:
  - 'null'
  - string
  label: Gene name
  description: Tag name to be used as exons’ gene-parents.
  streamable: no
  id: '#sjdbGTFtagExonParentGene'
  sbg:category: Splice junctions db parameters
  sbg:x: 200.0
  sbg:y: 400.0
  sbg:toolDefaultValue: gene_id
  required: no
- type:
  - 'null'
  - int
  label: Max loci anchors
  description: Max number of loci anchors are allowed to map to (int>0).
  streamable: no
  id: '#winAnchorMultimapNmax'
  sbg:category: Windows, Anchors, Binning
  sbg:x: 200.0
  sbg:y: 450.0
  sbg:toolDefaultValue: '50'
  required: no
- type:
  - 'null'
  - int
  label: Max bins between anchors
  description: Max number of bins between two anchors that allows aggregation of anchors
    into one window (int>0).
  streamable: no
  id: '#winAnchorDistNbins'
  sbg:category: Windows, Anchors, Binning
  sbg:x: 200.0
  sbg:y: 500.0
  sbg:toolDefaultValue: '9'
  required: no
outputs:
- type:
  - 'null'
  - items: File
    type: array
  label: unmapped_reads
  streamable: no
  id: '#unmapped_reads'
  source: '#STAR.unmapped_reads'
  sbg:x: 766.2497863
  sbg:y: 159.5833091
  sbg:includeInPorts: yes
  required: no
- type:
  - 'null'
  - File
  label: transcriptome_aligned_reads
  streamable: no
  id: '#transcriptome_aligned_reads'
  source: '#STAR.transcriptome_aligned_reads'
  sbg:x: 1118.9998003
  sbg:y: 86.5833216
  sbg:includeInPorts: yes
  required: no
- type:
  - 'null'
  - File
  label: splice_junctions
  streamable: no
  id: '#splice_junctions'
  source: '#STAR.splice_junctions'
  sbg:x: 1282.3330177
  sbg:y: 167.499976
  sbg:includeInPorts: yes
  required: no
- type:
  - 'null'
  - File
  label: reads_per_gene
  streamable: no
  id: '#reads_per_gene'
  source: '#STAR.reads_per_gene'
  sbg:x: 1394.4163557
  sbg:y: 245.749964
  sbg:includeInPorts: yes
  required: no
- type:
  - 'null'
  - items: File
    type: array
  label: log_files
  streamable: no
  id: '#log_files'
  source: '#STAR.log_files'
  sbg:x: 1505.0830269
  sbg:y: 322.9999518
  sbg:includeInPorts: yes
  required: no
- type:
  - 'null'
  - File
  label: chimeric_junctions
  streamable: no
  id: '#chimeric_junctions'
  source: '#STAR.chimeric_junctions'
  sbg:x: 1278.7498062
  sbg:y: 446.7499567
  sbg:includeInPorts: yes
  required: no
- type:
  - 'null'
  - File
  label: intermediate_genome
  streamable: no
  id: '#intermediate_genome'
  source: '#STAR.intermediate_genome'
  sbg:x: 1408.9164783
  sbg:y: 386.0832876
  sbg:includeInPorts: yes
  required: no
- type:
  - 'null'
  - File
  label: chimeric_alignments
  streamable: no
  id: '#chimeric_alignments'
  source: '#STAR.chimeric_alignments'
  sbg:x: 1147.5831348
  sbg:y: 503.2499285
  sbg:includeInPorts: yes
  required: no
- type:
  - 'null'
  - File
  label: sorted_bam
  streamable: no
  id: '#sorted_bam'
  source: '#Picard_SortSam.sorted_bam'
  sbg:x: 934.2498228
  sbg:y: 557.2498436
  sbg:includeInPorts: yes
  required: no
- type:
  - 'null'
  - File
  label: result
  streamable: no
  id: '#result'
  source: '#SBG_FASTQ_Quality_Detector.result'
  sbg:x: 1431.6666548
  sbg:y: 644.9999898
  sbg:includeInPorts: yes
  required: no
requirements:
- class: CreateFileRequirement
  fileDef: []
hints:
- class: sbg:AWSInstanceType
  value: c3.8xlarge
label: RNA-seq Alignment - STAR
description: "Alignment to a reference genome and transcriptome presents the first
  step of RNA-Seq analysis. This pipeline uses STAR, an ultrafast RNA-seq aligner
  capable of mapping full length RNA sequences and detecting de novo canonical junctions,
  non-canonical splices, and chimeric (fusion) transcripts. It is optimized for mammalian
  sequence reads, but fine tuning of its parameters enables customization to satisfy
  unique needs.\n\nSTAR accepts one file per sample (or two files for paired-end data).
  \ \nSplice junction annotations can optionally be collected from splice junction
  databases. Set the \"Overhang length\" parameter to a value larger than zero in
  order to use splice junction databases. For constant read length, this value should
  (ideally) be equal to mate length decreased by 1; for long reads with non-constant
  length, this value should be 100 (pipeline default). \nFastQC Analysis on FASTQ
  files reveals read length distribution. STAR can detect chimeric transcripts, but
  parameter \"Min segment length\" in \"Chimeric Alignments\" category must be adjusted
  to a desired minimum chimeric segment length. Aligned reads are reported in BAM
  format and can be viewed in a genome browser (such as IGV). A file containing detected
  splice junctions is also produced.\n\nUnmapped reads are reported in FASTQ format
  and can be included in an output BAM file. The \"Output unmapped reads\" and \"Write
  unmapped in SAM\" parameters enable unmapped output type selection."
class: Workflow
steps:
- id: '#STAR_Genome_Generate'
  inputs:
  - id: '#STAR_Genome_Generate.sjdbScore'
  - id: '#STAR_Genome_Generate.sjdbOverhang'
  - id: '#STAR_Genome_Generate.sjdbGTFtagExonParentTranscript'
    source: '#sjdbGTFtagExonParentTranscript'
  - id: '#STAR_Genome_Generate.sjdbGTFtagExonParentGene'
    source: '#sjdbGTFtagExonParentGene'
  - id: '#STAR_Genome_Generate.sjdbGTFfile'
    source: '#sjdbGTFfile'
  - id: '#STAR_Genome_Generate.sjdbGTFfeatureExon'
  - id: '#STAR_Genome_Generate.sjdbGTFchrPrefix'
  - id: '#STAR_Genome_Generate.genomeSAsparseD'
  - id: '#STAR_Genome_Generate.genomeSAindexNbases'
  - id: '#STAR_Genome_Generate.genomeFastaFiles'
    source: '#genomeFastaFiles'
  - id: '#STAR_Genome_Generate.genomeChrBinNbits'
  outputs:
  - id: '#STAR_Genome_Generate.genome'
  hints: []
  run:
    sbg:validationErrors: []
    sbg:sbgMaintained: no
    sbg:latestRevision: 1
    sbg:job:
      allocatedResources:
        mem: 60000
        cpu: 15
      inputs:
        sjdbScore: 0
        sjdbGTFfeatureExon: sjdbGTFfeatureExon
        sjdbOverhang: 0
        sjdbGTFtagExonParentTranscript: sjdbGTFtagExonParentTranscript
        genomeChrBinNbits: genomeChrBinNbits
        genomeSAsparseD: 0
        sjdbGTFfile:
        - size: 0
          secondaryFiles: []
          class: File
          path: /demo/test-files/chr20.gtf
        sjdbGTFtagExonParentGene: sjdbGTFtagExonParentGene
        genomeFastaFiles:
          size: 0
          secondaryFiles: []
          class: File
          path: /sbgenomics/test-data/chr20.fa
        sjdbGTFchrPrefix: sjdbGTFchrPrefix
        genomeSAindexNbases: 0
    sbg:toolAuthor: Alexander Dobin/CSHL
    sbg:createdOn: 1450911469
    sbg:categories:
    - Alignment
    sbg:contributors:
    - bix-demo
    sbg:links:
    - id: https://github.com/alexdobin/STAR
      label: Homepage
    - id: https://github.com/alexdobin/STAR/releases
      label: Releases
    - id: https://github.com/alexdobin/STAR/blob/master/doc/STARmanual.pdf
      label: Manual
    - id: https://groups.google.com/forum/#!forum/rna-star
      label: Support
    - id: http://www.ncbi.nlm.nih.gov/pubmed/23104886
      label: Publication
    sbg:project: bix-demo/star-2-4-2a-demo
    sbg:createdBy: bix-demo
    sbg:toolkitVersion: 2.4.2a
    sbg:id: sevenbridges/public-apps/star-genome-generate/1
    sbg:license: GNU General Public License v3.0 only
    sbg:revision: 1
    sbg:cmdPreview: mkdir genomeDir && /opt/STAR --runMode genomeGenerate --genomeDir
      ./genomeDir --runThreadN 15 --genomeFastaFiles /sbgenomics/test-data/chr20.fa
      --genomeChrBinNbits genomeChrBinNbits --genomeSAindexNbases 0 --genomeSAsparseD
      0 --sjdbGTFfeatureExon sjdbGTFfeatureExon --sjdbGTFtagExonParentTranscript sjdbGTFtagExonParentTranscript
      --sjdbGTFtagExonParentGene sjdbGTFtagExonParentGene --sjdbOverhang 0 --sjdbScore
      0 --sjdbGTFchrPrefix sjdbGTFchrPrefix  --sjdbGTFfile /demo/test-files/chr20.gtf  &&
      tar -vcf genome.tar ./genomeDir /sbgenomics/test-data/chr20.fa
    sbg:modifiedOn: 1450911470
    sbg:modifiedBy: bix-demo
    sbg:revisionsInfo:
    - sbg:modifiedBy: bix-demo
      sbg:modifiedOn: 1450911469
      sbg:revision: 0
    - sbg:modifiedBy: bix-demo
      sbg:modifiedOn: 1450911470
      sbg:revision: 1
    sbg:toolkit: STAR
    id: sevenbridges/public-apps/star-genome-generate/1
    inputs:
    - type:
      - 'null'
      - int
      label: Extra alignment score
      description: Extra alignment score for alignments that cross database junctions.
      streamable: no
      id: '#sjdbScore'
      inputBinding:
        position: 0
        prefix: --sjdbScore
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Splice junctions db parameters
      sbg:includeInPorts: yes
      sbg:toolDefaultValue: '2'
      required: no
    - type:
      - 'null'
      - int
      label: '"Overhang" length'
      description: Length of the donor/acceptor sequence on each side of the junctions,
        ideally = (mate_length - 1) (int >= 0), if int = 0, splice junction database
        is not used.
      streamable: no
      id: '#sjdbOverhang'
      inputBinding:
        position: 0
        prefix: --sjdbOverhang
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Splice junctions db parameters
      sbg:includeInPorts: yes
      sbg:toolDefaultValue: '100'
      required: no
    - type:
      - 'null'
      - string
      label: Exons' parents name
      description: Tag name to be used as exons’ transcript-parents.
      streamable: no
      id: '#sjdbGTFtagExonParentTranscript'
      inputBinding:
        position: 0
        prefix: --sjdbGTFtagExonParentTranscript
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Splice junctions db parameters
      sbg:toolDefaultValue: transcript_id
      required: no
    - type:
      - 'null'
      - string
      label: Gene name
      description: Tag name to be used as exons’ gene-parents.
      streamable: no
      id: '#sjdbGTFtagExonParentGene'
      inputBinding:
        position: 0
        prefix: --sjdbGTFtagExonParentGene
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Splice junctions db parameters
      sbg:toolDefaultValue: gene_id
      required: no
    - type:
      - 'null'
      - items: File
        type: array
      label: Splice junction file
      description: Gene model annotations and/or known transcripts.
      streamable: no
      id: '#sjdbGTFfile'
      sbg:category: Basic
      sbg:fileTypes: GTF, GFF, TXT
      required: no
    - type:
      - 'null'
      - string
      label: Set exons feature
      description: Feature type in GTF file to be used as exons for building transcripts.
      streamable: no
      id: '#sjdbGTFfeatureExon'
      inputBinding:
        position: 0
        prefix: --sjdbGTFfeatureExon
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Splice junctions db parameters
      sbg:toolDefaultValue: exon
      required: no
    - type:
      - 'null'
      - string
      label: Chromosome names
      description: Prefix for chromosome names in a GTF file (e.g. 'chr' for using
        ENSEMBL annotations with UCSC genomes).
      streamable: no
      id: '#sjdbGTFchrPrefix'
      inputBinding:
        position: 0
        prefix: --sjdbGTFchrPrefix
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Splice junctions db parameters
      sbg:toolDefaultValue: '-'
      required: no
    - type:
      - 'null'
      - int
      label: Suffix array sparsity
      description: 'Distance between indices: use bigger numbers to decrease needed
        RAM at the cost of mapping speed reduction (int>0).'
      streamable: no
      id: '#genomeSAsparseD'
      inputBinding:
        position: 0
        prefix: --genomeSAsparseD
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Genome generation parameters
      sbg:toolDefaultValue: '1'
      required: no
    - type:
      - 'null'
      - int
      label: Pre-indexing string length
      description: Length (bases) of the SA pre-indexing string. Typically between
        10 and 15. Longer strings will use much more memory, but allow faster searches.
        For small genomes, this number needs to be scaled down, with a typical value
        of min(14, log2(GenomeLength)/2 - 1). For example, for 1 megaBase genome,
        this is equal to 9, for 100 kiloBase genome, this is equal to 7.
      streamable: no
      id: '#genomeSAindexNbases'
      inputBinding:
        position: 0
        prefix: --genomeSAindexNbases
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Genome generation parameters
      sbg:toolDefaultValue: '14'
      required: no
    - type:
      - File
      label: Genome fasta files
      description: Reference sequence to which to align the reads.
      streamable: no
      id: '#genomeFastaFiles'
      inputBinding:
        position: 0
        prefix: --genomeFastaFiles
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Basic
      sbg:fileTypes: FASTA, FA
      required: yes
    - type:
      - 'null'
      - string
      label: Bins size
      description: 'Set log2(chrBin), where chrBin is the size (bits) of the bins
        for genome storage: each chromosome will occupy an integer number of bins.
        If you are using a genome with a large (>5,000) number of chromosomes/scaffolds,
        you may need to reduce this number to reduce RAM consumption. The following
        scaling is recommended: genomeChrBinNbits = min(18, log2(GenomeLength/NumberOfReferences)).
        For example, for 3 gigaBase genome with 100,000 chromosomes/scaffolds, this
        is equal to 15.'
      streamable: no
      id: '#genomeChrBinNbits'
      inputBinding:
        position: 0
        prefix: --genomeChrBinNbits
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Genome generation parameters
      sbg:toolDefaultValue: '18'
      required: no
    outputs:
    - type:
      - 'null'
      - File
      label: Genome Files
      description: Genome files comprise binary genome sequence, suffix arrays, text
        chromosome names/lengths, splice junctions coordinates, and transcripts/genes
        information.
      streamable: no
      id: '#genome'
      outputBinding:
        glob: '*.tar'
      sbg:fileTypes: TAR
    requirements:
    - class: ExpressionEngineRequirement
      id: '#cwl-js-engine'
      requirements:
      - class: DockerRequirement
        dockerPull: rabix/js-engine
    hints:
    - class: DockerRequirement
      dockerPull: images.sbgenomics.com/ana_d/star:2.4.2a
      dockerImageId: a4b0ad2c3cae
    - class: sbg:CPURequirement
      value: 15
    - class: sbg:MemRequirement
      value: 60000
    label: STAR Genome Generate
    description: STAR Genome Generate is a tool that generates genome index files.
      One set of files should be generated for each genome/annotation combination.
      Once produced, these files can be used as long as the genome/annotation combination
      stays the same. Also, the STAR Genome Generate that produced these files and the STAR
      aligner that uses them must be from the same toolkit version.
    class: CommandLineTool
    arguments:
    - position: 99
      separate: yes
      valueFrom: '&& tar -vcf genome.tar ./genomeDir'
    - position: 0
      separate: yes
      valueFrom:
        engine: '#cwl-js-engine'
        script: "{\t\n  var sjFormat = \"False\"\n  var gtfgffFormat = \"False\"\n
          \ var list = $job.inputs.sjdbGTFfile\n  var paths_list = []\n  var joined_paths
          = \"\"\n  \n  if (list) {\n    list.forEach(function(f){return paths_list.push(f.path)})\n
          \   joined_paths = paths_list.join(\" \")\n\n\n    paths_list.forEach(function(f){\n
          \     ext = f.replace(/^.*\\./, '')\n      if (ext == \"gff\" || ext ==
          \"gtf\") {\n        gtfgffFormat = \"True\"\n        return gtfgffFormat\n
          \     }\n      if (ext == \"txt\") {\n        sjFormat = \"True\"\n        return
          sjFormat\n      }\n    })\n\n    if ($job.inputs.sjdbGTFfile && $job.inputs.sjdbInsertSave
          != \"None\") {\n      if (sjFormat == \"True\") {\n        return \"--sjdbFileChrStartEnd
          \".concat(joined_paths)\n      }\n      else if (gtfgffFormat == \"True\")
          {\n        return \"--sjdbGTFfile \".concat(joined_paths)\n      }\n    }\n
          \ }\n}"
        class: Expression
    stdin: ''
    stdout: ''
    successCodes: []
    temporaryFailCodes: []
    x: 384.0832266
    'y': 446.4998957
  sbg:x: 100.0
  sbg:y: 200.0
- id: '#SBG_FASTQ_Quality_Detector'
  inputs:
  - id: '#SBG_FASTQ_Quality_Detector.fastq'
    source: '#fastq'
  outputs:
  - id: '#SBG_FASTQ_Quality_Detector.result'
  hints: []
  run:
    sbg:validationErrors: []
    sbg:sbgMaintained: no
    sbg:latestRevision: 3
    sbg:job:
      allocatedResources:
        mem: 1000
        cpu: 1
      inputs:
        fastq:
          size: 0
          secondaryFiles: []
          class: File
          path: /path/to/fastq.ext
    sbg:toolAuthor: Seven Bridges Genomics
    sbg:createdOn: 1450911312
    sbg:categories:
    - FASTQ-Processing
    sbg:contributors:
    - bix-demo
    sbg:project: bix-demo/sbgtools-demo
    sbg:createdBy: bix-demo
    sbg:id: sevenbridges/public-apps/sbg-fastq-quality-detector/3
    sbg:license: Apache License 2.0
    sbg:revision: 3
    sbg:cmdPreview: python /opt/sbg_fastq_quality_scale_detector.py --fastq /path/to/fastq.ext
      /path/to/fastq.ext
    sbg:modifiedOn: 1450911314
    sbg:modifiedBy: bix-demo
    sbg:revisionsInfo:
    - sbg:modifiedBy: bix-demo
      sbg:modifiedOn: 1450911312
      sbg:revision: 0
    - sbg:modifiedBy: bix-demo
      sbg:modifiedOn: 1450911314
      sbg:revision: 3
    - sbg:modifiedBy: bix-demo
      sbg:modifiedOn: 1450911313
      sbg:revision: 1
    - sbg:modifiedBy: bix-demo
      sbg:modifiedOn: 1450911313
      sbg:revision: 2
    sbg:toolkit: SBGTools
    id: sevenbridges/public-apps/sbg-fastq-quality-detector/3
    inputs:
    - type:
      - File
      label: Fastq
      description: FASTQ file.
      streamable: no
      id: '#fastq'
      inputBinding:
        position: 0
        prefix: --fastq
        separate: yes
        sbg:cmdInclude: yes
      required: yes
    outputs:
    - type:
      - 'null'
      - File
      label: Result
      description: Source FASTQ file with updated metadata.
      streamable: no
      id: '#result'
      outputBinding:
        glob: '*.fastq'
      sbg:fileTypes: FASTQ
    requirements:
    - class: CreateFileRequirement
      fileDef: []
    hints:
    - class: DockerRequirement
      dockerPull: images.sbgenomics.com/tziotas/sbg_fastq_quality_scale_detector:1.0
      dockerImageId: ''
    - class: sbg:CPURequirement
      value: 1
    - class: sbg:MemRequirement
      value: 1000
    label: SBG FASTQ Quality Detector
    description: FASTQ Quality Scale Detector detects which quality encoding scheme
      was used in your reads and automatically enters the proper value in the "Quality
      Scale" metadata field.
    class: CommandLineTool
    arguments: []
    stdin: ''
    stdout: ''
    successCodes: []
    temporaryFailCodes: []
    x: 375.3333179
    'y': 323.5833156
  sbg:x: 300.0
  sbg:y: 200.0
- id: '#Picard_SortSam'
  inputs:
  - id: '#Picard_SortSam.validation_stringency'
    default: SILENT
  - id: '#Picard_SortSam.sort_order'
    default: Coordinate
  - id: '#Picard_SortSam.quiet'
  - id: '#Picard_SortSam.output_type'
  - id: '#Picard_SortSam.memory_per_job'
  - id: '#Picard_SortSam.max_records_in_ram'
  - id: '#Picard_SortSam.input_bam'
    source: '#STAR.aligned_reads'
  - id: '#Picard_SortSam.create_index'
    default: 'True'
  - id: '#Picard_SortSam.compression_level'
  outputs:
  - id: '#Picard_SortSam.sorted_bam'
  hints: []
  run:
    sbg:validationErrors: []
    sbg:sbgMaintained: no
    sbg:latestRevision: 2
    sbg:job:
      allocatedResources:
        mem: 2048
        cpu: 1
      inputs:
        sort_order: Coordinate
        input_bam:
          path: /root/dir/example.tested.bam
        memory_per_job: 2048
        output_type: ~
        create_index: ~
    sbg:toolAuthor: Broad Institute
    sbg:createdOn: 1450911168
    sbg:categories:
    - SAM/BAM-Processing
    sbg:contributors:
    - bix-demo
    sbg:links:
    - id: http://broadinstitute.github.io/picard/index.html
      label: Homepage
    - id: https://github.com/broadinstitute/picard/releases/tag/1.138
      label: Source Code
    - id: http://broadinstitute.github.io/picard/
      label: Wiki
    - id: https://github.com/broadinstitute/picard/zipball/master
      label: Download
    - id: http://broadinstitute.github.io/picard/
      label: Publication
    sbg:project: bix-demo/picard-1-140-demo
    sbg:createdBy: bix-demo
    sbg:toolkitVersion: '1.140'
    sbg:id: sevenbridges/public-apps/picard-sortsam-1-140/2
    sbg:license: MIT License, Apache 2.0 License
    sbg:revision: 2
    sbg:cmdPreview: java -Xmx2048M -jar /opt/picard-tools-1.140/picard.jar SortSam
      OUTPUT=example.tested.sorted.bam INPUT=/root/dir/example.tested.bam SORT_ORDER=coordinate   INPUT=/root/dir/example.tested.bam
      SORT_ORDER=coordinate  /root/dir/example.tested.bam
    sbg:modifiedOn: 1450911170
    sbg:modifiedBy: bix-demo
    sbg:revisionsInfo:
    - sbg:modifiedBy: bix-demo
      sbg:modifiedOn: 1450911168
      sbg:revision: 0
    - sbg:modifiedBy: bix-demo
      sbg:modifiedOn: 1450911169
      sbg:revision: 1
    - sbg:modifiedBy: bix-demo
      sbg:modifiedOn: 1450911170
      sbg:revision: 2
    sbg:toolkit: Picard
    id: sevenbridges/public-apps/picard-sortsam-1-140/2
    inputs:
    - type:
      - 'null'
      - name: validation_stringency
        symbols:
        - STRICT
        - LENIENT
        - SILENT
        type: enum
      label: Validation stringency
      description: 'Validation stringency for all SAM files read by this program.
        Setting stringency to SILENT can improve performance when processing a BAM
        file in which variable-length data (read, qualities, tags) do not otherwise
        need to be decoded. This option can be set to ''null'' to clear the default
        value. Possible values: {STRICT, LENIENT, SILENT}.'
      streamable: no
      id: '#validation_stringency'
      inputBinding:
        position: 0
        prefix: VALIDATION_STRINGENCY=
        separate: no
        valueFrom:
          engine: '#cwl-js-engine'
          script: |-
            {
              if ($job.inputs.validation_stringency)
              {
                return $job.inputs.validation_stringency
              }
              else
              {
                return "SILENT"
              }
            }
          class: Expression
        sbg:cmdInclude: yes
      sbg:category: Other input types
      sbg:toolDefaultValue: SILENT
      required: no
    - type:
      - name: sort_order
        symbols:
        - Unsorted
        - Queryname
        - Coordinate
        type: enum
      label: Sort order
      description: 'Sort order of the output file. Possible values: {unsorted, queryname,
        coordinate}.'
      streamable: no
      id: '#sort_order'
      inputBinding:
        position: 3
        prefix: SORT_ORDER=
        separate: no
        valueFrom:
          engine: '#cwl-js-engine'
          script: |-
            {
              p = $job.inputs.sort_order.toLowerCase()
              return p
            }
          class: Expression
        sbg:cmdInclude: yes
      sbg:category: Other input types
      sbg:toolDefaultValue: Coordinate
      sbg:altPrefix: SO
      required: yes
    - type:
      - 'null'
      - name: quiet
        symbols:
        - 'True'
        - 'False'
        type: enum
      label: Quiet
      description: 'This parameter indicates whether to suppress job-summary info
        on System.err. This option can be set to ''null'' to clear the default value.
        Possible values: {true, false}.'
      streamable: no
      id: '#quiet'
      inputBinding:
        position: 0
        prefix: QUIET=
        separate: no
        sbg:cmdInclude: yes
      sbg:category: Other input types
      sbg:toolDefaultValue: 'False'
      required: no
    - type:
      - 'null'
      - name: output_type
        symbols:
        - BAM
        - SAM
        - SAME AS INPUT
        type: enum
      label: Output format
      description: Since Picard tools can output both SAM and BAM files, the user can
        choose the format of the output file.
      streamable: no
      id: '#output_type'
      sbg:category: Other input types
      sbg:toolDefaultValue: SAME AS INPUT
      required: no
    - type:
      - 'null'
      - int
      label: Memory per job
      description: Amount of RAM memory to be used per job. Defaults to 2048 MB for
        single threaded jobs.
      streamable: no
      id: '#memory_per_job'
      sbg:toolDefaultValue: '2048'
      required: no
    - type:
      - 'null'
      - int
      label: Max records in RAM
      description: When writing SAM files that need to be sorted, this parameter will
        specify the number of records stored in RAM before spilling to disk. Increasing
        this number reduces the number of file handles needed to sort a SAM file,
        and increases the amount of RAM needed. This option can be set to 'null' to
        clear the default value.
      streamable: no
      id: '#max_records_in_ram'
      inputBinding:
        position: 0
        prefix: MAX_RECORDS_IN_RAM=
        separate: no
        sbg:cmdInclude: yes
      sbg:category: Other input types
      sbg:toolDefaultValue: '500000'
      required: no
    - type:
      - File
      label: Input BAM
      description: The BAM or SAM file to sort.
      streamable: no
      id: '#input_bam'
      inputBinding:
        position: 1
        prefix: INPUT=
        separate: no
        sbg:cmdInclude: yes
      sbg:category: File inputs
      sbg:fileTypes: BAM, SAM
      sbg:altPrefix: I
      required: yes
    - type:
      - 'null'
      - name: create_index
        symbols:
        - 'True'
        - 'False'
        type: enum
      label: Create index
      description: 'This parameter indicates whether to create a BAM index when writing
        a coordinate-sorted BAM file. This option can be set to ''null'' to clear
        the default value. Possible values: {true, false}.'
      streamable: no
      id: '#create_index'
      inputBinding:
        position: 5
        prefix: CREATE_INDEX=
        separate: no
        sbg:cmdInclude: yes
      sbg:category: Other input types
      sbg:toolDefaultValue: 'False'
      required: no
    - type:
      - 'null'
      - int
      label: Compression level
      description: Compression level for all compressed files created (e.g. BAM and
        GELI). This option can be set to 'null' to clear the default value.
      streamable: no
      id: '#compression_level'
      inputBinding:
        position: 0
        prefix: COMPRESSION_LEVEL=
        separate: no
        sbg:cmdInclude: yes
      sbg:category: Other input types
      sbg:toolDefaultValue: '5'
      required: no
    outputs:
    - type:
      - 'null'
      - File
      label: Sorted BAM/SAM
      description: Sorted BAM or SAM file.
      streamable: no
      id: '#sorted_bam'
      outputBinding:
        glob: '*.sorted.?am'
      sbg:fileTypes: BAM, SAM
    requirements:
    - class: ExpressionEngineRequirement
      id: '#cwl-js-engine'
      requirements:
      - class: DockerRequirement
        dockerPull: rabix/js-engine
      engineCommand: cwl-engine.js
    hints:
    - class: DockerRequirement
      dockerPull: images.sbgenomics.com/mladenlsbg/picard:1.140
      dockerImageId: eab0e70b6629
    - class: sbg:CPURequirement
      value: 1
    - class: sbg:MemRequirement
      value:
        engine: '#cwl-js-engine'
        script: "{\n  if($job.inputs.memory_per_job){\n  \treturn $job.inputs.memory_per_job\n
          \ }\n  \treturn 2048\n}"
        class: Expression
    label: Picard SortSam
    description: Picard SortSam sorts the input SAM or BAM. Input and output formats
      are determined by the file extension.
    class: CommandLineTool
    arguments:
    - position: 0
      prefix: OUTPUT=
      separate: no
      valueFrom:
        engine: '#cwl-js-engine'
        script: "{\n  filename = $job.inputs.input_bam.path\n  ext = $job.inputs.output_type\n\nif
          (ext === \"BAM\")\n{\n    return filename.split('.').slice(0, -1).concat(\"sorted.bam\").join(\".\").replace(/^.*[\\\\\\/]/,
          '')\n    }\n\nelse if (ext === \"SAM\")\n{\n    return filename.split('.').slice(0,
          -1).concat(\"sorted.sam\").join('.').replace(/^.*[\\\\\\/]/, '')\n}\n\nelse
          \n{\n\treturn filename.split('.').slice(0, -1).concat(\"sorted.\"+filename.split('.').slice(-1)[0]).join(\".\").replace(/^.*[\\\\\\/]/,
          '')\n}\n}"
        class: Expression
    - position: 1000
      separate: yes
      valueFrom:
        engine: '#cwl-js-engine'
        script: "{\n  filename = $job.inputs.input_bam.path\n  \n  /* figuring out
          output file type */\n  ext = $job.inputs.output_type\n  if (ext === \"BAM\")\n
          \ {\n    out_extension = \"BAM\"\n  }\n  else if (ext === \"SAM\")\n  {\n
          \   out_extension = \"SAM\"\n  }\n  else \n  {\n\tout_extension = filename.split('.').slice(-1)[0].toUpperCase()\n
          \ }  \n  \n  /* if exist moving .bai in bam.bai */\n  if ($job.inputs.create_index
          === 'True' && $job.inputs.sort_order === 'Coordinate' && out_extension ==
          \"BAM\")\n  {\n    \n    old_name = filename.split('.').slice(0, -1).concat('sorted.bai').join('.').replace(/^.*[\\\\\\/]/,
          '')\n    new_name = filename.split('.').slice(0, -1).concat('sorted.bam.bai').join('.').replace(/^.*[\\\\\\/]/,
          '')\n    return \"; mv \" + \" \" + old_name + \" \" + new_name\n  }\n\n}"
        class: Expression
    stdin: ''
    stdout: ''
    successCodes: []
    temporaryFailCodes: []
    x: 773.0831807
    'y': 470.9165939
  sbg:x: 500.0
  sbg:y: 200.0
- id: '#STAR'
  inputs:
  - id: '#STAR.winFlankNbins'
  - id: '#STAR.winBinNbits'
  - id: '#STAR.winAnchorMultimapNmax'
    source: '#winAnchorMultimapNmax'
  - id: '#STAR.winAnchorDistNbins'
    source: '#winAnchorDistNbins'
  - id: '#STAR.twopassMode'
  - id: '#STAR.twopass1readsN'
  - id: '#STAR.sjdbScore'
  - id: '#STAR.sjdbOverhang'
    default: 100
  - id: '#STAR.sjdbInsertSave'
  - id: '#STAR.sjdbGTFtagExonParentTranscript'
  - id: '#STAR.sjdbGTFtagExonParentGene'
  - id: '#STAR.sjdbGTFfile'
    source: '#sjdbGTFfile'
  - id: '#STAR.sjdbGTFfeatureExon'
  - id: '#STAR.sjdbGTFchrPrefix'
  - id: '#STAR.seedSearchStartLmaxOverLread'
  - id: '#STAR.seedSearchStartLmax'
  - id: '#STAR.seedSearchLmax'
  - id: '#STAR.seedPerWindowNmax'
  - id: '#STAR.seedPerReadNmax'
  - id: '#STAR.seedNoneLociPerWindow'
  - id: '#STAR.seedMultimapNmax'
  - id: '#STAR.scoreStitchSJshift'
  - id: '#STAR.scoreInsOpen'
  - id: '#STAR.scoreInsBase'
  - id: '#STAR.scoreGenomicLengthLog2scale'
  - id: '#STAR.scoreGapNoncan'
  - id: '#STAR.scoreGapGCAG'
  - id: '#STAR.scoreGapATAC'
  - id: '#STAR.scoreGap'
  - id: '#STAR.scoreDelOpen'
  - id: '#STAR.scoreDelBase'
  - id: '#STAR.rg_seq_center'
  - id: '#STAR.rg_sample_id'
  - id: '#STAR.rg_platform_unit_id'
  - id: '#STAR.rg_platform'
  - id: '#STAR.rg_mfl'
  - id: '#STAR.rg_library_id'
  - id: '#STAR.reads'
    source: '#SBG_FASTQ_Quality_Detector.result'
  - id: '#STAR.readMatesLengthsIn'
  - id: '#STAR.readMapNumber'
  - id: '#STAR.quantTranscriptomeBan'
  - id: '#STAR.quantMode'
    default: TranscriptomeSAM
  - id: '#STAR.outSortingType'
    default: SortedByCoordinate
  - id: '#STAR.outSJfilterReads'
  - id: '#STAR.outSJfilterOverhangMin'
  - id: '#STAR.outSJfilterIntronMaxVsReadN'
  - id: '#STAR.outSJfilterDistToOtherSJmin'
  - id: '#STAR.outSJfilterCountUniqueMin'
  - id: '#STAR.outSJfilterCountTotalMin'
  - id: '#STAR.outSAMunmapped'
  - id: '#STAR.outSAMtype'
    default: BAM
  - id: '#STAR.outSAMstrandField'
  - id: '#STAR.outSAMreadID'
  - id: '#STAR.outSAMprimaryFlag'
  - id: '#STAR.outSAMorder'
  - id: '#STAR.outSAMmode'
  - id: '#STAR.outSAMmapqUnique'
  - id: '#STAR.outSAMheaderPG'
  - id: '#STAR.outSAMheaderHD'
  - id: '#STAR.outSAMflagOR'
  - id: '#STAR.outSAMflagAND'
  - id: '#STAR.outSAMattributes'
  - id: '#STAR.outReadsUnmapped'
    default: Fastx
  - id: '#STAR.outQSconversionAdd'
  - id: '#STAR.outFilterType'
  - id: '#STAR.outFilterScoreMinOverLread'
  - id: '#STAR.outFilterScoreMin'
  - id: '#STAR.outFilterMultimapScoreRange'
  - id: '#STAR.outFilterMultimapNmax'
  - id: '#STAR.outFilterMismatchNoverReadLmax'
  - id: '#STAR.outFilterMismatchNoverLmax'
  - id: '#STAR.outFilterMismatchNmax'
  - id: '#STAR.outFilterMatchNminOverLread'
  - id: '#STAR.outFilterMatchNmin'
  - id: '#STAR.outFilterIntronMotifs'
  - id: '#STAR.limitSjdbInsertNsj'
  - id: '#STAR.limitOutSJoneRead'
  - id: '#STAR.limitOutSJcollapsed'
  - id: '#STAR.limitBAMsortRAM'
  - id: '#STAR.genomeDirName'
  - id: '#STAR.genome'
    source: '#STAR_Genome_Generate.genome'
  - id: '#STAR.clip5pNbases'
  - id: '#STAR.clip3pNbases'
  - id: '#STAR.clip3pAfterAdapterNbases'
  - id: '#STAR.clip3pAdapterSeq'
  - id: '#STAR.clip3pAdapterMMp'
  - id: '#STAR.chimSegmentMin'
  - id: '#STAR.chimScoreSeparation'
  - id: '#STAR.chimScoreMin'
  - id: '#STAR.chimScoreJunctionNonGTAG'
  - id: '#STAR.chimScoreDropMax'
  - id: '#STAR.chimOutType'
  - id: '#STAR.chimJunctionOverhangMin'
  - id: '#STAR.alignWindowsPerReadNmax'
  - id: '#STAR.alignTranscriptsPerWindowNmax'
  - id: '#STAR.alignTranscriptsPerReadNmax'
  - id: '#STAR.alignSplicedMateMapLminOverLmate'
  - id: '#STAR.alignSplicedMateMapLmin'
  - id: '#STAR.alignSoftClipAtReferenceEnds'
  - id: '#STAR.alignSJoverhangMin'
  - id: '#STAR.alignSJDBoverhangMin'
  - id: '#STAR.alignMatesGapMax'
  - id: '#STAR.alignIntronMin'
  - id: '#STAR.alignIntronMax'
  - id: '#STAR.alignEndsType'
  outputs:
  - id: '#STAR.unmapped_reads'
  - id: '#STAR.transcriptome_aligned_reads'
  - id: '#STAR.splice_junctions'
  - id: '#STAR.reads_per_gene'
  - id: '#STAR.log_files'
  - id: '#STAR.intermediate_genome'
  - id: '#STAR.chimeric_junctions'
  - id: '#STAR.chimeric_alignments'
  - id: '#STAR.aligned_reads'
  hints: []
  run:
    sbg:validationErrors: []
    sbg:sbgMaintained: no
    sbg:latestRevision: 4
    sbg:job:
      allocatedResources:
        mem: 60000
        cpu: 15
      inputs:
        alignWindowsPerReadNmax: 0
        outSAMheaderPG: outSAMheaderPG
        GENOME_DIR_NAME: ''
        outFilterMatchNminOverLread: 0
        rg_platform_unit_id: rg_platform_unit
        alignTranscriptsPerReadNmax: 0
        readMapNumber: 0
        alignSplicedMateMapLminOverLmate: 0
        alignMatesGapMax: 0
        outFilterMultimapNmax: 0
        clip5pNbases:
        - 0
        outSAMstrandField: None
        readMatesLengthsIn: NotEqual
        outSAMattributes: Standard
        seedMultimapNmax: 0
        rg_mfl: rg_mfl
        chimSegmentMin: 0
        winAnchorDistNbins: 0
        outSortingType: SortedByCoordinate
        outFilterMultimapScoreRange: 0
        sjdbInsertSave: Basic
        clip3pAfterAdapterNbases:
        - 0
        scoreDelBase: 0
        outFilterMatchNmin: 0
        twopass1readsN: 0
        outSAMunmapped: None
        genome:
          size: 0
          secondaryFiles: []
          class: File
          path: genome.ext
        sjdbGTFtagExonParentTranscript: ''
        limitBAMsortRAM: 0
        alignEndsType: Local
        seedNoneLociPerWindow: 0
        rg_sample_id: rg_sample
        sjdbGTFtagExonParentGene: ''
        chimScoreMin: 0
        outSJfilterIntronMaxVsReadN:
        - 0
        twopassMode: Basic
        alignSplicedMateMapLmin: 0
        outSJfilterReads: All
        outSAMprimaryFlag: OneBestScore
        outSJfilterCountTotalMin:
        - 3
        - 1
        - 1
        - 1
        outSAMorder: Paired
        outSAMflagAND: 0
        chimScoreSeparation: 0
        alignSJoverhangMin: 0
        outFilterScoreMin: 0
        seedSearchStartLmax: 0
        scoreGapGCAG: 0
        scoreGenomicLengthLog2scale: 0
        outFilterIntronMotifs: None
        outFilterMismatchNmax: 0
        reads:
        - size: 0
          secondaryFiles: []
          class: File
          metadata:
            format: fastq
            paired_end: '1'
            seq_center: illumina
          path: /test-data/mate_1.fastq.bz2
        scoreGap: 0
        outSJfilterOverhangMin:
        - 30
        - 12
        - 12
        - 12
        outSAMflagOR: 0
        outSAMmode: Full
        rg_library_id: ''
        chimScoreJunctionNonGTAG: 0
        scoreInsOpen: 0
        clip3pAdapterSeq:
        - clip3pAdapterSeq
        chimScoreDropMax: 0
        outFilterType: Normal
        scoreGapATAC: 0
        rg_platform: Ion Torrent PGM
        clip3pAdapterMMp:
        - 0
        sjdbGTFfeatureExon: ''
        outQSconversionAdd: 0
        quantMode: TranscriptomeSAM
        alignIntronMin: 0
        scoreInsBase: 0
        scoreGapNoncan: 0
        seedSearchLmax: 0
        outSJfilterDistToOtherSJmin:
        - 0
        outFilterScoreMinOverLread: 0
        alignSJDBoverhangMin: 0
        limitOutSJcollapsed: 0
        winAnchorMultimapNmax: 0
        outFilterMismatchNoverLmax: 0
        rg_seq_center: ''
        outSAMheaderHD: outSAMheaderHD
        chimOutType: Within
        quantTranscriptomeBan: IndelSoftclipSingleend
        limitOutSJoneRead: 0
        alignTranscriptsPerWindowNmax: 0
        sjdbOverhang: ~
        outReadsUnmapped: Fastx
        scoreStitchSJshift: 0
        seedPerWindowNmax: 0
        outSJfilterCountUniqueMin:
        - 3
        - 1
        - 1
        - 1
        scoreDelOpen: 0
        sjdbGTFfile:
        - path: /demo/test-data/chr20.gtf
        clip3pNbases:
        - 0
        - 3
        winBinNbits: 0
        sjdbScore: ~
        seedSearchStartLmaxOverLread: 0
        alignIntronMax: 0
        seedPerReadNmax: 0
        outFilterMismatchNoverReadLmax: 0
        winFlankNbins: 0
        sjdbGTFchrPrefix: chrPrefix
        alignSoftClipAtReferenceEnds: 'Yes'
        outSAMreadID: Standard
        outSAMtype: BAM
        chimJunctionOverhangMin: 0
        limitSjdbInsertNsj: 0
        outSAMmapqUnique: 0
    sbg:toolAuthor: Alexander Dobin/CSHL
    sbg:createdOn: 1450911471
    sbg:categories:
    - Alignment
    sbg:contributors:
    - ana_d
    - bix-demo
    - uros_sipetic
    sbg:links:
    - id: https://github.com/alexdobin/STAR
      label: Homepage
    - id: https://github.com/alexdobin/STAR/releases
      label: Releases
    - id: https://github.com/alexdobin/STAR/blob/master/doc/STARmanual.pdf
      label: Manual
    - id: https://groups.google.com/forum/#!forum/rna-star
      label: Support
    - id: http://www.ncbi.nlm.nih.gov/pubmed/23104886
      label: Publication
    sbg:project: bix-demo/star-2-4-2a-demo
    sbg:createdBy: bix-demo
    sbg:toolkitVersion: 2.4.2a
    sbg:id: sevenbridges/public-apps/star/4
    sbg:license: GNU General Public License v3.0 only
    sbg:revision: 4
    sbg:cmdPreview: tar -xvf genome.ext && /opt/STAR --runThreadN 15  --readFilesCommand
      bzcat  --sjdbGTFfile /demo/test-data/chr20.gtf  --sjdbGTFchrPrefix chrPrefix
      --sjdbInsertSave Basic  --twopass1readsN 0  --chimOutType WithinBAM  --outSAMattrRGline
      ID:1 CN:illumina PI:rg_mfl PL:Ion_Torrent_PGM PU:rg_platform_unit SM:rg_sample  --quantMode
      TranscriptomeSAM --outFileNamePrefix ./mate_1.fastq.bz2.  --readFilesIn /test-data/mate_1.fastq.bz2  &&
      tar -vcf mate_1.fastq.bz2._STARgenome.tar ./mate_1.fastq.bz2._STARgenome  &&
      mv mate_1.fastq.bz2.Unmapped.out.mate1 mate_1.fastq.bz2.Unmapped.out.mate1.fastq
    sbg:modifiedOn: 1462889222
    sbg:modifiedBy: ana_d
    sbg:revisionsInfo:
    - sbg:modifiedBy: bix-demo
      sbg:modifiedOn: 1450911471
      sbg:revision: 0
    - sbg:modifiedBy: bix-demo
      sbg:modifiedOn: 1450911473
      sbg:revision: 1
    - sbg:modifiedBy: bix-demo
      sbg:modifiedOn: 1450911475
      sbg:revision: 2
    - sbg:modifiedBy: uros_sipetic
      sbg:modifiedOn: 1462878528
      sbg:revision: 3
    - sbg:modifiedBy: ana_d
      sbg:modifiedOn: 1462889222
      sbg:revision: 4
    sbg:toolkit: STAR
    id: sevenbridges/public-apps/star/4
    inputs:
    - type:
      - 'null'
      - int
      label: Flanking regions size
      description: =log2(winFlank), where winFlank is the size of the left and right
        flanking regions for each window (int>0).
      streamable: no
      id: '#winFlankNbins'
      inputBinding:
        position: 0
        prefix: --winFlankNbins
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Windows, Anchors, Binning
      sbg:includeInPorts: yes
      sbg:toolDefaultValue: '4'
      required: no
    - type:
      - 'null'
      - int
      label: Bin size
      description: =log2(winBin), where winBin is the size of the bin for the windows/clustering,
        each window will occupy an integer number of bins (int>0).
      streamable: no
      id: '#winBinNbits'
      inputBinding:
        position: 0
        prefix: --winBinNbits
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Windows, Anchors, Binning
      sbg:includeInPorts: yes
      sbg:toolDefaultValue: '16'
      required: no
    - type:
      - 'null'
      - int
      label: Max loci anchors
      description: Maximum number of loci that anchors are allowed to map to (int>0).
      streamable: no
      id: '#winAnchorMultimapNmax'
      inputBinding:
        position: 0
        prefix: --winAnchorMultimapNmax
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Windows, Anchors, Binning
      sbg:toolDefaultValue: '50'
      required: no
    - type:
      - 'null'
      - int
      label: Max bins between anchors
      description: Max number of bins between two anchors that allows aggregation
        of anchors into one window (int>0).
      streamable: no
      id: '#winAnchorDistNbins'
      inputBinding:
        position: 0
        prefix: --winAnchorDistNbins
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Windows, Anchors, Binning
      sbg:toolDefaultValue: '9'
      required: no
    - type:
      - 'null'
      - name: twopassMode
        symbols:
        - None
        - Basic
        type: enum
      label: Two-pass mode
      description: '2-pass mapping mode. None: 1-pass mapping; Basic: basic 2-pass
        mapping, with all 1st pass junctions inserted into the genome indices on the
        fly.'
      streamable: no
      id: '#twopassMode'
      inputBinding:
        position: 0
        prefix: --twopassMode
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: 2-pass mapping
      sbg:toolDefaultValue: None
      required: no
    - type:
      - 'null'
      - int
      label: Reads to process in 1st step
      description: 'Number of reads to process for the 1st step. 0: 1-step only, no
        2nd pass; use a very large number to map all reads in the first step (int>0).'
      streamable: no
      id: '#twopass1readsN'
      sbg:category: 2-pass mapping
      sbg:toolDefaultValue: '-1'
      required: no
    - type:
      - 'null'
      - int
      label: Extra alignment score
      description: Extra alignment score for alignments that cross database junctions.
      streamable: no
      id: '#sjdbScore'
      sbg:category: Splice junctions database
      sbg:toolDefaultValue: '2'
      required: no
    - type:
      - 'null'
      - int
      label: '"Overhang" length'
      description: Length of the donor/acceptor sequence on each side of the junctions,
        ideally = (mate_length - 1) (int >= 0), if int = 0, splice junction database
        is not used.
      streamable: no
      id: '#sjdbOverhang'
      sbg:category: Splice junctions database
      sbg:toolDefaultValue: '100'
      required: no
    - type:
      - 'null'
      - name: sjdbInsertSave
        symbols:
        - Basic
        - All
        - None
        type: enum
      label: Save junction files
      description: 'Which files to save when sjdb junctions are inserted on the fly
        at the mapping step. None: not saving files at all; Basic: only small junction/transcript
        files; All: all files including big Genome, SA and SAindex. These files are
        output as an archive.'
      streamable: no
      id: '#sjdbInsertSave'
      sbg:category: Splice junctions database
      sbg:toolDefaultValue: None
      required: no
    - type:
      - 'null'
      - string
      label: Exons' parents name
      description: Tag name to be used as exons’ transcript-parents.
      streamable: no
      id: '#sjdbGTFtagExonParentTranscript'
      sbg:category: Splice junctions database
      sbg:toolDefaultValue: transcript_id
      required: no
    - type:
      - 'null'
      - string
      label: Gene name
      description: Tag name to be used as exons’ gene-parents.
      streamable: no
      id: '#sjdbGTFtagExonParentGene'
      sbg:category: Splice junctions database
      sbg:toolDefaultValue: gene_id
      required: no
    - type:
      - 'null'
      - items: File
        type: array
      label: Splice junction file
      description: Gene model annotations and/or known transcripts. No need to include
        this input, except in case of using "on the fly" annotations.
      streamable: no
      id: '#sjdbGTFfile'
      sbg:category: Basic
      sbg:fileTypes: GTF, GFF, TXT
      required: no
    - type:
      - 'null'
      - string
      label: Set exons feature
      description: Feature type in GTF file to be used as exons for building transcripts.
      streamable: no
      id: '#sjdbGTFfeatureExon'
      sbg:category: Splice junctions database
      sbg:toolDefaultValue: exon
      required: no
    - type:
      - 'null'
      - string
      label: Chromosome names
      description: Prefix for chromosome names in a GTF file (e.g. 'chr' for using
        ENSEMBL annotations with UCSC genomes).
      streamable: no
      id: '#sjdbGTFchrPrefix'
      sbg:category: Splice junctions database
      sbg:toolDefaultValue: '-'
      required: no
    - type:
      - 'null'
      - float
      label: Search start point normalized
      description: seedSearchStartLmax normalized to read length (sum of mates' lengths
        for paired-end reads).
      streamable: no
      id: '#seedSearchStartLmaxOverLread'
      inputBinding:
        position: 0
        prefix: --seedSearchStartLmaxOverLread
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Alignments and Seeding
      sbg:toolDefaultValue: '1.0'
      required: no
    - type:
      - 'null'
      - int
      label: Search start point
      description: Defines the search start point through the read - the read is split
        into pieces no longer than this value (int>0).
      streamable: no
      id: '#seedSearchStartLmax'
      inputBinding:
        position: 0
        prefix: --seedSearchStartLmax
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Alignments and Seeding
      sbg:toolDefaultValue: '50'
      required: no
    - type:
      - 'null'
      - int
      label: Max seed length
      description: Defines the maximum length of the seeds, if =0 max seed length
        is infinite (int>=0).
      streamable: no
      id: '#seedSearchLmax'
      inputBinding:
        position: 0
        prefix: --seedSearchLmax
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Alignments and Seeding
      sbg:toolDefaultValue: '0'
      required: no
    - type:
      - 'null'
      - int
      label: Max seeds per window
      description: Max number of seeds per window (int>=0).
      streamable: no
      id: '#seedPerWindowNmax'
      inputBinding:
        position: 0
        prefix: --seedPerWindowNmax
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Alignments and Seeding
      sbg:toolDefaultValue: '50'
      required: no
    - type:
      - 'null'
      - int
      label: Max seeds per read
      description: Max number of seeds per read (int>=0).
      streamable: no
      id: '#seedPerReadNmax'
      inputBinding:
        position: 0
        prefix: --seedPerReadNmax
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Alignments and Seeding
      sbg:toolDefaultValue: '1000'
      required: no
    - type:
      - 'null'
      - int
      label: Max one-seed loci per window
      description: Max number of one seed loci per window (int>=0).
      streamable: no
      id: '#seedNoneLociPerWindow'
      inputBinding:
        position: 0
        prefix: --seedNoneLociPerWindow
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Alignments and Seeding
      sbg:toolDefaultValue: '10'
      required: no
    - type:
      - 'null'
      - int
      label: Filter pieces for stitching
      description: Only pieces that map fewer than this value are utilized in the
        stitching procedure (int>=0).
      streamable: no
      id: '#seedMultimapNmax'
      inputBinding:
        position: 0
        prefix: --seedMultimapNmax
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Alignments and Seeding
      sbg:toolDefaultValue: '10000'
      required: no
    - type:
      - 'null'
      - int
      label: Max score reduction
      description: Maximum score reduction while searching for SJ boundaries in the
        stitching step.
      streamable: no
      id: '#scoreStitchSJshift'
      inputBinding:
        position: 0
        prefix: --scoreStitchSJshift
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Scoring
      sbg:toolDefaultValue: '1'
      required: no
    - type:
      - 'null'
      - int
      label: Insertion Open Penalty
      description: Insertion open penalty.
      streamable: no
      id: '#scoreInsOpen'
      inputBinding:
        position: 0
        prefix: --scoreInsOpen
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Scoring
      sbg:toolDefaultValue: '-2'
      required: no
    - type:
      - 'null'
      - int
      label: Insertion extension penalty
      description: Insertion extension penalty per base (in addition to --scoreInsOpen).
      streamable: no
      id: '#scoreInsBase'
      inputBinding:
        position: 0
        prefix: --scoreInsBase
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Scoring
      sbg:toolDefaultValue: '-2'
      required: no
    - type:
      - 'null'
      - float
      label: Log scaled score
      description: 'Extra score logarithmically scaled with genomic length of the
        alignment: <int>*log2(genomicLength).'
      streamable: no
      id: '#scoreGenomicLengthLog2scale'
      inputBinding:
        position: 0
        prefix: --scoreGenomicLengthLog2scale
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Scoring
      sbg:toolDefaultValue: '-0.25'
      required: no
    - type:
      - 'null'
      - int
      label: Non-canonical gap open
      description: Non-canonical gap open penalty (in addition to --scoreGap).
      streamable: no
      id: '#scoreGapNoncan'
      inputBinding:
        position: 0
        prefix: --scoreGapNoncan
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Scoring
      sbg:toolDefaultValue: '-8'
      required: no
    - type:
      - 'null'
      - int
      label: GC/AG and CT/GC gap open
      description: GC/AG and CT/GC gap open penalty (in addition to --scoreGap).
      streamable: no
      id: '#scoreGapGCAG'
      inputBinding:
        position: 0
        prefix: --scoreGapGCAG
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Scoring
      sbg:toolDefaultValue: '-4'
      required: no
    - type:
      - 'null'
      - int
      label: AT/AC and GT/AT gap open
      description: AT/AC and GT/AT gap open penalty (in addition to --scoreGap).
      streamable: no
      id: '#scoreGapATAC'
      inputBinding:
        position: 0
        prefix: --scoreGapATAC
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Scoring
      sbg:toolDefaultValue: '-8'
      required: no
    - type:
      - 'null'
      - int
      label: Gap open penalty
      description: Gap open penalty.
      streamable: no
      id: '#scoreGap'
      inputBinding:
        position: 0
        prefix: --scoreGap
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Scoring
      sbg:toolDefaultValue: '0'
      required: no
    - type:
      - 'null'
      - int
      label: Deletion open penalty
      description: Deletion open penalty.
      streamable: no
      id: '#scoreDelOpen'
      inputBinding:
        position: 0
        prefix: --scoreDelOpen
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Scoring
      sbg:toolDefaultValue: '-2'
      required: no
    - type:
      - 'null'
      - int
      label: Deletion extension penalty
      description: Deletion extension penalty per base (in addition to --scoreDelOpen).
      streamable: no
      id: '#scoreDelBase'
      inputBinding:
        position: 0
        prefix: --scoreDelBase
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Scoring
      sbg:toolDefaultValue: '-2'
      required: no
    - type:
      - 'null'
      - string
      label: Sequencing center
      description: Specify the sequencing center for RG line.
      streamable: no
      id: '#rg_seq_center'
      sbg:category: Read group
      sbg:toolDefaultValue: Inferred from metadata
      required: no
    - type:
      - 'null'
      - string
      label: Sample ID
      description: Specify the sample ID for RG line.
      streamable: no
      id: '#rg_sample_id'
      sbg:category: Read group
      sbg:toolDefaultValue: Inferred from metadata
      required: no
    - type:
      - 'null'
      - string
      label: Platform unit ID
      description: Specify the platform unit ID for RG line.
      streamable: no
      id: '#rg_platform_unit_id'
      sbg:category: Read group
      sbg:toolDefaultValue: Inferred from metadata
      required: no
    - type:
      - 'null'
      - name: rg_platform
        symbols:
        - LS 454
        - Helicos
        - Illumina
        - ABI SOLiD
        - Ion Torrent PGM
        - PacBio
        type: enum
      label: Platform
      description: Specify the version of the technology that was used for sequencing
        or assaying.
      streamable: no
      id: '#rg_platform'
      sbg:category: Read group
      sbg:toolDefaultValue: Inferred from metadata
      required: no
    - type:
      - 'null'
      - string
      label: Median fragment length
      description: Specify the median fragment length for RG line.
      streamable: no
      id: '#rg_mfl'
      sbg:category: Read group
      sbg:toolDefaultValue: Inferred from metadata
      required: no
    - type:
      - 'null'
      - string
      label: Library ID
      description: Specify the library ID for RG line.
      streamable: no
      id: '#rg_library_id'
      sbg:category: Read group
      sbg:toolDefaultValue: Inferred from metadata
      required: no
    - type:
      - items: File
        type: array
      label: Read sequence
      description: Read sequence.
      streamable: no
      id: '#reads'
      inputBinding:
        position: 10
        separate: yes
        itemSeparator: ' '
        valueFrom:
          engine: '#cwl-js-engine'
          script: "{\t\n  var list = [].concat($job.inputs.reads)\n  \n  var resp
            = []\n  \n  if (list.length == 1){\n    resp.push(list[0].path)\n    \n
            \ }else if (list.length == 2){    \n    \n    left = \"\"\n    right =
            \"\"\n      \n    for (index = 0; index < list.length; ++index) {\n      \n
            \     if (list[index].metadata != null){\n        if (list[index].metadata.paired_end
            == 1){\n          left = list[index].path\n        }else if (list[index].metadata.paired_end
            == 2){\n          right = list[index].path\n        }\n      }\n    }\n
            \   \n    if (left != \"\" && right != \"\"){      \n      resp.push(left)\n
            \     resp.push(right)\n    }\n  }\n  else if (list.length > 2){\n    left
            = []\n    right = []\n      \n    for (index = 0; index < list.length;
            ++index) {\n      \n      if (list[index].metadata != null){\n        if
            (list[index].metadata.paired_end == 1){\n          left.push(list[index].path)\n
            \       }else if (list[index].metadata.paired_end == 2){\n          right.push(list[index].path)\n
            \       }\n      }\n    }\n    left_join = left.join()\n    right_join
            = right.join()\n    if (left != [] && right != []){      \n      resp.push(left_join)\n
            \     resp.push(right_join)\n    }\t\n  }\n  \n  if(resp.length > 0){
            \   \n    return \"--readFilesIn \".concat(resp.join(\" \"))\n  }\n}"
          class: Expression
        sbg:cmdInclude: yes
      sbg:category: Basic
      sbg:fileTypes: FASTA, FASTQ, FA, FQ, FASTQ.GZ, FQ.GZ, FASTQ.BZ2, FQ.BZ2
      required: yes
    - type:
      - 'null'
      - name: readMatesLengthsIn
        symbols:
        - NotEqual
        - Equal
        type: enum
      label: Reads lengths
      description: Equal/Not equal - lengths of names, sequences, qualities for both
        mates are the same/not the same. "Not equal" is safe in all situations.
      streamable: no
      id: '#readMatesLengthsIn'
      inputBinding:
        position: 0
        prefix: --readMatesLengthsIn
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Read parameters
      sbg:toolDefaultValue: NotEqual
      required: no
    - type:
      - 'null'
      - int
      label: Reads to map
      description: Number of reads to map from the beginning of the file.
      streamable: no
      id: '#readMapNumber'
      inputBinding:
        position: 0
        prefix: --readMapNumber
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Read parameters
      sbg:toolDefaultValue: '-1'
      required: no
    - type:
      - 'null'
      - name: quantTranscriptomeBan
        symbols:
        - IndelSoftclipSingleend
        - Singleend
        type: enum
      label: Prohibit alignment type
      description: 'Prohibit various alignment types. IndelSoftclipSingleend: prohibit
        indels, soft clipping and single-end alignments - compatible with RSEM; Singleend:
        prohibit single-end alignments.'
      streamable: no
      id: '#quantTranscriptomeBan'
      inputBinding:
        position: 0
        prefix: --quantTranscriptomeBan
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Quantification of Annotations
      sbg:toolDefaultValue: IndelSoftclipSingleend
      required: no
    - type:
      - 'null'
      - name: quantMode
        symbols:
        - TranscriptomeSAM
        - GeneCounts
        type: enum
      label: Quantification mode
      description: Types of quantification requested. 'TranscriptomeSAM' option outputs
        SAM/BAM alignments to transcriptome into a separate file. With 'GeneCounts'
        option, STAR will count number of reads per gene while mapping.
      streamable: no
      id: '#quantMode'
      sbg:category: Quantification of Annotations
      sbg:toolDefaultValue: '-'
      required: no
    - type:
      - 'null'
      - name: outSortingType
        symbols:
        - Unsorted
        - SortedByCoordinate
        - Unsorted SortedByCoordinate
        type: enum
      label: Output sorting type
      description: Type of output sorting.
      streamable: no
      id: '#outSortingType'
      sbg:category: Output
      sbg:toolDefaultValue: SortedByCoordinate
      required: no
    - type:
      - 'null'
      - name: outSJfilterReads
        symbols:
        - All
        - Unique
        type: enum
      label: Collapsed junctions reads
      description: 'Which reads to consider for collapsed splice junctions output.
        All: all reads, unique- and multi-mappers; Unique: uniquely mapping reads
        only.'
      streamable: no
      id: '#outSJfilterReads'
      inputBinding:
        position: 0
        prefix: --outSJfilterReads
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: 'Output filtering: splice junctions'
      sbg:toolDefaultValue: All
      required: no
    - type:
      - 'null'
      - items: int
        type: array
      label: Min overhang SJ
      description: Minimum overhang length for splice junctions on both sides for
        each of the motifs. To set no output for desired motif, assign -1 to the corresponding
        field. Does not apply to annotated junctions.
      streamable: no
      id: '#outSJfilterOverhangMin'
      inputBinding:
        position: 0
        prefix: --outSJfilterOverhangMin
        separate: yes
        itemSeparator: ' '
        sbg:cmdInclude: yes
      sbg:category: 'Output filtering: splice junctions'
      sbg:toolDefaultValue: 30 12 12 12
      required: no
    - type:
      - 'null'
      - items: int
        type: array
      label: Max gap allowed
      description: 'Maximum gap allowed for junctions supported by 1,2,3...N reads
        (int >= 0) i.e. by default junctions supported by 1 read can have gaps <=50000b,
        by 2 reads: <=100000b, by 3 reads: <=200000. By 4 or more reads: any gap <=alignIntronMax.
        Does not apply to annotated junctions.'
      streamable: no
      id: '#outSJfilterIntronMaxVsReadN'
      inputBinding:
        position: 0
        prefix: --outSJfilterIntronMaxVsReadN
        separate: yes
        itemSeparator: ' '
        sbg:cmdInclude: yes
      sbg:category: 'Output filtering: splice junctions'
      sbg:toolDefaultValue: 50000 100000 200000
      required: no
    - type:
      - 'null'
      - items: int
        type: array
      label: Min distance to other donor/acceptor
      description: Minimum allowed distance to other junctions' donor/acceptor for
        each of the motifs (int >= 0). Does not apply to annotated junctions.
      streamable: no
      id: '#outSJfilterDistToOtherSJmin'
      inputBinding:
        position: 0
        prefix: --outSJfilterDistToOtherSJmin
        separate: yes
        itemSeparator: ' '
        sbg:cmdInclude: yes
      sbg:category: 'Output filtering: splice junctions'
      sbg:toolDefaultValue: 10 0 5 10
      required: no
    - type:
      - 'null'
      - items: int
        type: array
      label: Min unique count
      description: Minimum uniquely mapping read count per junction for each of the
        motifs. To set no output for desired motif, assign -1 to the corresponding
        field. Junctions are output if one of --outSJfilterCountUniqueMin OR --outSJfilterCountTotalMin
        conditions are satisfied. Does not apply to annotated junctions.
      streamable: no
      id: '#outSJfilterCountUniqueMin'
      inputBinding:
        position: 0
        prefix: --outSJfilterCountUniqueMin
        separate: yes
        itemSeparator: ' '
        sbg:cmdInclude: yes
      sbg:category: 'Output filtering: splice junctions'
      sbg:toolDefaultValue: 3 1 1 1
      required: no
    - type:
      - 'null'
      - items: int
        type: array
      label: Min total count
      description: Minimum total (multi-mapping+unique) read count per junction for
        each of the motifs. To set no output for desired motif, assign -1 to the corresponding
        field. Junctions are output if one of --outSJfilterCountUniqueMin OR --outSJfilterCountTotalMin
        conditions are satisfied. Does not apply to annotated junctions.
      streamable: no
      id: '#outSJfilterCountTotalMin'
      inputBinding:
        position: 0
        prefix: --outSJfilterCountTotalMin
        separate: yes
        itemSeparator: ' '
        sbg:cmdInclude: yes
      sbg:category: 'Output filtering: splice junctions'
      sbg:toolDefaultValue: 3 1 1 1
      required: no
    - type:
      - 'null'
      - name: outSAMunmapped
        symbols:
        - None
        - Within
        type: enum
      label: Write unmapped in SAM
      description: 'Output of unmapped reads in the SAM format. None: no output; Within:
        output unmapped reads within the main SAM file (i.e. Aligned.out.sam).'
      streamable: no
      id: '#outSAMunmapped'
      inputBinding:
        position: 0
        prefix: --outSAMunmapped
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output
      sbg:toolDefaultValue: None
      required: no
    - type:
      - 'null'
      - name: outSAMtype
        symbols:
        - SAM
        - BAM
        type: enum
      label: Output format
      description: Format of output alignments.
      streamable: no
      id: '#outSAMtype'
      inputBinding:
        position: 0
        separate: yes
        valueFrom:
          engine: '#cwl-js-engine'
          script: |-
            {
              SAM_type = $job.inputs.outSAMtype
              SORT_type = $job.inputs.outSortingType
              if (SAM_type && SORT_type) {
                return "--outSAMtype ".concat(SAM_type, " ", SORT_type)
              }
            }
          class: Expression
        sbg:cmdInclude: yes
      sbg:category: Output
      sbg:toolDefaultValue: SAM
      required: no
    - type:
      - 'null'
      - name: outSAMstrandField
        symbols:
        - None
        - intronMotif
        type: enum
      label: Strand field flag
      description: 'Cufflinks-like strand field flag. None: not used; intronMotif:
        strand derived from the intron motif. Reads with inconsistent and/or non-canonical
        introns are filtered out.'
      streamable: no
      id: '#outSAMstrandField'
      inputBinding:
        position: 0
        prefix: --outSAMstrandField
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output
      sbg:toolDefaultValue: None
      required: no
    - type:
      - 'null'
      - name: outSAMreadID
        symbols:
        - Standard
        - Number
        type: enum
      label: Read ID
      description: 'Read ID record type. Standard: first word (until space) from the
        FASTx read ID line, removing /1,/2 from the end; Number: read number (index)
        in the FASTx file.'
      streamable: no
      id: '#outSAMreadID'
      inputBinding:
        position: 0
        prefix: --outSAMreadID
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output
      sbg:toolDefaultValue: Standard
      required: no
    - type:
      - 'null'
      - name: outSAMprimaryFlag
        symbols:
        - OneBestScore
        - AllBestScore
        type: enum
      label: Primary alignments
      description: 'Which alignments are considered primary - all others will be marked
        with 0x100 bit in the FLAG. OneBestScore: only one alignment with the best
        score is primary; AllBestScore: all alignments with the best score are primary.'
      streamable: no
      id: '#outSAMprimaryFlag'
      inputBinding:
        position: 0
        prefix: --outSAMprimaryFlag
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output
      sbg:toolDefaultValue: OneBestScore
      required: no
    - type:
      - 'null'
      - name: outSAMorder
        symbols:
        - Paired
        - PairedKeepInputOrder
        type: enum
      label: Sorting in SAM
      description: 'Type of sorting for the SAM output. Paired: one mate after the
        other for all paired alignments; PairedKeepInputOrder: one mate after the
        other for all paired alignments, the order is kept the same as in the input
        FASTQ files.'
      streamable: no
      id: '#outSAMorder'
      inputBinding:
        position: 0
        prefix: --outSAMorder
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output
      sbg:toolDefaultValue: Paired
      required: no
    - type:
      - 'null'
      - name: outSAMmode
        symbols:
        - Full
        - NoQS
        type: enum
      label: SAM mode
      description: 'Mode of SAM output. Full: full SAM output; NoQS: full SAM but
        without quality scores.'
      streamable: no
      id: '#outSAMmode'
      inputBinding:
        position: 0
        prefix: --outSAMmode
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output
      sbg:toolDefaultValue: Full
      required: no
    - type:
      - 'null'
      - int
      label: MAPQ value
      description: MAPQ value for unique mappers (0 to 255).
      streamable: no
      id: '#outSAMmapqUnique'
      inputBinding:
        position: 0
        prefix: --outSAMmapqUnique
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output
      sbg:toolDefaultValue: '255'
      required: no
    - type:
      - 'null'
      - string
      label: SAM header @PG
      description: Extra @PG (software) line of the SAM header (in addition to STAR).
      streamable: no
      id: '#outSAMheaderPG'
      inputBinding:
        position: 0
        prefix: --outSAMheaderPG
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output
      sbg:toolDefaultValue: '-'
      required: no
    - type:
      - 'null'
      - string
      label: SAM header @HD
      description: '@HD (header) line of the SAM header.'
      streamable: no
      id: '#outSAMheaderHD'
      inputBinding:
        position: 0
        prefix: --outSAMheaderHD
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output
      sbg:toolDefaultValue: '-'
      required: no
    - type:
      - 'null'
      - int
      label: OR SAM flag
      description: Set specific bits of the SAM FLAG.
      streamable: no
      id: '#outSAMflagOR'
      inputBinding:
        position: 0
        prefix: --outSAMflagOR
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output
      sbg:toolDefaultValue: '0'
      required: no
    - type:
      - 'null'
      - int
      label: AND SAM flag
      description: Unset specific bits of the SAM FLAG (the FLAG is bitwise AND-ed with this value).
      streamable: no
      id: '#outSAMflagAND'
      inputBinding:
        position: 0
        prefix: --outSAMflagAND
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output
      sbg:toolDefaultValue: '65535'
      required: no
    - type:
      - 'null'
      - name: outSAMattributes
        symbols:
        - Standard
        - NH
        - All
        - None
        type: enum
      label: SAM attributes
      description: 'Desired SAM attributes, in the order desired for the output SAM.
        NH: any combination in any order; Standard: NH HI AS nM; All: NH HI AS nM
        NM MD jM jI; None: no attributes.'
      streamable: no
      id: '#outSAMattributes'
      inputBinding:
        position: 0
        prefix: --outSAMattributes
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output
      sbg:toolDefaultValue: Standard
      required: no
    - type:
      - 'null'
      - name: outReadsUnmapped
        symbols:
        - None
        - Fastx
        type: enum
      label: Output unmapped reads
      description: 'Output of unmapped reads (besides SAM). None: no output; Fastx:
        output in separate fasta/fastq files, Unmapped.out.mate1/2.'
      streamable: no
      id: '#outReadsUnmapped'
      inputBinding:
        position: 0
        prefix: --outReadsUnmapped
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output
      sbg:toolDefaultValue: None
      required: no
    - type:
      - 'null'
      - int
      label: Quality conversion
      description: Add this number to the quality score (e.g. to convert from Illumina
        to Sanger, use -31).
      streamable: no
      id: '#outQSconversionAdd'
      inputBinding:
        position: 0
        prefix: --outQSconversionAdd
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output
      sbg:toolDefaultValue: '0'
      required: no
    - type:
      - 'null'
      - name: outFilterType
        symbols:
        - Normal
        - BySJout
        type: enum
      label: Filtering type
      description: 'Type of filtering. Normal: standard filtering using only current
        alignment; BySJout: keep only those reads that contain junctions that passed
        filtering into SJ.out.tab.'
      streamable: no
      id: '#outFilterType'
      inputBinding:
        position: 0
        prefix: --outFilterType
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output filtering
      sbg:toolDefaultValue: Normal
      required: no
    - type:
      - 'null'
      - float
      label: Min score normalized
      description: '''Minimum score'' normalized to read length (sum of mates'' lengths
        for paired-end reads).'
      streamable: no
      id: '#outFilterScoreMinOverLread'
      inputBinding:
        position: 0
        prefix: --outFilterScoreMinOverLread
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output filtering
      sbg:toolDefaultValue: '0.66'
      required: no
    - type:
      - 'null'
      - int
      label: Min score
      description: Alignment will be output only if its score is higher than this
        value.
      streamable: no
      id: '#outFilterScoreMin'
      inputBinding:
        position: 0
        prefix: --outFilterScoreMin
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output filtering
      sbg:toolDefaultValue: '0'
      required: no
    - type:
      - 'null'
      - int
      label: Multimapping score range
      description: The score range below the maximum score for multimapping alignments.
      streamable: no
      id: '#outFilterMultimapScoreRange'
      inputBinding:
        position: 0
        prefix: --outFilterMultimapScoreRange
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output filtering
      sbg:toolDefaultValue: '1'
      required: no
    - type:
      - 'null'
      - int
      label: Max number of mappings
      description: Read alignments will be output only if the read maps fewer than
        this value, otherwise no alignments will be output.
      streamable: no
      id: '#outFilterMultimapNmax'
      inputBinding:
        position: 0
        prefix: --outFilterMultimapNmax
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output filtering
      sbg:toolDefaultValue: '10'
      required: no
    - type:
      - 'null'
      - float
      label: Mismatches to *read* length
      description: Alignment will be output only if its ratio of mismatches to *read*
        length is less than this value.
      streamable: no
      id: '#outFilterMismatchNoverReadLmax'
      inputBinding:
        position: 0
        prefix: --outFilterMismatchNoverReadLmax
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output filtering
      sbg:toolDefaultValue: '1'
      required: no
    - type:
      - 'null'
      - float
      label: Mismatches to *mapped* length
      description: Alignment will be output only if its ratio of mismatches to *mapped*
        length is less than this value.
      streamable: no
      id: '#outFilterMismatchNoverLmax'
      inputBinding:
        position: 0
        prefix: --outFilterMismatchNoverLmax
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output filtering
      sbg:toolDefaultValue: '0.3'
      required: no
    - type:
      - 'null'
      - int
      label: Max number of mismatches
      description: Alignment will be output only if it has fewer mismatches than this
        value.
      streamable: no
      id: '#outFilterMismatchNmax'
      inputBinding:
        position: 0
        prefix: --outFilterMismatchNmax
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output filtering
      sbg:toolDefaultValue: '10'
      required: no
    - type:
      - 'null'
      - float
      label: Min matched bases normalized
      description: '''Minimum matched bases'' normalized to read length (sum of mates''
        lengths for paired-end reads).'
      streamable: no
      id: '#outFilterMatchNminOverLread'
      inputBinding:
        position: 0
        prefix: --outFilterMatchNminOverLread
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output filtering
      sbg:toolDefaultValue: '0.66'
      required: no
    - type:
      - 'null'
      - int
      label: Min matched bases
      description: Alignment will be output only if the number of matched bases is
        higher than this value.
      streamable: no
      id: '#outFilterMatchNmin'
      inputBinding:
        position: 0
        prefix: --outFilterMatchNmin
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output filtering
      sbg:toolDefaultValue: '0'
      required: no
    - type:
      - 'null'
      - name: outFilterIntronMotifs
        symbols:
        - None
        - RemoveNoncanonical
        - RemoveNoncanonicalUnannotated
        type: enum
      label: Motifs filtering
      description: 'Filter alignment using their motifs. None: no filtering; RemoveNoncanonical:
        filter out alignments that contain non-canonical junctions; RemoveNoncanonicalUnannotated:
        filter out alignments that contain non-canonical unannotated junctions when
        using annotated splice junctions database. The annotated non-canonical junctions
        will be kept.'
      streamable: no
      id: '#outFilterIntronMotifs'
      inputBinding:
        position: 0
        prefix: --outFilterIntronMotifs
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Output filtering
      sbg:toolDefaultValue: None
      required: no
    - type:
      - 'null'
      - int
      label: Max insert junctions
      description: Maximum number of junction to be inserted to the genome on the
        fly at the mapping stage, including those from annotations and those detected
        in the 1st step of the 2-pass run.
      streamable: no
      id: '#limitSjdbInsertNsj'
      inputBinding:
        position: 0
        prefix: --limitSjdbInsertNsj
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Limits
      sbg:toolDefaultValue: '1000000'
      required: no
    - type:
      - 'null'
      - int
      label: Junctions max number
      description: Max number of junctions for one read (including all multi-mappers).
      streamable: no
      id: '#limitOutSJoneRead'
      inputBinding:
        position: 0
        prefix: --limitOutSJoneRead
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Limits
      sbg:toolDefaultValue: '1000'
      required: no
    - type:
      - 'null'
      - int
      label: Collapsed junctions max number
      description: Max number of collapsed junctions.
      streamable: no
      id: '#limitOutSJcollapsed'
      inputBinding:
        position: 0
        prefix: --limitOutSJcollapsed
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Limits
      sbg:toolDefaultValue: '1000000'
      required: no
    - type:
      - 'null'
      - int
      label: Limit BAM sorting memory
      description: Maximum available RAM for sorting BAM. If set to 0, it will be
        set to the genome index size.
      streamable: no
      id: '#limitBAMsortRAM'
      inputBinding:
        position: 0
        prefix: --limitBAMsortRAM
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Limits
      sbg:toolDefaultValue: '0'
      required: no
    - type:
      - 'null'
      - string
      label: Genome dir name
      description: Name of the directory which contains genome files (when genome.tar
        is uncompressed).
      streamable: no
      id: '#genomeDirName'
      inputBinding:
        position: 0
        prefix: --genomeDir
        separate: yes
        valueFrom:
          engine: '#cwl-js-engine'
          script: $job.inputs.genomeDirName || "genomeDir"
          class: Expression
        sbg:cmdInclude: yes
      sbg:category: Basic
      sbg:toolDefaultValue: genomeDir
      required: no
    - type:
      - File
      label: Genome files
      description: Genome files created using STAR Genome Generate.
      streamable: no
      id: '#genome'
      sbg:category: Basic
      sbg:fileTypes: TAR
      required: yes
    - type:
      - 'null'
      - items: int
        type: array
      label: Clip 5p bases
      description: Number of bases to clip from 5p of each mate. In case only one
        value is given, it will be assumed the same for both mates.
      streamable: no
      id: '#clip5pNbases'
      inputBinding:
        position: 0
        prefix: --clip5pNbases
        separate: yes
        itemSeparator: ' '
        sbg:cmdInclude: yes
      sbg:category: Read parameters
      sbg:toolDefaultValue: '0'
      required: no
    - type:
      - 'null'
      - items: int
        type: array
      label: Clip 3p bases
      description: Number of bases to clip from 3p of each mate. In case only one
        value is given, it will be assumed the same for both mates.
      streamable: no
      id: '#clip3pNbases'
      inputBinding:
        position: 0
        prefix: --clip3pNbases
        separate: yes
        itemSeparator: ' '
        sbg:cmdInclude: yes
      sbg:category: Read parameters
      sbg:toolDefaultValue: '0'
      required: no
    - type:
      - 'null'
      - items: int
        type: array
      label: Clip 3p after adapter seq.
      description: Number of bases to clip from 3p of each mate after the adapter
        clipping. In case only one value is given, it will be assumed the same for
        both mates.
      streamable: no
      id: '#clip3pAfterAdapterNbases'
      inputBinding:
        position: 0
        prefix: --clip3pAfterAdapterNbases
        separate: yes
        itemSeparator: ' '
        sbg:cmdInclude: yes
      sbg:category: Read parameters
      sbg:toolDefaultValue: '0'
      required: no
    - type:
      - 'null'
      - items: string
        type: array
      label: Clip 3p adapter sequence
      description: Adapter sequence to clip from 3p of each mate. In case only one
        value is given, it will be assumed the same for both mates.
      streamable: no
      id: '#clip3pAdapterSeq'
      inputBinding:
        position: 0
        prefix: --clip3pAdapterSeq
        separate: yes
        itemSeparator: ' '
        sbg:cmdInclude: yes
      sbg:category: Read parameters
      sbg:toolDefaultValue: '-'
      required: no
    - type:
      - 'null'
      - items: float
        type: array
      label: Max mismatches proportions
      description: Max proportion of mismatches for 3p adapter clipping for each mate.
        In case only one value is given, it will be assumed the same for both mates.
      streamable: no
      id: '#clip3pAdapterMMp'
      inputBinding:
        position: 0
        prefix: --clip3pAdapterMMp
        separate: yes
        itemSeparator: ' '
        sbg:cmdInclude: yes
      sbg:category: Read parameters
      sbg:toolDefaultValue: '0.1'
      required: no
    - type:
      - 'null'
      - int
      label: Min segment length
      description: Minimum length of chimeric segment length, if =0, no chimeric output
        (int>=0).
      streamable: no
      id: '#chimSegmentMin'
      inputBinding:
        position: 0
        prefix: --chimSegmentMin
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Chimeric Alignments
      sbg:toolDefaultValue: '15'
      required: no
    - type:
      - 'null'
      - int
      label: Min separation score
      description: Minimum difference (separation) between the best chimeric score
        and the next one (int>=0).
      streamable: no
      id: '#chimScoreSeparation'
      inputBinding:
        position: 0
        prefix: --chimScoreSeparation
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Chimeric Alignments
      sbg:toolDefaultValue: '10'
      required: no
    - type:
      - 'null'
      - int
      label: Min total score
      description: Minimum total (summed) score of the chimeric segments (int>=0).
      streamable: no
      id: '#chimScoreMin'
      inputBinding:
        position: 0
        prefix: --chimScoreMin
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Chimeric Alignments
      sbg:toolDefaultValue: '0'
      required: no
    - type:
      - 'null'
      - int
      label: Non-GT/AG penalty
      description: Penalty for a non-GT/AG chimeric junction.
      streamable: no
      id: '#chimScoreJunctionNonGTAG'
      inputBinding:
        position: 0
        prefix: --chimScoreJunctionNonGTAG
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Chimeric Alignments
      sbg:toolDefaultValue: '-1'
      required: no
    - type:
      - 'null'
      - int
      label: Max drop score
      description: Max drop (difference) of chimeric score (the sum of scores of all
        chimeric segements) from the read length (int>=0).
      streamable: no
      id: '#chimScoreDropMax'
      inputBinding:
        position: 0
        prefix: --chimScoreDropMax
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Chimeric Alignments
      sbg:toolDefaultValue: '20'
      required: no
    - type:
      - 'null'
      - name: chimOutType
        symbols:
        - SeparateSAMold
        - Within
        type: enum
      label: Chimeric output type
      description: 'Type of chimeric output. SeparateSAMold: output old SAM into separate
        Chimeric.out.sam file; Within: output into main aligned SAM/BAM files.'
      streamable: no
      id: '#chimOutType'
      sbg:category: Chimeric Alignments
      sbg:toolDefaultValue: SeparateSAMold
      required: no
    - type:
      - 'null'
      - int
      label: Min junction overhang
      description: Minimum overhang for a chimeric junction (int>=0).
      streamable: no
      id: '#chimJunctionOverhangMin'
      inputBinding:
        position: 0
        prefix: --chimJunctionOverhangMin
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Chimeric Alignments
      sbg:toolDefaultValue: '20'
      required: no
    - type:
      - 'null'
      - float
      label: Max windows per read
      description: Max number of windows per read (int>0).
      streamable: no
      id: '#alignWindowsPerReadNmax'
      inputBinding:
        position: 0
        prefix: --alignWindowsPerReadNmax
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Alignments and Seeding
      sbg:toolDefaultValue: '10000'
      required: no
    - type:
      - 'null'
      - int
      label: Max transcripts per window
      description: Max number of transcripts per window (int>0).
      streamable: no
      id: '#alignTranscriptsPerWindowNmax'
      inputBinding:
        position: 0
        prefix: --alignTranscriptsPerWindowNmax
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Alignments and Seeding
      sbg:toolDefaultValue: '100'
      required: no
    - type:
      - 'null'
      - int
      label: Max transcripts per read
      description: Max number of different alignments per read to consider (int>0).
      streamable: no
      id: '#alignTranscriptsPerReadNmax'
      inputBinding:
        position: 0
        prefix: --alignTranscriptsPerReadNmax
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Alignments and Seeding
      sbg:toolDefaultValue: '10000'
      required: no
    - type:
      - 'null'
      - float
      label: Min mapped length normalized
      description: alignSplicedMateMapLmin normalized to mate length (float>0).
      streamable: no
      id: '#alignSplicedMateMapLminOverLmate'
      inputBinding:
        position: 0
        prefix: --alignSplicedMateMapLminOverLmate
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Alignments and Seeding
      sbg:toolDefaultValue: '0.66'
      required: no
    - type:
      - 'null'
      - int
      label: Min mapped length
      description: Minimum mapped length for a read mate that is spliced (int>0).
      streamable: no
      id: '#alignSplicedMateMapLmin'
      inputBinding:
        position: 0
        prefix: --alignSplicedMateMapLmin
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Alignments and Seeding
      sbg:toolDefaultValue: '0'
      required: no
    - type:
      - 'null'
      - name: alignSoftClipAtReferenceEnds
        symbols:
        - 'Yes'
        - 'No'
        type: enum
      label: Soft clipping
      description: 'Option which allows soft clipping of alignments at the reference
        (chromosome) ends. Can be disabled for compatibility with Cufflinks/Cuffmerge.
        Yes: Enables soft clipping; No: Disables soft clipping.'
      streamable: no
      id: '#alignSoftClipAtReferenceEnds'
      inputBinding:
        position: 0
        prefix: --alignSoftClipAtReferenceEnds
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Alignments and Seeding
      sbg:toolDefaultValue: 'Yes'
      required: no
    - type:
      - 'null'
      - int
      label: Min overhang
      description: Minimum overhang (i.e. block size) for spliced alignments (int>0).
      streamable: no
      id: '#alignSJoverhangMin'
      inputBinding:
        position: 0
        prefix: --alignSJoverhangMin
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Alignments and Seeding
      sbg:toolDefaultValue: '5'
      required: no
    - type:
      - 'null'
      - int
      label: 'Min overhang: annotated'
      description: Minimum overhang (i.e. block size) for annotated (sjdb) spliced
        alignments (int>0).
      streamable: no
      id: '#alignSJDBoverhangMin'
      inputBinding:
        position: 0
        prefix: --alignSJDBoverhangMin
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Alignments and Seeding
      sbg:toolDefaultValue: '3'
      required: no
    - type:
      - 'null'
      - int
      label: Max mates gap
      description: Maximum gap between two mates, if 0, max intron gap will be determined
        by (2^winBinNbits)*winAnchorDistNbins.
      streamable: no
      id: '#alignMatesGapMax'
      inputBinding:
        position: 0
        prefix: --alignMatesGapMax
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Alignments and Seeding
      sbg:toolDefaultValue: '0'
      required: no
    - type:
      - 'null'
      - int
      label: Min intron size
      description: 'Minimum intron size: genomic gap is considered intron if its length
        >= alignIntronMin, otherwise it is considered Deletion (int>=0).'
      streamable: no
      id: '#alignIntronMin'
      inputBinding:
        position: 0
        prefix: --alignIntronMin
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Alignments and Seeding
      sbg:toolDefaultValue: '21'
      required: no
    - type:
      - 'null'
      - int
      label: Max intron size
      description: Maximum intron size, if 0, max intron size will be determined by
        (2^winBinNbits)*winAnchorDistNbins.
      streamable: no
      id: '#alignIntronMax'
      inputBinding:
        position: 0
        prefix: --alignIntronMax
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Alignments and Seeding
      sbg:toolDefaultValue: '0'
      required: no
    - type:
      - 'null'
      - name: alignEndsType
        symbols:
        - Local
        - EndToEnd
        type: enum
      label: Alignment type
      description: 'Type of read ends alignment. Local: standard local alignment with
        soft-clipping allowed. EndToEnd: force end to end read alignment, do not soft-clip.'
      streamable: no
      id: '#alignEndsType'
      inputBinding:
        position: 0
        prefix: --alignEndsType
        separate: yes
        sbg:cmdInclude: yes
      sbg:category: Alignments and Seeding
      sbg:toolDefaultValue: Local
      required: no
    outputs:
    - type:
      - 'null'
      - items: File
        type: array
      label: Unmapped reads
      description: Output of unmapped reads.
      streamable: no
      id: '#unmapped_reads'
      outputBinding:
        glob: '*Unmapped.out*'
      sbg:fileTypes: FASTQ
    - type:
      - 'null'
      - File
      label: Transcriptome alignments
      description: Alignments translated into transcript coordinates.
      streamable: no
      id: '#transcriptome_aligned_reads'
      outputBinding:
        glob: '*Transcriptome*'
      sbg:fileTypes: BAM
    - type:
      - 'null'
      - File
      label: Splice junctions
      description: High confidence collapsed splice junctions in tab-delimited format.
        Only junctions supported by uniquely mapping reads are reported.
      streamable: no
      id: '#splice_junctions'
      outputBinding:
        glob: '*SJ.out.tab'
      sbg:fileTypes: TAB
    - type:
      - 'null'
      - File
      label: Reads per gene
      description: File with number of reads per gene. A read is counted if it overlaps
        (1nt or more) one and only one gene.
      streamable: no
      id: '#reads_per_gene'
      outputBinding:
        glob: '*ReadsPerGene*'
      sbg:fileTypes: TAB
    - type:
      - 'null'
      - items: File
        type: array
      label: Log files
      description: Log files produced during alignment.
      streamable: no
      id: '#log_files'
      outputBinding:
        glob: '*Log*.out'
      sbg:fileTypes: OUT
    - type:
      - 'null'
      - File
      label: Intermediate genome files
      description: Archive with genome files produced when annotations are included
        on the fly (in the mapping step).
      streamable: no
      id: '#intermediate_genome'
      outputBinding:
        glob: '*_STARgenome.tar'
      sbg:fileTypes: TAR
    - type:
      - 'null'
      - File
      label: Chimeric junctions
      description: If chimSegmentMin in 'Chimeric Alignments' section is set to 0,
        'Chimeric Junctions' won't be output.
      streamable: no
      id: '#chimeric_junctions'
      outputBinding:
        glob: '*Chimeric.out.junction'
      sbg:fileTypes: JUNCTION
    - type:
      - 'null'
      - File
      label: Chimeric alignments
      description: Aligned Chimeric sequences SAM - if chimSegmentMin = 0, no Chimeric
        Alignment SAM and Chimeric Junctions outputs.
      streamable: no
      id: '#chimeric_alignments'
      outputBinding:
        glob: '*.Chimeric.out.sam'
      sbg:fileTypes: SAM
    - type:
      - 'null'
      - File
      label: Aligned SAM/BAM
      description: Aligned sequence in SAM/BAM format.
      streamable: no
      id: '#aligned_reads'
      outputBinding:
        glob:
          engine: '#cwl-js-engine'
          script: |-
            {
              if ($job.inputs.outSortingType == 'SortedByCoordinate') {
                sort_name = '.sortedByCoord'
              }
              else {
                sort_name = ''
              }
              if ($job.inputs.outSAMtype == 'BAM') {
                sam_name = "*.Aligned".concat( sort_name, '.out.bam')
              }
              else {
                sam_name = "*.Aligned.out.sam"
              }
              return sam_name
            }
          class: Expression
      sbg:fileTypes: SAM, BAM
    requirements:
    - class: ExpressionEngineRequirement
      id: '#cwl-js-engine'
      requirements:
      - class: DockerRequirement
        dockerPull: rabix/js-engine
    hints:
    - class: DockerRequirement
      dockerPull: images.sbgenomics.com/ana_d/star:2.4.2a
      dockerImageId: a4b0ad2c3cae
    - class: sbg:MemRequirement
      value: 60000
    - class: sbg:CPURequirement
      value: 15
    label: STAR
    description: STAR is an ultrafast universal RNA-seq aligner. It has very high
      mapping speed, accurate alignment of contiguous and spliced reads, detection
      of polyA-tails, non-canonical splices and chimeric (fusion) junctions. It works
      with reads starting from lengths ~15 bases up to ~300 bases. In case of having
      longer reads, use of STAR Long is recommended.
    class: CommandLineTool
    arguments:
    - position: 0
      separate: yes
      valueFrom:
        engine: '#cwl-js-engine'
        script: |-
          {
            file = [].concat($job.inputs.reads)[0].path
            extension = /(?:\.([^.]+))?$/.exec(file)[1]
            if (extension == "gz") {
              return "--readFilesCommand zcat"
            } else if (extension == "bz2") {
              return "--readFilesCommand bzcat"
            }
          }
        class: Expression
    - position: 0
      separate: yes
      valueFrom:
        engine: '#cwl-js-engine'
        script: "{\t\n  var sjFormat = \"False\"\n  var gtfgffFormat = \"False\"\n
          \ var list = $job.inputs.sjdbGTFfile\n  var paths_list = []\n  var joined_paths
          = \"\"\n  \n  if (list) {\n    list.forEach(function(f){return paths_list.push(f.path)})\n
          \   joined_paths = paths_list.join(\" \")\n\n\n    paths_list.forEach(function(f){\n
          \     ext = f.replace(/^.*\\./, '')\n      if (ext == \"gff\" || ext ==
          \"gtf\") {\n        gtfgffFormat = \"True\"\n        return gtfgffFormat\n
          \     }\n      if (ext == \"txt\") {\n        sjFormat = \"True\"\n        return
          sjFormat\n      }\n    })\n\n    if ($job.inputs.sjdbGTFfile && $job.inputs.sjdbInsertSave
          != \"None\") {\n      if (sjFormat == \"True\") {\n        return \"--sjdbFileChrStartEnd
          \".concat(joined_paths)\n      }\n      else if (gtfgffFormat == \"True\")
          {\n        return \"--sjdbGTFfile \".concat(joined_paths)\n      }\n    }\n
          \ }\n}"
        class: Expression
    - position: 0
      separate: yes
      valueFrom:
        engine: '#cwl-js-engine'
        script: "{\n  a = b = c = d = e = f = g = []\n  if ($job.inputs.sjdbGTFchrPrefix)
          {\n    a = [\"--sjdbGTFchrPrefix\", $job.inputs.sjdbGTFchrPrefix]\n  }\n
          \ if ($job.inputs.sjdbGTFfeatureExon) {\n    b = [\"--sjdbGTFfeatureExon\",
          $job.inputs.sjdbGTFfeatureExon]\n  }\n  if ($job.inputs.sjdbGTFtagExonParentTranscript)
          {\n    c = [\"--sjdbGTFtagExonParentTranscript\", $job.inputs.sjdbGTFtagExonParentTranscript]\n
          \ }\n  if ($job.inputs.sjdbGTFtagExonParentGene) {\n    d = [\"--sjdbGTFtagExonParentGene\",
          $job.inputs.sjdbGTFtagExonParentGene]\n  }\n  if ($job.inputs.sjdbOverhang)
          {\n    e = [\"--sjdbOverhang\", $job.inputs.sjdbOverhang]\n  }\n  if ($job.inputs.sjdbScore)
          {\n    f = [\"--sjdbScore\", $job.inputs.sjdbScore]\n  }\n  if ($job.inputs.sjdbInsertSave)
          {\n    g = [\"--sjdbInsertSave\", $job.inputs.sjdbInsertSave]\n  }\n  \n
          \ \n  \n  if ($job.inputs.sjdbInsertSave != \"None\" && $job.inputs.sjdbGTFfile)
          {\n    new_list = a.concat(b, c, d, e, f, g)\n    return new_list.join(\"
          \")\n  }\n}"
        class: Expression
    - position: 0
      separate: yes
      valueFrom:
        engine: '#cwl-js-engine'
        script: |-
          {
            if ($job.inputs.twopassMode == "Basic") {
              return "--twopass1readsN ".concat($job.inputs.twopass1readsN)
            }
          }
        class: Expression
    - position: 0
      separate: yes
      valueFrom:
        engine: '#cwl-js-engine'
        script: |-
          {
            if ($job.inputs.chimOutType == "Within") {
              return "--chimOutType ".concat("Within", $job.inputs.outSAMtype)
            }
            else {
              return "--chimOutType SeparateSAMold"
            }
          }
        class: Expression
    - position: 0
      separate: yes
      valueFrom:
        engine: '#cwl-js-engine'
        script: "{\n  var param_list = []\n  \n  function add_param(key, value){\n
          \   if (value == \"\") {\n      return\n    }\n    else {\n      return
          param_list.push(key.concat(\":\", value))\n    }\n  }\n  \n  add_param('ID',
          \"1\")\n  if ($job.inputs.rg_seq_center) {\n    add_param('CN', $job.inputs.rg_seq_center)\n
          \ } else if ([].concat($job.inputs.reads)[0].metadata.seq_center) {\n    add_param('CN',
          [].concat($job.inputs.reads)[0].metadata.seq_center)\n  }\n  if ($job.inputs.rg_library_id)
          {\n    add_param('LB', $job.inputs.rg_library_id)\n  } else if ([].concat($job.inputs.reads)[0].metadata.library_id)
          {\n    add_param('LB', [].concat($job.inputs.reads)[0].metadata.library_id)\n
          \ }\n  if ($job.inputs.rg_mfl) {\n    add_param('PI', $job.inputs.rg_mfl)\n
          \ } else if ([].concat($job.inputs.reads)[0].metadata.median_fragment_length)
          {\n    add_param('PI', [].concat($job.inputs.reads)[0].metadata.median_fragment_length)\n
          \ }\n  if ($job.inputs.rg_platform) {\n    add_param('PL', $job.inputs.rg_platform.replace(/
          /g,\"_\"))\n  } else if ([].concat($job.inputs.reads)[0].metadata.platform)
          {\n    add_param('PL', [].concat($job.inputs.reads)[0].metadata.platform.replace(/
          /g,\"_\"))\n  }\n  if ($job.inputs.rg_platform_unit_id) {\n    add_param('PU',
          $job.inputs.rg_platform_unit_id)\n  } else if ([].concat($job.inputs.reads)[0].metadata.platform_unit_id)
          {\n    add_param('PU', [].concat($job.inputs.reads)[0].metadata.platform_unit_id)\n
          \ }\n  if ($job.inputs.rg_sample_id) {\n    add_param('SM', $job.inputs.rg_sample_id)\n
          \ } else if ([].concat($job.inputs.reads)[0].metadata.sample_id) {\n    add_param('SM',
          [].concat($job.inputs.reads)[0].metadata.sample_id)\n  }\n  return \"--outSAMattrRGline
          \".concat(param_list.join(\" \"))\n}"
        class: Expression
    - position: 0
      separate: yes
      valueFrom:
        engine: '#cwl-js-engine'
        script: |-
          {
            if ($job.inputs.sjdbGTFfile && $job.inputs.quantMode) {
              return "--quantMode ".concat($job.inputs.quantMode)
            }
          }
        class: Expression
    - position: 100
      separate: yes
      valueFrom:
        engine: '#cwl-js-engine'
        script: "{\n  function sharedStart(array){\n  var A= array.concat().sort(),
          \n      a1= A[0], a2= A[A.length-1], L= a1.length, i= 0;\n  while(i<L &&
          a1.charAt(i)=== a2.charAt(i)) i++;\n  return a1.substring(0, i);\n  }\n
          \ path_list = []\n  arr = [].concat($job.inputs.reads)\n  arr.forEach(function(f){return
          path_list.push(f.path.replace(/\\\\/g,'/').replace( /.*\\//, '' ))})\n  common_prefix
          = sharedStart(path_list)\n  intermediate = common_prefix.replace( /\\-$|\\_$|\\.$/,
          '' ).concat(\"._STARgenome\")\n  source = \"./\".concat(intermediate)\n
          \ destination = intermediate.concat(\".tar\")\n  if ($job.inputs.sjdbGTFfile
          && $job.inputs.sjdbInsertSave && $job.inputs.sjdbInsertSave != \"None\")
          {\n    return \"&& tar -vcf \".concat(destination, \" \", source)\n  }\n}"
        class: Expression
    - position: 0
      prefix: --outFileNamePrefix
      separate: yes
      valueFrom:
        engine: '#cwl-js-engine'
        script: "{\n  function sharedStart(array){\n  var A= array.concat().sort(),
          \n      a1= A[0], a2= A[A.length-1], L= a1.length, i= 0;\n  while(i<L &&
          a1.charAt(i)=== a2.charAt(i)) i++;\n  return a1.substring(0, i);\n  }\n
          \ path_list = []\n  arr = [].concat($job.inputs.reads)\n  arr.forEach(function(f){return
          path_list.push(f.path.replace(/\\\\/g,'/').replace( /.*\\//, '' ))})\n  common_prefix
          = sharedStart(path_list)\n  return \"./\".concat(common_prefix.replace(
          /\\-$|\\_$|\\.$/, '' ), \".\")\n}"
        class: Expression
    - position: 101
      separate: yes
      valueFrom:
        engine: '#cwl-js-engine'
        script: "{\n  function sharedStart(array){\n  var A= array.concat().sort(),
          \n      a1= A[0], a2= A[A.length-1], L= a1.length, i= 0;\n  while(i<L &&
          a1.charAt(i)=== a2.charAt(i)) i++;\n  return a1.substring(0, i);\n  }\n
          \ path_list = []\n  arr = [].concat($job.inputs.reads)\n  arr.forEach(function(f){return
          path_list.push(f.path.replace(/\\\\/g,'/').replace( /.*\\//, '' ))})\n  common_prefix
          = sharedStart(path_list)\n  mate1 = common_prefix.replace( /\\-$|\\_$|\\.$/,
          '' ).concat(\".Unmapped.out.mate1\")\n  mate2 = common_prefix.replace( /\\-$|\\_$|\\.$/,
          '' ).concat(\".Unmapped.out.mate2\")\n  mate1fq = mate1.concat(\".fastq\")\n
          \ mate2fq = mate2.concat(\".fastq\")\n  if ($job.inputs.outReadsUnmapped
          == \"Fastx\" && arr.length > 1) {\n    return \"&& mv \".concat(mate1, \"
          \", mate1fq, \" && mv \", mate2, \" \", mate2fq)\n  }\n  else if ($job.inputs.outReadsUnmapped
          == \"Fastx\" && arr.length == 1) {\n    return \"&& mv \".concat(mate1,
          \" \", mate1fq)\n  }\n}"
        class: Expression
    stdin: ''
    stdout: ''
    successCodes: []
    temporaryFailCodes: []
    x: 624.0
    'y': 323
  sbg:x: 700.0
  sbg:y: 200.0
sbg:canvas_zoom: 0.6
sbg:canvas_y: -16
sbg:canvas_x: -41
sbg:batchInput: '#sjdbGTFfile'
sbg:batchBy:
  type: criteria
  criteria:
  - metadata.sample_id
  - metadata.library_id

When you push your app to the platform, you will see the batch settings available on the task page or in the workflow editor.
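To make the batch criteria concrete, here is a minimal base R sketch of what grouping by metadata.sample_id and metadata.library_id amounts to. The file names and metadata values are hypothetical, and this only illustrates the grouping idea; it is not how the platform implements batching.

```r
# Hypothetical input files carrying the two metadata fields used as batch criteria.
files <- data.frame(
  name       = c("s1_L1_R1.fastq", "s1_L1_R2.fastq", "s1_L2_R1.fastq", "s2_L1_R1.fastq"),
  sample_id  = c("s1", "s1", "s1", "s2"),
  library_id = c("L1", "L1", "L2", "L1"),
  stringsAsFactors = FALSE
)

# Files that share both sample_id and library_id land in the same group.
# drop = TRUE discards combinations that do not occur in the data.
groups <- split(files$name, list(files$sample_id, files$library_id), drop = TRUE)

# One group per distinct (sample_id, library_id) combination.
length(groups)
names(groups)
```

Under this reading, each group would become one task in the batch, so paired files from the same sample and library are processed together.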

Describe Workflow in R

The graphical user interface on the Seven Bridges Platform is a far more convenient way to build a workflow.

Import from a JSON file

Yes, you can use the same convert_app function to import a workflow JSON file.

f1 = system.file("extdata/app", "flow_star.json", package = "sevenbridges")
f1 = convert_app(f1)
## show it
## f1
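If you adapt the import to your own files, it can help to check that the path actually resolves before converting it. The following is a small sketch under that assumption; the variable names flow_path and flow are only illustrative.

```r
# system.file() returns an empty string when the package or file is not found,
# so test the path before handing it to convert_app().
flow_path <- system.file("extdata/app", "flow_star.json", package = "sevenbridges")

if (nzchar(flow_path)) {
  # Import the workflow description as an R object.
  flow <- sevenbridges::convert_app(flow_path)
} else {
  flow <- NULL
  message("flow_star.json not found; is the sevenbridges package installed?")
}
```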

Utilities for Flow object

Just like the Tool object, the Flow object comes with convenient utilities, which are especially useful when you execute a task.

f1 = system.file("extdata/app", "flow_star.json", package = "sevenbridges")
f1 = convert_app(f1)
## input matrix
head(f1$input_matrix())
                               id               label    type required
1                    #sjdbGTFfile         sjdbGTFfile File...    FALSE
2                          #fastq               fastq File...     TRUE
3               #genomeFastaFiles    genomeFastaFiles    File     TRUE
4 #sjdbGTFtagExonParentTranscript Exons' parents name  string    FALSE
5       #sjdbGTFtagExonParentGene           Gene name  string    FALSE
6          #winAnchorMultimapNmax    Max loci anchors     int    FALSE
  fileTypes
1      null
2      null
3      null
4      null
5      null
6      null
## by name
head(f1$input_matrix(c("id", "type", "required")))
                               id    type required
1                    #sjdbGTFfile File...    FALSE
2                          #fastq File...     TRUE
3               #genomeFastaFiles    File     TRUE
4 #sjdbGTFtagExonParentTranscript  string    FALSE
5       #sjdbGTFtagExonParentGene  string    FALSE
6          #winAnchorMultimapNmax     int    FALSE
## return only required
head(f1$input_matrix(required = TRUE))
                 id            label    type required fileTypes
2            #fastq            fastq File...     TRUE      null
3 #genomeFastaFiles genomeFastaFiles    File     TRUE      null
## return everything
head(f1$input_matrix(NULL))
                               id    type required fileTypes
1                    #sjdbGTFfile File...    FALSE      null
2                          #fastq File...     TRUE      null
3               #genomeFastaFiles    File     TRUE      null
4 #sjdbGTFtagExonParentTranscript  string    FALSE      null
5       #sjdbGTFtagExonParentGene  string    FALSE      null
6          #winAnchorMultimapNmax     int    FALSE      null
                label                       category stageInput streamable
1         sjdbGTFfile                           null       null      FALSE
2               fastq                           null       null      FALSE
3    genomeFastaFiles                           null       null      FALSE
4 Exons' parents name Splice junctions db parameters       null      FALSE
5           Gene name Splice junctions db parameters       null      FALSE
6    Max loci anchors      Windows, Anchors, Binning       null      FALSE
   sbg.x    sbg.y sbg.includeInPorts
1 160.50 195.0833                 NA
2 164.25 323.7500               TRUE
3 167.75 469.9999                 NA
4 200.00 350.0000                 NA
5 200.00 400.0000                 NA
6 200.00 450.0000                 NA
                                                description
1                                                      <NA>
2                                                      <NA>
3                                                      <NA>
4         Tag name to be used as exons’ transcript-parents.
5               Tag name to be used as exons’ gene-parents.
6 Max number of loci anchors are allowed to map to (int>0).
  sbg.toolDefaultValue
1                 <NA>
2                 <NA>
3                 <NA>
4        transcript_id
5              gene_id
6                   50
                                                link_to
1 #STAR_Genome_Generate.sjdbGTFfile | #STAR.sjdbGTFfile
2                     #SBG_FASTQ_Quality_Detector.fastq
3                #STAR_Genome_Generate.genomeFastaFiles
4  #STAR_Genome_Generate.sjdbGTFtagExonParentTranscript
5        #STAR_Genome_Generate.sjdbGTFtagExonParentGene
6                           #STAR.winAnchorMultimapNmax
## return an output matrix with more information
head(f1$output_matrix())
                            id                       label    type
1              #unmapped_reads              unmapped_reads File...
2 #transcriptome_aligned_reads transcriptome_aligned_reads    File
3            #splice_junctions            splice_junctions    File
4              #reads_per_gene              reads_per_gene    File
5                   #log_files                   log_files File...
6          #chimeric_junctions          chimeric_junctions    File
  fileTypes
1      null
2      null
3      null
4      null
5      null
6      null
## return only a few fields
head(f1$output_matrix(c("id", "type")))
                            id    type
1              #unmapped_reads File...
2 #transcriptome_aligned_reads    File
3            #splice_junctions    File
4              #reads_per_gene    File
5                   #log_files File...
6          #chimeric_junctions    File
## return everything
head(f1$output_matrix(NULL))
                            id                       label    type
1              #unmapped_reads              unmapped_reads File...
2 #transcriptome_aligned_reads transcriptome_aligned_reads    File
3            #splice_junctions            splice_junctions    File
4              #reads_per_gene              reads_per_gene    File
5                   #log_files                   log_files File...
6          #chimeric_junctions          chimeric_junctions    File
  fileTypes required                            source streamable
1      null    FALSE              #STAR.unmapped_reads      FALSE
2      null    FALSE #STAR.transcriptome_aligned_reads      FALSE
3      null    FALSE            #STAR.splice_junctions      FALSE
4      null    FALSE              #STAR.reads_per_gene      FALSE
5      null    FALSE                   #STAR.log_files      FALSE
6      null    FALSE          #STAR.chimeric_junctions      FALSE
  sbg.includeInPorts     sbg.x     sbg.y                           link_to
1               TRUE  766.2498 159.58331              #STAR.unmapped_reads
2               TRUE 1118.9998  86.58332 #STAR.transcriptome_aligned_reads
3               TRUE 1282.3330 167.49998            #STAR.splice_junctions
4               TRUE 1394.4164 245.74996              #STAR.reads_per_gene
5               TRUE 1505.0830 322.99995                   #STAR.log_files
6               TRUE 1278.7498 446.74996          #STAR.chimeric_junctions
## flow inputs
f1$input_type()
                   sjdbGTFfile                          fastq 
                     "File..."                      "File..." 
              genomeFastaFiles sjdbGTFtagExonParentTranscript 
                        "File"                       "string" 
      sjdbGTFtagExonParentGene          winAnchorMultimapNmax 
                      "string"                          "int" 
            winAnchorDistNbins 
                         "int" 
## flow outputs
f1$output_type()
             unmapped_reads transcriptome_aligned_reads 
                  "File..."                      "File" 
           splice_junctions              reads_per_gene 
                     "File"                      "File" 
                  log_files          chimeric_junctions 
                  "File..."                      "File" 
        intermediate_genome         chimeric_alignments 
                     "File"                      "File" 
                 sorted_bam                      result 
                     "File"                      "File" 
## list tools
f1$list_tool()
                       label
1       STAR Genome Generate
2 SBG FASTQ Quality Detector
3             Picard SortSam
4                       STAR
                                                  sbgid
1       sevenbridges/public-apps/star-genome-generate/1
2 sevenbridges/public-apps/sbg-fastq-quality-detector/3
3       sevenbridges/public-apps/picard-sortsam-1-140/2
4                       sevenbridges/public-apps/star/4
                           id
1       #STAR_Genome_Generate
2 #SBG_FASTQ_Quality_Detector
3             #Picard_SortSam
4                       #STAR
## f1$get_tool("STAR")

There are more utilities; please check the examples in help(Flow).

Create your own flow in R

Introduction

To create a workflow, we provide a simple interface for piping your tools into a single workflow. It works in situations such as:

  • Simple linear tool connection and chaining
  • Connect flow with one or more tools

Note: to build a complicated workflow, I highly recommend using our graphical interface; there is no better way.

Connect simple linear tools

Let’s create tools from scratch to perform a simple task:

  1. Tool 1 outputs 100 random numbers
  2. Tool 2 takes the log of them
  3. Tool 3 calculates the mean of everything
library(sevenbridges)
## A tool that generates 100 random numbers
t1 <- Tool(id = "runif new test 3", label = "random number",
           hints = requirements(docker(pull = "rocker/r-base")),
           baseCommand = "Rscript -e 'x = runif(100); write.csv(x, file = \"random.txt\", row.names = FALSE)'", 
           outputs = output(id = "random", 
                            type = "file", 
                            glob = "random.txt"))

## A tool that takes the log
fd <- fileDef(name = "log.R",
              content = "args = commandArgs(TRUE)
                         x = read.table(args[1], header = TRUE)[,'x']
                         x = log(x)
                         write.csv(x, file = 'random_log.txt', row.names = FALSE)
                         ")

t2 <- Tool(id = "log new test 3", label = "get log",
           hints = requirements(docker(pull = "rocker/r-base")),
           requirements = requirements(fd),
           baseCommand = "Rscript log.R", 
           inputs = input(id = "number",
                           type = "file"),
           outputs = output(id = "log", 
                            type = "file", 
                            glob = "*.txt"))

## A tool that computes the mean
fd <- fileDef(name = "mean.R",
              content = "args = commandArgs(TRUE)
                         x = read.table(args[1], header = TRUE)[,'x']
                         x = mean(x)
                         write.csv(x, file = 'random_mean.txt', row.names = FALSE)")

t3 <- Tool(id = "mean new test 3", label = "get mean",
           hints = requirements(docker(pull = "rocker/r-base")),
           requirements = requirements(fd),
           baseCommand = "Rscript mean.R", 
           inputs = input(id = "number",
                           type = "file"),
           outputs = output(id = "mean", 
                            type = "file", 
                            glob = "*.txt"))
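
Before pushing anything to the platform, it can help to run the same three steps as plain R on your own machine. The sketch below is not part of the sevenbridges API; it only mirrors the script bodies of the three tools in a temporary directory. The variable names (work_dir, random_file and so on) are made up for illustration, and set.seed is added only so the run is repeatable.

```r
## Plain-R dry run of the three steps (illustration only, no sevenbridges calls)
work_dir <- tempfile("flow_dry_run_")
dir.create(work_dir)

random_file <- file.path(work_dir, "random.txt")
log_file    <- file.path(work_dir, "random_log.txt")
mean_file   <- file.path(work_dir, "random_mean.txt")

## Step 1: same R expression as tool 1 (seed added for repeatability)
set.seed(1)
x <- runif(100)
original <- x
write.csv(x, file = random_file, row.names = FALSE)

## Step 2: same body as log.R, reading the file written by step 1
x <- read.table(random_file, header = TRUE)[, "x"]
x <- log(x)
write.csv(x, file = log_file, row.names = FALSE)

## Step 3: same body as mean.R, reading the file written by step 2
x <- read.table(log_file, header = TRUE)[, "x"]
x <- mean(x)
write.csv(x, file = mean_file, row.names = FALSE)

## Read back the final result: a single number
result <- read.table(mean_file, header = TRUE)[, "x"]
```

If result is a single finite number, the script bodies agree on the file layout (one column named x), which is what the chained tools rely on.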


f = t1 %>>% t2
flow_output: #get_log.log
f = link(t1, t2, "#random", "#number")
flow_output: #get_log.log
## you cannot directly copy and paste this flow;
## please use the API to push it, and we will register each tool for you.

# library(clipr)
# write_clip(f$toJSON(pretty = TRUE))


t2 <- Tool(id = "log new test 3", label = "get log",
           hints = requirements(docker(pull = "rocker/r-base")),
           ## requirements = requirements(fd),
           baseCommand = "Rscript log.R", 
           inputs = input(id = "number",
                           type = "file",
                          secondaryFiles = sevenbridges:::set_box(".bai")),
           outputs = output(id = "log", 
                            type = "file", 
                            glob = "*.txt"))

# library(clipr)
# write_clip(t2$toJSON(pretty = TRUE))

Note: this workflow contains tools that do not exist on the platform, so if you copy and paste the JSON directly into the GUI, it won’t work properly. A simple alternative is to push your app to the platform via the API. This adds the new tools to your project one by one before adding the workflow app itself. If you connect two tools that you know already exist on the platform, you don’t need to do this.

## auto-check tool info and push new tools
p$app_add("new_flow_log", f)

Connecting tools by input and output id

Now let’s connect two tools:

  1. unpacking a compressed fastq
  2. STAR aligner

Checking potential mappings is easy with the function link_what, which prints the matching inputs and outputs. The generic function link then lets you connect two Tool objects.

If you don’t specify which inputs and outputs to expose at the flow level for the new Flow object, it will expose all available ones and print a message. Otherwise, please provide the flow_input and flow_output parameters with full ids.
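
To make the idea of "matched inputs and outputs" concrete, here is a minimal base-R sketch that pairs ports by type. It is an assumption about the general idea, not the package's actual implementation of link_what; the data frames from_outputs and to_inputs are hypothetical stand-ins whose ids and types are taken from the unpack-FASTQ and STAR example in this section.

```r
## Hypothetical stand-ins for the ports of two tools (not real Tool objects)
from_outputs <- data.frame(id = "#output_fastq_files",
                           type = "File...",
                           stringsAsFactors = FALSE)
to_inputs <- data.frame(id = c("#reads", "#sjdbGTFfile",
                               "#genome", "#winAnchorMultimapNmax"),
                        type = c("File...", "File...", "File", "int"),
                        stringsAsFactors = FALSE)

## Candidate links: every output/input pair that shares the same type
candidates <- merge(from_outputs, to_inputs, by = "type",
                    suffixes = c(".from", ".to"))
candidates
```

Only the two File... inputs survive the match; picking between them (here, reads rather than sjdbGTFfile) is still up to you when you call link.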

t1 = system.file("extdata/app", "tool_unpack_fastq.json", 
                 package = "sevenbridges")
t2 = system.file("extdata/app", "tool_star.json", 
                 package = "sevenbridges")
t1 = convert_app(t1)
t2 = convert_app(t2)
## check possible link
link_what(t1, t2)
$File...
$File...$from
                   id              label    type fileTypes
1 #output_fastq_files Output FASTQ files File...     FASTQ
           full.name
1 #SBG_Unpack_FASTQs

$File...$to
             id                label    type required prefix
1        #reads        Read sequence File...     TRUE   <NA>
95 #sjdbGTFfile Splice junction file File...    FALSE   <NA>
                                                  fileTypes full.name
1  FASTA, FASTQ, FA, FQ, FASTQ.GZ, FQ.GZ, FASTQ.BZ2, FQ.BZ2     #STAR
95                                            GTF, GFF, TXT     #STAR
## link
f1 = link(t1, t2, "output_fastq_files", "reads")
flow_input: #SBG_Unpack_FASTQs.input_archive_file / #STAR.sjdbGTFfile / #STAR.genome
flow_output: #STAR.aligned_reads / #STAR.transcriptome_aligned_reads / #STAR.reads_per_gene / #STAR.log_files / #STAR.splice_junctions / #STAR.chimeric_junctions / #STAR.unmapped_reads / #STAR.intermediate_genome / #STAR.chimeric_alignments
## list the full output and input ids
t1$output_id(TRUE)
                                File... 
"#SBG_Unpack_FASTQs.output_fastq_files" 
t2$input_id(TRUE)
                                 File... 
                           "#STAR.reads" 
                                    enum 
              "#STAR.readMatesLengthsIn" 
                                     int 
                   "#STAR.readMapNumber" 
                                     int 
               "#STAR.limitOutSJoneRead" 
                                     int 
             "#STAR.limitOutSJcollapsed" 
                                    enum 
                "#STAR.outReadsUnmapped" 
                                     int 
              "#STAR.outQSconversionAdd" 
                                    enum 
                      "#STAR.outSAMtype" 
                                    enum 
                  "#STAR.outSortingType" 
                                    enum 
                      "#STAR.outSAMmode" 
                                    enum 
               "#STAR.outSAMstrandField" 
                                    enum 
                "#STAR.outSAMattributes" 
                                    enum 
                  "#STAR.outSAMunmapped" 
                                    enum 
                     "#STAR.outSAMorder" 
                                    enum 
               "#STAR.outSAMprimaryFlag" 
                                    enum 
                    "#STAR.outSAMreadID" 
                                     int 
                "#STAR.outSAMmapqUnique" 
                                     int 
                    "#STAR.outSAMflagOR" 
                                     int 
                   "#STAR.outSAMflagAND" 
                                  string 
                  "#STAR.outSAMheaderHD" 
                                  string 
                  "#STAR.outSAMheaderPG" 
                                  string 
                   "#STAR.rg_seq_center" 
                                  string 
                   "#STAR.rg_library_id" 
                                  string 
                          "#STAR.rg_mfl" 
                                    enum 
                     "#STAR.rg_platform" 
                                  string 
             "#STAR.rg_platform_unit_id" 
                                  string 
                    "#STAR.rg_sample_id" 
                                    enum 
                   "#STAR.outFilterType" 
                                     int 
     "#STAR.outFilterMultimapScoreRange" 
                                     int 
           "#STAR.outFilterMultimapNmax" 
                                     int 
           "#STAR.outFilterMismatchNmax" 
                                   float 
      "#STAR.outFilterMismatchNoverLmax" 
                                   float 
  "#STAR.outFilterMismatchNoverReadLmax" 
                                     int 
               "#STAR.outFilterScoreMin" 
                                   float 
      "#STAR.outFilterScoreMinOverLread" 
                                     int 
              "#STAR.outFilterMatchNmin" 
                                   float 
     "#STAR.outFilterMatchNminOverLread" 
                                    enum 
           "#STAR.outFilterIntronMotifs" 
                                    enum 
                "#STAR.outSJfilterReads" 
                                  int... 
          "#STAR.outSJfilterOverhangMin" 
                                  int... 
       "#STAR.outSJfilterCountUniqueMin" 
                                  int... 
        "#STAR.outSJfilterCountTotalMin" 
                                  int... 
     "#STAR.outSJfilterDistToOtherSJmin" 
                                  int... 
     "#STAR.outSJfilterIntronMaxVsReadN" 
                                     int 
                        "#STAR.scoreGap" 
                                     int 
                  "#STAR.scoreGapNoncan" 
                                     int 
                    "#STAR.scoreGapGCAG" 
                                     int 
                    "#STAR.scoreGapATAC" 
                                   float 
     "#STAR.scoreGenomicLengthLog2scale" 
                                     int 
                    "#STAR.scoreDelOpen" 
                                     int 
                    "#STAR.scoreDelBase" 
                                     int 
                    "#STAR.scoreInsOpen" 
                                     int 
                    "#STAR.scoreInsBase" 
                                     int 
              "#STAR.scoreStitchSJshift" 
                                     int 
             "#STAR.seedSearchStartLmax" 
                                   float 
    "#STAR.seedSearchStartLmaxOverLread" 
                                     int 
                  "#STAR.seedSearchLmax" 
                                     int 
                "#STAR.seedMultimapNmax" 
                                     int 
                 "#STAR.seedPerReadNmax" 
                                     int 
               "#STAR.seedPerWindowNmax" 
                                     int 
           "#STAR.seedNoneLociPerWindow" 
                                     int 
                  "#STAR.alignIntronMin" 
                                     int 
                  "#STAR.alignIntronMax" 
                                     int 
                "#STAR.alignMatesGapMax" 
                                     int 
              "#STAR.alignSJoverhangMin" 
                                     int 
            "#STAR.alignSJDBoverhangMin" 
                                     int 
         "#STAR.alignSplicedMateMapLmin" 
                                   float 
"#STAR.alignSplicedMateMapLminOverLmate" 
                                   float 
         "#STAR.alignWindowsPerReadNmax" 
                                     int 
   "#STAR.alignTranscriptsPerWindowNmax" 
                                     int 
     "#STAR.alignTranscriptsPerReadNmax" 
                                    enum 
                   "#STAR.alignEndsType" 
                                    enum 
    "#STAR.alignSoftClipAtReferenceEnds" 
                                     int 
           "#STAR.winAnchorMultimapNmax" 
                                     int 
                     "#STAR.winBinNbits" 
                                     int 
              "#STAR.winAnchorDistNbins" 
                                     int 
                   "#STAR.winFlankNbins" 
                                     int 
                  "#STAR.chimSegmentMin" 
                                     int 
                    "#STAR.chimScoreMin" 
                                     int 
                "#STAR.chimScoreDropMax" 
                                     int 
             "#STAR.chimScoreSeparation" 
                                     int 
        "#STAR.chimScoreJunctionNonGTAG" 
                                     int 
         "#STAR.chimJunctionOverhangMin" 
                                    enum 
                       "#STAR.quantMode" 
                                     int 
                  "#STAR.twopass1readsN" 
                                    enum 
                     "#STAR.twopassMode" 
                                  string 
                   "#STAR.genomeDirName" 
                                    enum 
                  "#STAR.sjdbInsertSave" 
                                  string 
                "#STAR.sjdbGTFchrPrefix" 
                                  string 
              "#STAR.sjdbGTFfeatureExon" 
                                  string 
  "#STAR.sjdbGTFtagExonParentTranscript" 
                                  string 
        "#STAR.sjdbGTFtagExonParentGene" 
                                     int 
                    "#STAR.sjdbOverhang" 
                                     int 
                       "#STAR.sjdbScore" 
                                 File... 
                     "#STAR.sjdbGTFfile" 
                                  int... 
                    "#STAR.clip3pNbases" 
                                  int... 
                    "#STAR.clip5pNbases" 
                               string... 
                "#STAR.clip3pAdapterSeq" 
                                float... 
                "#STAR.clip3pAdapterMMp" 
                                  int... 
        "#STAR.clip3pAfterAdapterNbases" 
                                    enum 
                     "#STAR.chimOutType" 
                                    File 
                          "#STAR.genome" 
                                     int 
              "#STAR.limitSjdbInsertNsj" 
                                    enum 
           "#STAR.quantTranscriptomeBan" 
                                     int 
                 "#STAR.limitBAMsortRAM" 
f2 = link(t1, t2, "output_fastq_files", "reads",
          flow_input = "#SBG_Unpack_FASTQs.input_archive_file",
          flow_output = "#STAR.log_files")
flow_input: #SBG_Unpack_FASTQs.input_archive_file / #STAR.genome
flow_output: #STAR.log_files
# library(clipr)
# write_clip(f2$toJSON())

Connecting tool with workflow by input and output id

tool.in = system.file("extdata/app", "tool_unpack_fastq.json", package = "sevenbridges")
flow.in = system.file("extdata/app", "flow_star.json", package = "sevenbridges")

t1 = convert_app(tool.in)
f2 = convert_app(flow.in)
## consult the link map first
f2$link_map()
                                                     id
1  #STAR_Genome_Generate.sjdbGTFtagExonParentTranscript
2        #STAR_Genome_Generate.sjdbGTFtagExonParentGene
3                     #STAR_Genome_Generate.sjdbGTFfile
4                #STAR_Genome_Generate.genomeFastaFiles
5                     #SBG_FASTQ_Quality_Detector.fastq
6                             #Picard_SortSam.input_bam
7                           #STAR.winAnchorMultimapNmax
8                              #STAR.winAnchorDistNbins
9                                     #STAR.sjdbGTFfile
10                                          #STAR.reads
11                                         #STAR.genome
12                                      #unmapped_reads
13                         #transcriptome_aligned_reads
14                                    #splice_junctions
15                                      #reads_per_gene
16                                           #log_files
17                                  #chimeric_junctions
18                                 #intermediate_genome
19                                 #chimeric_alignments
20                                          #sorted_bam
21                                              #result
                               source   type
1     #sjdbGTFtagExonParentTranscript  input
2           #sjdbGTFtagExonParentGene  input
3                        #sjdbGTFfile  input
4                   #genomeFastaFiles  input
5                              #fastq  input
6                 #STAR.aligned_reads  input
7              #winAnchorMultimapNmax  input
8                 #winAnchorDistNbins  input
9                        #sjdbGTFfile  input
10 #SBG_FASTQ_Quality_Detector.result  input
11       #STAR_Genome_Generate.genome  input
12               #STAR.unmapped_reads output
13  #STAR.transcriptome_aligned_reads output
14             #STAR.splice_junctions output
15               #STAR.reads_per_gene output
16                    #STAR.log_files output
17           #STAR.chimeric_junctions output
18          #STAR.intermediate_genome output
19          #STAR.chimeric_alignments output
20         #Picard_SortSam.sorted_bam output
21 #SBG_FASTQ_Quality_Detector.result output
## then link

f3 = link(t1, f2, c("output_fastq_files"), c("#SBG_FASTQ_Quality_Detector.fastq"))

link_what(f2, t1)
$File
$File$from
                             id                       label type required
2  #transcriptome_aligned_reads transcriptome_aligned_reads File    FALSE
3             #splice_junctions            splice_junctions File    FALSE
4               #reads_per_gene              reads_per_gene File    FALSE
6           #chimeric_junctions          chimeric_junctions File    FALSE
7          #intermediate_genome         intermediate_genome File    FALSE
8          #chimeric_alignments         chimeric_alignments File    FALSE
9                   #sorted_bam                  sorted_bam File    FALSE
10                      #result                      result File    FALSE
   fileTypes                            link_to
2       null  #STAR.transcriptome_aligned_reads
3       null             #STAR.splice_junctions
4       null               #STAR.reads_per_gene
6       null           #STAR.chimeric_junctions
7       null          #STAR.intermediate_genome
8       null          #STAR.chimeric_alignments
9       null         #Picard_SortSam.sorted_bam
10      null #SBG_FASTQ_Quality_Detector.result

$File$to
                   id              label type required
1 #input_archive_file Input archive file File     TRUE
                prefix                                      fileTypes
1 --input_archive_file TAR, TAR.GZ, TGZ, TAR.BZ2, TBZ2,  GZ, BZ2, ZIP
f4 = link(f2, t1, c("#Picard_SortSam.sorted_bam", "#SBG_FASTQ_Quality_Detector.result"), c("#input_archive_file", "#input_archive_file"))
flow_input: #SBG_Unpack_FASTQs.input_archive_file
flow_output: #SBG_Unpack_FASTQs.output_fastq_files
## todo 
## all outputs
## flow + flow
## print message when name wrong
# library(clipr)
# write_clip(f4$toJSON())

Using pipe to construct complicated workflow

Execution

Execute the tool and flow in the cloud

With the API functions, you can load your Tool directly into your account and run a task. For a how-to, please check the complete API guide.

Here is a quick demo:

a = Auth(url = "api_url", token = "your_token")
p = a$project("demo")
app.runif = p$app_add("runif555", rbx)
aid = app.runif$id
tsk = p$task_add(name = "Draft runif simple", 
           description = "Description for runif", 
           app = aid,
           inputs = list(min = 1, max = 10))
tsk$run()

Execute the tool in Rabix - test locally

1. from CLI
While developing tools, it is useful to test them locally first. For that we can use rabix (reproducible analyses for bioinformatics), https://github.com/rabix. To test your tool with the latest Java implementation of rabix (called bunny), you can use the docker image tengfei/testenv:

docker pull tengfei/testenv

Dump your rabix tool as JSON into a directory that also contains the input data: write(rbx$toJSON(), file = "<data_dir>/<tool>.json"). Create an inputs.json file in the same directory to declare the input parameters (you can use paths relative to inputs.json to refer to the data). Then create the container:

docker run --privileged --name bunny -v </path/to/data_dir>:/bunny_data -dit tengfei/testenv

Execute the tool:

docker exec bunny bash -c 'cd /opt/bunny && ./rabix.sh -e /bunny_data /bunny_data/<tool>.json /bunny_data/inputs.json'

You’ll see the running logs from within the container, and the output directory will also appear inside the data directory on your host system.

NOTE: tengfei/testenv has R, Python, Java and so on, so many tools can work without a docker requirement set. If you do set a docker requirement, however, you need to pull the image inside the container first, so that the tool’s docker container can run inside the running bunny container.
NOTE: inputs.json can also be inputs.yaml if you find it easier to declare inputs in YAML.
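
The inputs.json file mentioned above can also be written from R. The sketch below assumes the jsonlite package is installed and assumes that the job file uses the tool's input ids (without the leading #) as keys; check the rabix documentation for the exact layout your version expects. The directory data_dir is a temporary stand-in for your real data directory.

```r
library(jsonlite)

## Stand-in for <data_dir>; replace with the directory you mount into the container
data_dir <- tempfile("bunny_data_")
dir.create(data_dir)

## Scalar parameters for the runif tool (assumed keys: input ids without "#")
params <- list(number = 3, max = 5)

## auto_unbox = TRUE writes 3 rather than [3] for length-one values
inputs_path <- file.path(data_dir, "inputs.json")
write_json(params, inputs_path, auto_unbox = TRUE, pretty = TRUE)

## Print the file to check what was written
cat(readLines(inputs_path), sep = "\n")
```

File inputs are not covered here; in CWL job files they are usually objects with a class and a path rather than bare strings.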

2. from R

library(sevenbridges)

in.df <- data.frame(id = c("number", "min", "max", "seed"),
                    description = c("number of observation", 
                                    "lower limits of the distribution",
                                    "upper limits of the distribution",
                                    "seed with set.seed"),
                    type = c("integer", "float", "float", "float"),
                    label = c("number" ,"min", "max", "seed"), 
                    prefix = c("--n", "--min", "--max", "--seed"),
                    default = c(1, 0, 10, 123), 
                    required = c(TRUE, FALSE, FALSE, FALSE))
out.df <- data.frame(id = c("random", "report"),
                     type = c("file", "file"),
                     glob = c("*.txt", "*.html"))
rbx <- Tool(id = "runif",
            label = "Random number generator",
            hints = requirements(docker(pull = "tengfei/runif"), 
                                 cpu(1), mem(2000)),
            baseCommand = "runif.R",
            inputs = in.df, ## or ins.df
            outputs = out.df)
params <- list(number=3, max=5)

set_test_env("tengfei/testenv", "mount_dir")
test_tool(rbx, params)