avworkflows {AnVIL} | R Documentation |
`avworkflows()` returns a tibble summarizing available workflows.
`avworkflow_jobs()` returns a tibble summarizing submitted workflow jobs for a namespace and name.
`avworkflow_files()` returns a tibble containing information and file paths to workflow outputs.
`avworkflow_localize()` creates or synchronizes a local copy of files with files stored in the workspace bucket and produced by the workflow.
`avworkflow_configuration_template()` returns a template for defining workflow configurations. This template can be used as a starting point for providing a custom configuration.
`avworkflow_configuration()` returns a list structure describing an existing workflow configuration.
`avworkflow_import_configuration()` updates an existing configuration, e.g., changing inputs to the workflow.
avworkflows(namespace = avworkspace_namespace(), name = avworkspace_name())

avworkflow_jobs(namespace = avworkspace_namespace(), name = avworkspace_name())

avworkflow_files(submissionId = NULL, bucket = avbucket())

avworkflow_localize(
  submissionId = NULL,
  destination = NULL,
  type = c("control", "output", "all"),
  bucket = avbucket(),
  dry = TRUE
)

avworkflow_configuration_template()

avworkflow_configuration(
  configuration_namespace,
  configuration_name,
  namespace = avworkspace_namespace(),
  name = avworkspace_name()
)

avworkflow_import_configuration(
  config,
  namespace = avworkspace_namespace(),
  name = avworkspace_name()
)
namespace |
character(1) AnVIL workspace namespace as returned
by, e.g., |
name |
character(1) AnVIL workspace name as returned by, e.g.,
|
submissionId |
a character() of workflow submission ids, or a
tibble with column |
bucket |
character(1) name of the google bucket in which the
workflow products are available, as |
destination |
character(1) file path to the location where
files will be synchronized. For directories in the current
working directory, be sure to prepend with |
type |
character(1) copy |
dry |
logical(1) when |
configuration_namespace |
character(1) namespace of the
workflow. Often the same as the namespace of the
workspace. Discover configuration namespace and name
information from |
configuration_name |
character(1) name of the workflow, from
|
config |
a named list describing the full configuration, e.g.,
created from editing the return value of
|
For `avworkflow_files()`, the `submissionId` is the identifier associated with the workflow job, and is present in the return value of `avworkflow_jobs()`; the example illustrates how the first row of `avworkflow_jobs()` (i.e., the most recently completed workflow) can be used as input to `avworkflow_files()`. When `submissionId` is not provided, the return value is for the most recently submitted workflow of the namespace and name of `avworkspace()`.
For `avworkflow_localize()`, `type = "control"` files summarize workflow progress; they can be numerous but are frequently small and quickly synchronized. `type = "output"` files are the output products of the workflow stored in the workspace bucket. Depending on the workflow, outputs may be large, e.g., aligned reads in bam files. See `gsutil_cp()` to copy individual files from the bucket to the local drive.
`avworkflow_localize()` treats `submissionId=` in the same way as `avworkflow_files()`: when missing, files from the most recent workflow job are candidates for localization.
`avworkflows()` returns a tibble. Each workflow is in a 'namespace' and has a 'name', as illustrated in the example. Columns are
name: workflow name.
namespace: workflow namespace (often the same as the workspace namespace).
rootEntityType: name of the `avtable()` used to retrieve inputs.
methodRepoMethod.methodUri: source of the method, e.g., a dockstore URI.
methodRepoMethod.sourceRepo: source repository, e.g., dockstore.
methodRepoMethod.methodPath: path to method, e.g., a dockstore method might reference a github repository.
methodRepoMethod.methodVersion: the version of the method, e.g., 'main' branch of a github repository.
`avworkflow_jobs()` returns a tibble, sorted by `submissionDate`, with columns
submissionId character() job identifier from the workflow runner.
submitter character() AnVIL user id of individual submitting the job.
submissionDate POSIXct() date (in local time zone) of job submission.
status character() job status, with values 'Accepted', 'Evaluating', 'Submitting', 'Submitted', 'Aborting', 'Aborted', 'Done'.
succeeded integer() number of workflows succeeding.
failed integer() number of workflows failing.
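As a minimal sketch of how these columns might be used, the following builds a mock table with the documented column names and picks the most recent submission with status 'Done'. All values are invented for illustration and are not real AnVIL output; on AnVIL, `jobs` would instead come from `avworkflow_jobs()`.

```r
## Mock stand-in for avworkflow_jobs(); all values are hypothetical.
jobs <- data.frame(
    submissionId = c("id-3", "id-2", "id-1"),
    submitter = "user@example.org",
    submissionDate = as.POSIXct(
        c("2021-03-03 10:00:00", "2021-03-02 10:00:00", "2021-03-01 10:00:00"),
        tz = "UTC"
    ),
    status = c("Submitted", "Done", "Done"),
    succeeded = c(0L, 1L, 1L),
    failed = c(0L, 0L, 0L),
    stringsAsFactors = FALSE
)

## Keep completed jobs, then order explicitly by submission date
## rather than rely on the row order of the input.
done <- jobs[jobs$status == "Done", , drop = FALSE]
done <- done[order(done$submissionDate, decreasing = TRUE), , drop = FALSE]
latest_done_id <- done$submissionId[1]

## On AnVIL, the id could then be passed on, e.g.,
## avworkflow_files(submissionId = latest_done_id)
```

Filtering on `status` before taking the first row avoids selecting a job that is still running.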
`avworkflow_files()` returns a tibble with columns
file: character() 'base name' of the file in the bucket.
workflow: character() name of the workflow the file is associated with.
task: character() name of the task in the workflow that generated the file.
path: character() full path to the file in the google bucket.
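The sketch below shows one way the `file` and `path` columns might be combined to pick out particular outputs. The table is a mock with hypothetical bucket, workflow, and file names; on AnVIL, `files` would come from `avworkflow_files()`.

```r
## Mock stand-in for avworkflow_files(); bucket and file names are
## hypothetical.
files <- data.frame(
    file = c("sample1.bam", "sample1.bam.bai", "stderr"),
    workflow = "hypothetical_workflow",
    task = c("align", "align", "align"),
    path = paste0(
        "gs://hypothetical-bucket/submission/align/",
        c("sample1.bam", "sample1.bam.bai", "stderr")
    ),
    stringsAsFactors = FALSE
)

## Select BAM files by the 'base name' in the file column; the pattern
## is anchored so that the .bam.bai index is not matched.
is_bam <- grepl("\\.bam$", files$file)
bam_paths <- files$path[is_bam]

## On AnVIL, an individual file could then be copied with, e.g.,
## gsutil_cp(bam_paths[1], "./sample1.bam")
```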
`avworkflow_localize()` prints a message indicating the number of files that are (if `dry = FALSE`) or would be localized. If no files require localization (i.e., local files are not older than the bucket files), then no files are localized. `avworkflow_localize()` returns a tibble of file name and bucket path of files to be synchronized.
`avworkflow_configuration_template()` returns a list providing a template for configuration lists, with the following structure:
namespace character(1) configuration namespace.
name character(1) configuration name.
rootEntityType character(1) or missing. The name of the table (from `avtables()`) containing the entities referenced in inputs, etc., by the keyword 'this.'
prerequisites named list (possibly empty) of prerequisites.
inputs named list (possibly empty) of inputs. Form of input depends on method, and might include, e.g., a reference to a field in a table referenced by `avtables()` or a character string defining an input constant.
outputs named list (possibly empty) of outputs.
methodConfigVersion integer(1) identifier for the method configuration.
methodRepoMethod named list describing the method, with character(1) elements described in the return value for `avworkflows()`.
methodUri
sourceRepo
methodPath
methodVersion. The REST specification indicates that this has type `integer`, but the documentation indicates either `integer` or `string`.
deleted logical(1) of uncertain purpose.
The exact format of the configuration is important.
One common problem is that a scalar character vector `"bar"` is interpreted as a json 'array' `["bar"]` rather than a json string `"bar"`. Enclose the string with `jsonlite::unbox("bar")` in the configuration list if the length 1 character vector in R is to be interpreted as a json string. A second problem is that AnVIL requires the string itself to be quoted, so an unboxed but unquoted character string `unbox("foo")` is not accepted. This is reported as a `warning()` about invalid inputs or outputs, and the solution is to provide a quoted string `unbox('"foo"')`.
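The difference between the three forms can be seen by serializing a small list with `jsonlite::toJSON()`. This is only an illustration of the json produced by jsonlite; the element name `foo` is a placeholder, and how AnVIL serializes the configuration internally is not shown here.

```r
library(jsonlite)

## A length 1 character vector is serialized as a json array.
as_array <- toJSON(list(foo = "bar"))

## unbox() marks the value as a scalar, giving a json string.
as_string <- toJSON(list(foo = unbox("bar")))

## Embedding quotes in the R string gives the quoted form described
## above; the inner quotes are escaped in the json.
as_quoted <- toJSON(list(foo = unbox('"bar"')))

cat(as_array, as_string, as_quoted, sep = "\n")
## {"foo":["bar"]}
## {"foo":"bar"}
## {"foo":"\"bar\""}
```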
`avworkflow_configuration()` returns a list structure describing the configuration. See `avworkflow_configuration_template()` for the structure of a typical workflow.
`avworkflow_import_configuration()` returns an object describing the updated configuration. The return value includes invalid or unused elements of the `config` input. Invalid or unused elements of `config` are also reported as a warning.
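A typical edit changes one element of `inputs` and leaves the rest of the list untouched. The sketch below does this on a mock configuration; the namespace, workflow name, and input name `workflow.input_a` are hypothetical, and only a few of the template elements are included. On AnVIL, `config` would come from `avworkflow_configuration()`.

```r
## Mock configuration following the template structure; every value
## here is hypothetical.
config <- list(
    namespace = "hypothetical-namespace",
    name = "hypothetical-workflow",
    inputs = list(
        "workflow.input_a" = jsonlite::unbox('"old"')
    ),
    outputs = list()
)

## Replace one input with a quoted, unboxed string, keeping any other
## inputs unchanged.
new_inputs <- list("workflow.input_a" = jsonlite::unbox('"new"'))
config$inputs <- modifyList(config$inputs, new_inputs)

## On AnVIL, the edited list could then be submitted with, e.g.,
## avworkflow_import_configuration(config)
```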
library(AnVIL)
library(dplyr)  ## assumed source of %>% and select() used below

if (gcloud_exists() && nzchar(avworkspace_name()))
    ## from within AnVIL
    avworkflows() %>% select(namespace, name)

if (gcloud_exists() && nzchar(avworkspace_name()))
    ## from within AnVIL
    avworkflow_jobs()

if (gcloud_exists() && nzchar(avworkspace_name())) {
    ## e.g., from within AnVIL
    avworkflow_jobs() %>%
        ## select most recent workflow
        head(1) %>%
        ## find paths to output and log files on the bucket
        avworkflow_files()
}

if (gcloud_exists() && nzchar(avworkspace_name())) {
    avworkflow_localize(dry = TRUE)
}

avworkflow_configuration_template()

## Not run:
config <- avworkflow_configuration("bioconductor-anvil-rpci", "AnVILBulkRNASeq")
str(config)

## End(Not run)

## Not run:
avworkflow_import_configuration(config)

## End(Not run)