This package is designed to support as many Seven Bridges platforms as possible, including the NCI Cancer Genomics Cloud Pilot developed by Seven Bridges. Make sure you provide the correct API URL for the platform you are using.
Currently tested platforms include
To read complete platform documentation
Full API documentation on the platform
The current public V2 API is CWL compatible. Public API URL:
The sevenbridges package only supports the V2 and later API, which supports CWL-compatible projects. It provides a simple interface for easy access and friendly print (show) methods. You will learn them through the tutorials.
For advanced users, you can use the httr package directly to construct your API calls, or you can use the low-level API call for all APIs like this. The most used arguments are “path”, “query”, and “body”.
For example, when you read the API documentation you will see a section called “list all your projects”. It tells you to use the method “get” and the path “/projects”, so you can simply call
library(sevenbridges)
a <- Auth(token = "8c3329a4de664c35bb657499bb2f335c",
url = "https://api.sbgenomics.com/v2/")
a$api(path = "projects", method = "GET")
You can also pass query and body as a list.
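To make the roles of “path” and “query” more concrete, here is a minimal base R sketch of how a request URL could be assembled from a base URL, a path, and a query list. The helper build_url is hypothetical and is not part of sevenbridges or httr; the package may build its requests differently.

```r
# Hypothetical helper (not part of sevenbridges): join a base URL, a path,
# and an optional named query list into one request URL.
build_url <- function(base, path, query = list()) {
  # drop trailing slashes from the base and leading slashes from the path
  url <- paste0(sub("/+$", "", base), "/", sub("^/+", "", path))
  if (length(query) > 0) {
    values <- vapply(query,
                     function(x) utils::URLencode(as.character(x), reserved = TRUE),
                     character(1))
    url <- paste0(url, "?", paste(names(query), values, sep = "=", collapse = "&"))
  }
  url
}

url_plain <- build_url("https://api.sbgenomics.com/v2/", "/projects")
url_paged <- build_url("https://api.sbgenomics.com/v2/", "/projects",
                       query = list(offset = 0, limit = 100))
print(url_plain)
print(url_paged)
```

The query list here carries the same offset and limit values that the listing calls use by default.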
With this package, you can simply call
a$project()
Before we continue, there are a couple of things you may want to keep in mind.
offset and limit: offset defines where the retrieved items start, and limit defines how many items you want to get. By default offset = 0 and limit = 100, which means the first 100 items. This applies when you list items or search items by name matching. To force a search or listing to cover all items, use complete = TRUE in your call. By default complete = FALSE, so search and list operations only cover the given offset and limit.
Searching by name matches by pattern by default; use exact = TRUE for an exact name match.
The sevenbridges package is now on the Bioconductor release and devel branches.
## try http:// if https:// URLs are not supported
source("https://bioconductor.org/biocLite.R")
biocLite("sevenbridges")
Our API keeps improving, so please also visit our GitHub homepage for the most recent news and the latest version.
If you don’t have devtools
This requires the devtools package; install it from CRAN if you don’t have it
install.packages("devtools")
You may get an error and need system dependencies for curl and ssl. For example, on Ubuntu you probably need to do this first in order to install devtools and to build vignettes (you need pandoc)
apt-get update
apt-get install libcurl4-gnutls-dev libssl-dev pandoc pandoc-citeproc
If devtools is already installed
Now install the latest version of sevenbridges from GitHub
source("http://bioconductor.org/biocLite.R")
biocLite(c("readr", "BiocStyle"))
library(devtools)
install_github("sbg/sevenbridges-r", build_vignettes=TRUE,
repos=BiocInstaller::biocinstallRepos(),
dependencies=TRUE)
If you have trouble with pandoc and don’t want to install it, set build_vignettes = FALSE to avoid building the vignettes.
For more details about how to use the API client in R, please see the second section for a complete guide. In this section, I am going to use a simple example for a quick start.
Everything starts from an Auth object, so let’s set up the Auth object. It remembers your auth token and URL, and every action starts from this object.
You have three different ways to set up the token.
Direct setup via the Auth function: explicitly set up your token and API URL.
Load the library first
library(sevenbridges)
This is the most common way to construct your Auth object
## direct setup
(a <- Auth(token = "<fake_token>",
url = "https://cgc-api.sbgenomics.com/v2/"))
## or load default from config file (autoloaded into options)
== Auth ==
token : <fake_token>
url : https://cgc-api.sbgenomics.com/v2/
Or load it from your configuration file
## from platform "us" for user "tengfei"
a <- Auth(platform = "us", username = "tengfei")
Or update the Auth object from another config file
updateAuthList("new_config.yml")
This call returns information about your account.
a$user()
== User ==
href : https://cgc-api.sbgenomics.com/v2/users/tengfei
username : tengfei
email : tengfei.yin@sbgenomics.com
first_name : Tengfei
last_name : Yin
affiliation : Seven Bridges Genomics
country : United States
To list user resources: this call returns information about the specified user. Note that currently you can view only your own user information, so this call is equivalent to the call to get your own information.
a$user(username = "tengfei")
This call returns information about your current rate limit. This is the number of API calls you can make in one hour.
a$rate_limit()
== Rate Limit ==
limit : 1000
remaining : 993
reset : 1457980957
Billing information: every project is associated with a billing group.
## check your billing info
a$billing()
a$invoice()
For more information, use breakdown = TRUE
a$billing(id = "your_billing_id", breakdown = TRUE)
Create a new project called “api testing”, with the billing group id.
## get billing group id
bid <- a$billing()$id
## create new project
(p <- a$project_new(name = "api testing", bid, description = "Just a testing"))
== Project ==
id : tengfei/api-testing
name : api testing
description : Just a testing
billing_group_id : <fake_bid>
type : v2
-- Permission --
## list first 100
a$project()
## list all
a$project(complete = TRUE)
## return all named match "demo"
a$project(name = "demo", complete = TRUE)
## get the project you want by id
p = a$project(id = "tengfei/api-tutorial")
To find available public apps, you can
## search by name matching, complete = TRUE search all apps, not
## limited by offset or limit.
a$public_app(name = "STAR", complete = TRUE)
## search by id is accurate
a$public_app(id = "admin/sbg-public-data/rna-seq-alignment-star/5")
## you can also get everything
a$public_app(complete = TRUE)
## default limit = 100, offset = 0 which means first 100
a$public_app()
Now, from your Auth object, you copy an App id into your project id with a new name, following this logic.
## copy
a$copy_app(id = "admin/sbg-public-data/rna-seq-alignment-star/5",
project = "tengfei/api-testing", name = "new copy of star")
## check that it's copied
p = a$project(id = "tengfei/api-testing")
## list the apps you have in your project
p$app()
The short name is changed to “newcopyofstar”
== App ==
id : tengfei/api-testing/newcopyofstar/0
name : RNA-seq Alignment - STAR
project : tengfei/api-testing-2
revision : 0
Alternatively you can copy from app object
app = a$public_app(id = "admin/sbg-public-data/rna-seq-alignment-star")
app$copy_to(project = "tengfei/api-testing",
name = "copy of star")
You can also upload your own CWL JSON file that describes your app to your project.
Note: alternatively, you can describe your CWL tool directly in R with this package; please read the other vignette, “Describe CWL Tools/Workflows in R and Execution”.
## Add a CWL file to your project
f.star = system.file("extdata/app", "flow_star.json", package = "sevenbridges")
app = p$app_add("starlocal", f.star)
(aid <- app$id)
You get an app id like this
"tengfei/api-testing/starlocal/0"
It’s composed of
Alternatively, you can describe tools directly in R. You will learn this later, so feel free to skip to the next section.
fl <- system.file("docker", "sevenbridges/rabix/generator.R", package = "sevenbridges")
cat(readLines(fl), sep = '\n')
library(sevenbridges)
in.lst <- list(input(id = "number",
description = "number of observations",
type = "integer",
label = "number",
prefix = "--n",
default = 1,
required = TRUE,
cmdInclude = TRUE),
input(id = "min",
description = "lower limits of the distribution",
type = "float",
label = "min",
prefix = "--min",
default = 0),
input(id = "max",
description = "upper limits of the distribution",
type = "float",
label = "max",
prefix = "--max",
default = 1),
input(id = "seed",
description = "seed with set.seed",
type = "float",
label = "seed",
prefix = "--seed",
default = 1))
## the same method for outputs
out.lst <- list(output(id = "random",
type = "file",
label = "output",
description = "random number file",
glob = "*.txt"),
output(id = "report",
type = "file",
label = "report",
glob = "*.html"))
rbx <- Tool(id = "runif",
label = "Random number generator",
hints = requirements(docker(pull = "tengfei/runif"),
cpu(1), mem(2000)),
baseCommand = "runif.R",
inputs = in.lst, ## or ins.df
outputs = out.lst)
fl <- "inst/docker/sevenbridges/rabix/runif.json"
write(rbx$toJSON(pretty = TRUE), fl)
And add it like this
## rbx is the object returned by Tool function
app = p$app_add("runif", rbx)
(aid <- app$id)
Please read the other tutorial about how to describe tools and flows in R.
Now assume you have already copied the public app “admin/sbg-public-data/rna-seq-alignment-star/5” into your project “tengfei/api-testing”, so that the app id in your current project is “tengfei/api-testing/newcopyofstar”, or that you already have another app to run in your project.
To draft a new task, you need to specify
You can always go to the UI to check App details or task input requirements, but how do you do it in R?
To check your input names and types, you need to get an App object first.
app = a$app(id = "tengfei/api-testing-2/newcopyofstar")
## get input matrix
app$input_matrix()
app$input_matrix(c("id", "label", "type"))
app$input_matrix(c("id", "label", "type"), required = TRUE)
## get required input names and types only
app$get_required()
Or load it from a CWL JSON file and convert it into an R object first.
f1 = system.file("extdata/app", "flow_star.json", package = "sevenbridges")
app = convert_app(f1)
## get input matrix
app$input_matrix()
## id label type
## 1 #sjdbGTFfile sjdbGTFfile File...
## 2 #fastq fastq File...
## 3 #genomeFastaFiles genomeFastaFiles File
## 4 #sjdbGTFtagExonParentTranscript Exons' parents name string
## 5 #sjdbGTFtagExonParentGene Gene name string
## 6 #winAnchorMultimapNmax Max loci anchors int
## 7 #winAnchorDistNbins Max bins between anchors int
## required fileTypes
## 1 FALSE null
## 2 TRUE null
## 3 TRUE null
## 4 FALSE null
## 5 FALSE null
## 6 FALSE null
## 7 FALSE null
app$input_matrix(c("id", "label", "type"))
## id label type
## 1 #sjdbGTFfile sjdbGTFfile File...
## 2 #fastq fastq File...
## 3 #genomeFastaFiles genomeFastaFiles File
## 4 #sjdbGTFtagExonParentTranscript Exons' parents name string
## 5 #sjdbGTFtagExonParentGene Gene name string
## 6 #winAnchorMultimapNmax Max loci anchors int
## 7 #winAnchorDistNbins Max bins between anchors int
app$input_matrix(c("id", "label", "type"), required = TRUE)
## id label type
## 2 #fastq fastq File...
## 3 #genomeFastaFiles genomeFastaFiles File
## get required input names and types only
app$get_required()
## fastq genomeFastaFiles
## "File..." "File"
In what is returned, the names are the ones you want to use in your task input list. Task inputs need to match the expected data type and name, and required inputs have to be provided. In the above example, we see two required fields
We also want to provide a gene feature file
Note that some other input types include:
Files object (for single-file input), FilesList object (for inputs that accept more than one file), or simply a list of “Files” objects. Getting your file by id, or by name with exact = TRUE, is always accurate. Sounds complicated? Let’s just see an example.
fastqs <- c("SRR1039508_1.fastq", "SRR1039508_2.fastq")
## get all 2 exact files
fastq_in = p$file(name= fastqs, exact = TRUE)
## get single file
fasta_in = p$file(name = "Homo_sapiens.GRCh38.dna.primary_assembly.fa",
exact = TRUE)
## get single file
gtf_in = p$file(name = "Homo_sapiens.GRCh38.84.gtf",
exact = TRUE)
## Add new tasks
taskName = paste0("tengfei_star-alignment ",date())
tsk = p$task_add(name = taskName,
description = "star test",
app = "tengfei/api-testing-2/newcopyofstar/0",
inputs = list(sjdbGTFfile = gtf_in,
fastq = fastq_in,
genomeFastaFiles = fasta_in))
Remember that “fastq” expects a list of files. You can also do something like
f1 = p$file(name = "SRR1039508_1.fastq", exact = TRUE)
f2 = p$file(name = "SRR1039508_2.fastq", exact = TRUE)
## combine the 2 exact files into a list
fastq_in = list(f1, f2)
## or if you know you only have 2 file names matching SRR1039508*.fastq
fastq_in = p$file(name = "SRR1039508*.fastq", complete = TRUE)
Use complete = TRUE when there are more than 100 items.
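Before drafting a task, it can help to confirm that every required input has a value. The following base R sketch does that check locally. The helper missing_required is hypothetical (it is not a sevenbridges function), the required vector simply mirrors the app$get_required() output shown earlier, and the input values are placeholder strings rather than real Files objects.

```r
# Required inputs as reported by app$get_required() earlier in this section
required <- c(fastq = "File...", genomeFastaFiles = "File")

# Hypothetical helper (not part of sevenbridges): names of required inputs
# that are absent from the task input list.
missing_required <- function(required, inputs) {
  setdiff(names(required), names(inputs))
}

# Placeholder values stand in for Files / FilesList objects
incomplete <- list(sjdbGTFfile = "gtf file",
                   fastq = list("fastq 1", "fastq 2"))
complete <- c(incomplete, list(genomeFastaFiles = "fasta file"))

missing_incomplete <- missing_required(required, incomplete)
missing_complete <- missing_required(required, complete)
print(missing_incomplete)
print(missing_complete)
```

An empty result means all required names are present; it does not check that the value types match what the app expects.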
Now let’s do a batch with 8 files in 4 groups, batched by the metadata fields sample_id and library_id. Assume each file has these two metadata fields in the system and that the files can be evenly grouped into 4. We will then have a single parent task with 4 subtasks.
fastqs <- c("SRR1039508_1.fastq", "SRR1039508_2.fastq", "SRR1039509_1.fastq",
"SRR1039509_2.fastq", "SRR1039512_1.fastq", "SRR1039512_2.fastq",
"SRR1039513_1.fastq", "SRR1039513_2.fastq")
## get all 8 files
fastq_in = p$file(name= fastqs, exact = TRUE)
## you can also try to return all SRR*.fastq files
## fastq_in = p$file(name= "SRR*.fastq", complete = TRUE)
tsk = p$task_add(name = taskName,
description = "Batch Star Test",
app = "tengfei/api-testing-2/newcopyofstar/0",
batch = batch(input = "fastq",
criteria = c("metadata.sample_id","metadata.library_id")),
inputs = list(sjdbGTFfile = gtf_in,
fastq = fastq_in,
genomeFastaFiles = fasta_in))
Now you have a draft batch task, please check it out in the UI.
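To picture how batching by metadata splits 8 files into 4 groups, here is a small base R sketch that groups file names locally. The sample_id and library_id values are invented for illustration, and the grouping with split() is only an analogy for what the platform does with the batch criteria; it is not the platform's implementation.

```r
# File names from the example above; the metadata values are hypothetical
files <- data.frame(
  name = c("SRR1039508_1.fastq", "SRR1039508_2.fastq",
           "SRR1039509_1.fastq", "SRR1039509_2.fastq",
           "SRR1039512_1.fastq", "SRR1039512_2.fastq",
           "SRR1039513_1.fastq", "SRR1039513_2.fastq"),
  sample_id = rep(c("s1", "s2", "s3", "s4"), each = 2),
  library_id = rep("lib1", 8),
  stringsAsFactors = FALSE
)

# Group file names by the combination of both metadata fields;
# drop = TRUE removes combinations that never occur
groups <- split(files$name, list(files$sample_id, files$library_id), drop = TRUE)
group_sizes <- vapply(groups, length, integer(1))
print(group_sizes)
```

Each group corresponds to one subtask of the parent batch task, so in this sketch there would be 4 subtasks with 2 fastq files each.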
Now run it.
## Run your task
tsk$run()
Before you run it, you can delete your draft task or update it.
## not run
## tsk$delete()
After you run a task, you can abort it
## Abort your task
tsk$abort()
If you want to update your task and then re-run it:
tsk$getInputs()
## update a single input, here the GTF file
tsk$update(inputs = list(sjdbGTFfile = "some new file"))
## double check
tsk$getInputs()
To monitor the task, you can always call update on the task object to check the status.
tsk$update()
Or, for more fun, you can monitor a running task with a hook function, which triggers a function when the status is “completed”, “running”, etc. Please check the details in the section about task hooks.
By default it just shows a message when the task is completed.
## Monitor your task (skip this part)
## tsk$monitor()
To download all files from a completed task
tsk$download("~/Downloads")
For more fun, set a task hook. setTaskHook connects a function call to the status of a task. When you run tsk$monitor(time = 30), it checks the task status every 30 seconds via an API call for the following statuses (“queued”, “draft”, “running”, “completed”, “aborted”, “failed”) and triggers the function call based on the returned status. getTaskHook returns the function call for a specific status.
By default, when you monitor a running task, it only prints the status and exits when the task is completed.
getTaskHook("completed")
## function (...)
## {
## cat("\r", "completed")
## return(TRUE)
## }
## <environment: 0x600ed00>
If you want to customize the monitor function, there is one requirement: your function must return TRUE or FALSE in the end. When it’s TRUE (or a non-logical value), the monitoring will be terminated after the status matches and the function executes, for example when the task is completed. When it’s FALSE, the monitoring will continue with the next iteration of checking, e.g. when it’s “running” and you want to keep tracking.
To set a new function to monitor the status “completed”, so that all task output files are downloaded to a local folder when the task is completed:
setTaskHook("completed", function(){
tsk$download("~/Downloads")
return(TRUE)
})
tsk$monitor()
This is what the package tries to help with: it provides a user-friendly interface that we suggest our users use, so you don’t have to combine several api() calls and refer to the API documentation all the time to finish a simple task.
You can create a file called ‘.sbg.auth.yml’ in your home folder and maintain multiple accounts for a list of platforms, including private or public ones.
us:
  url: https://api.sbgenomics.com/v2/
  user:
    tengfei:
      token: fake_token
    yintengfei:
      token: fake_token
cgc:
  url: https://cgc-api.sbgenomics.com/v2/
  user:
    tengfei:
      token: fake_token
gcp:
  url: https://gcp-api.sbgenomics.com/v2/
  user:
    tengfei:
      token: fake_token
When you load the sevenbridges package, it will first try to parse your token configuration file into an options list.
## Create Auth object from config file
a <- Auth(username = "yintengfei", platform = "us")
## show all
getToken()
## show all pre-set user token for platform
getToken("cgc")
## show individual token for a user
getToken(platform = "cgc", username = "tengfei")
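As a rough mental model of the platform / user / token nesting in the configuration file, the sketch below stores the same structure as a nested R list and looks up one entry. The helper lookup_token is hypothetical and only mimics the idea; it is not how getToken() or Auth() are implemented, and the real package reads the YAML file rather than a hand-written list.

```r
# The same nesting as the config file, written as a plain R list
config <- list(
  us = list(
    url = "https://api.sbgenomics.com/v2/",
    user = list(
      tengfei = list(token = "fake_token"),
      yintengfei = list(token = "fake_token")
    )
  )
)

# Hypothetical helper (not part of sevenbridges): find the URL and token
# for one user on one platform, with explicit errors for unknown entries.
lookup_token <- function(config, platform, username) {
  entry <- config[[platform]]
  if (is.null(entry)) stop("unknown platform: ", platform)
  token <- entry$user[[username]]$token
  if (is.null(token)) stop("no token for user: ", username)
  list(url = entry$url, token = token)
}

cred <- lookup_token(config, platform = "us", username = "yintengfei")
print(cred)

# An unknown user produces an error instead of a silent NULL
unknown_user <- tryCatch(lookup_token(config, "us", "nobody"),
                         error = function(e) conditionMessage(e))
print(unknown_user)
```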
Note: when you edit your .sbg.auth.yml, you have to reload the package.
First things first: you need to construct an Auth object. Everything begins with this object; it stores
The logic is like this
library(sevenbridges)
## direct setup
a <- Auth(token = "1c0e6e202b544030870ccc147092c257",
url = "https://cgc-api.sbgenomics.com/v2/")
By default it points to the Cancer Genomics Cloud platform, unless you specify
Note: when you construct the Auth object, make sure you input the correct platform or API URL for your authentication. On Seven Bridges platforms, you can always find it under your account settings, in the developer tab.
For the tutorial about how to get your authentication, please check
If we don’t pass any parameters to api() from Auth, it will list all API calls. Any parameters we provide will be passed on to the api() function, but you don’t need to input the token and URL again! The Auth object already knows that information.
This call from the Auth object will check the response too.
a$api()
offset specifies where the listing starts, and limit specifies how many items you want to show from there (max: 100). Because there could be thousands of files and apps, by default offset and limit are set to 0 and 100 respectively.
getOption("sevenbridges")$offset
getOption("sevenbridges")$limit
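The interaction between offset, limit, and complete = TRUE can be sketched locally with a plain vector standing in for the items on the server. Both helpers below are hypothetical illustrations, not sevenbridges functions, and the 211 fake app names only echo the count mentioned in the example that follows.

```r
# Hypothetical stand-in for one listing call: return at most `limit` items
# starting after the first `offset` items.
fetch_page <- function(items, offset = 0, limit = 100) {
  if (offset >= length(items)) return(items[0])
  items[(offset + 1):min(offset + limit, length(items))]
}

# Hypothetical stand-in for complete = TRUE: keep requesting pages
# until an empty page comes back.
fetch_all <- function(items, limit = 100) {
  out <- items[0]
  offset <- 0
  repeat {
    page <- fetch_page(items, offset, limit)
    if (length(page) == 0) break
    out <- c(out, page)
    offset <- offset + limit
  }
  out
}

apps <- paste0("app-", seq_len(211))
first_page <- fetch_page(apps)          # default offset = 0, limit = 100
all_apps <- fetch_all(apps)             # analogous to complete = TRUE

print(length(first_page))
print(length(all_apps))
print("app-150" %in% first_page)        # an item beyond the first page
```

This is why a name search limited to the first page can miss an item that a complete search finds.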
Please pay attention to this
Searching by id is the most accurate and fastest way to get any item like Project, App, Task, or File.
Use complete = TRUE if you want to search across everything; this might be slow.
For example, to list all public apps, use the visibility argument, but make sure you pass complete = TRUE to it to show every single item. This argument generally works for items like “App”, “Project”, “Task”, “File”, etc.
## first, search by id is fast
x <- a$app(visibility = "public", id = "djordje_klisic/public-apps-by-seven-bridges/sbg-ucsc-b37-bed-converter/0")
## show 100 items from public
x <- a$app(visibility = "public")
length(x) ## 100
x <- a$app(visibility = "public", complete = TRUE)
length(x) ## 211 by March, 2016
## this returns nothing, because it's not in the first 100
a$app(visibility = "public", name = "bed converter")
## this returns an app, because it pulls all apps and then searches
a$app(visibility = "public", name = "bed converter", complete = TRUE)
This call returns information about your current rate limit. This is the number of API calls you can make in one hour.
a$rate_limit()
This call returns a list of the resources, such as projects, billing groups, and organizations, that are accessible to you. If you are not an administrator, this call will only return a successful response if {username} is replaced with your own username. If you are an administrator, you can replace {username} with the username of any CGC user, to return information on their resources.
Case sensitivity: Don’t forget to capitalize your username in the same way as you set it when you registered on the CGC.
If you don’t provide a username, your user information will be shown.
## return your information
a$user()
## return user 'tengfei''s information
a$user("tengfei")
If no id is provided, this call returns a list of paths used to access billing information via the API. Otherwise, this call lists all your billing groups, including groups that are pending or have been disabled. If breakdown = TRUE, this call returns a breakdown of spending per project for the billing group specified by billing_group. For each project that the billing group is associated with, information is shown on the tasks run, including their initiating user (the runner), start and end times, and cost.
## return a BillingList object
(b <- a$billing())
a$billing(id = b$id, breakdown = TRUE)
If no id is provided, this call returns a list of invoices, with information about each, including whether or not the invoice is pending and the billing period it covers. The call returns information about all your available invoices, unless you use the query parameter bg_id to specify the ID of a particular billing group, in which case it will return the invoice incurred by that billing group only. If an id is provided, this call retrieves information about the selected invoice, including the costs for analysis and storage, and the invoice period.
a$invoice()
a$invoice(id = "fake_id")
Note (TODO): Invoice is not an object yet; it currently just returns a list.
A project is the basic unit for organizing different entities: files, tasks, apps, etc. So lots of actions come from this Project object.
This call returns a list of all projects you are a member of. Each project’s project_id and URL on the CGC will be returned.
a$project()
Then if you want to list the projects owned by and accessible to a particular user, specify the owner
argument. Each project’s ID and URL will be returned.
a$project(owner = "tengfei")
a$project(owner = "yintengfei")
To get details about project(s), use detail = TRUE
a$project(detail = TRUE)
For a friendlier interface and convenient search, we support partial name matching in this interface. The first argument of the call is “name”; users can provide part of the name and we do the search for you automatically.
## want to return a project called
a$project("hello")
To create a new project, users need to specify
a$project_new("api_testing_tcga", b$id,
description = "Test for API")
Just need to pass a “tags” list with value “tcga”
a$project_new("controlled_project", b$id,
description = "Test for API", tags = list("tcga"))
Next we delete what we created for testing. Only a single project can be deleted by calling $delete(), so please pay attention to the object returned from a$project(): if you are using partial matching by name, it may return a list. If you want to operate on a list of objects, we provide some batch functions; please read the relevant section.
## remove it, not run
a$project("api_testing")$delete()
## check
## will delete all projects that match the name
delete(a$project("api_testing_donnot_delete_me"))
You can update information about an existing project, including
a$project(id = "tengfei/helloworld")
a$project(id = "tengfei/helloworld")$update(name = "Hello World Update",
description = "Update description")
This call returns a list of the members of the specified project. For each member, the response lists:
a$project(id = "tengfei/demo-project")$member()
This call adds a new user to a specified project. It can only be successfully made by a user who has admin permissions in the project.
Requests to add a project member must include the key permissions. However, if you do not include a value for some permission, it will be set to false by default.
Set permission by passing: copy, write, execute, admin, read argument.
Note: read is implicit and set by default, you can not be project member without having read permission
m <- a$project(id = "tengfei/demo-project")$member_add(username = "yintengfei")
This call edits a user’s permissions in a specified project. It can only be successfully made by a user who has admin permissions in the project.
m <- a$project(id = "tengfei/demo-project")$
member(username = "yintengfei")
m$update(copy = TRUE)
== Member ==
username : yintengfei
-- Permission --
read : TRUE
write : FALSE
copy_permission : TRUE
execute : FALSE
admin : FALSE
To delete an existing member, just call the delete() action on the Member object.
m$delete()
## confirm
a$project(id = "tengfei/demo-project")$member()
To list all files belonging to a project, simply use
p <- a$project(id = "tengfei/demo-project")
p$file()
From now on we are going to have fun with Apps and the CWL (Common Workflow Language) based approach! CWL is getting more and more popular and is designed for reproducible pipeline description and execution. All Seven Bridges platforms support CWL natively in the cloud. So in this section, I will introduce how to do this via the API and inside R.
This call lists all the apps available to you.
a$app()
## or show details
a$app(detail = TRUE)
To search by name, pass a pattern to the name argument, or provide a unique id.
## pattern match
a$app(name = "STAR")
## unique id
aid <- a$app()[[1]]$id
aid
a$app(id = aid)
## get a specific revision from an app
a$app(id = aid, revision = 0)
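Judging from the ids that appear in this document (for example "tengfei/api-testing/starlocal/0" and "admin/sbg-public-data/rna-seq-alignment-star"), an app id looks like owner/project/short name with an optional trailing revision number. The sketch below splits an id on that assumption. The helper parse_app_id is hypothetical, not a sevenbridges function, and the assumed layout is inferred from the examples rather than taken from the API specification.

```r
# Hypothetical helper (not part of sevenbridges): split an app id of the
# assumed form "owner/project/short_name[/revision]" into its pieces.
parse_app_id <- function(id) {
  parts <- strsplit(id, "/", fixed = TRUE)[[1]]
  if (!(length(parts) %in% c(3, 4))) stop("unexpected app id: ", id)
  list(
    project = paste(parts[1:2], collapse = "/"),
    short_name = parts[3],
    # revision is NA when the id does not carry one
    revision = if (length(parts) == 4) as.integer(parts[4]) else NA_integer_
  )
}

with_rev <- parse_app_id("tengfei/api-testing/starlocal/0")
without_rev <- parse_app_id("admin/sbg-public-data/rna-seq-alignment-star")
print(with_rev)
print(without_rev)
```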
To list all apps belonging to one project, use the project argument
## my favorite, always
a$project("demo")$app()
## or alternatively
pid <- a$project("demo")$id
a$app(project = pid)
To list all public apps, use visibility
argument
## show 100 items from public
x = a$app(visibility = "public")
length(x)
x = a$app(visibility = "public", complete = TRUE)
length(x)
x = a$app(project = "tengfei/helloworld", complete = TRUE)
length(x)
a$app(visibility = "public", limit = 5, offset = 150)
To search for an app across all published apps (this may take a while)
a$app("STAR", visibility = "public", complete = TRUE)
This call copies the specified app to the specified project. The app should be one in a project that you can access; this could be an app that has been uploaded to the CGC by a project member, or a publicly available app that has been copied to the project.
Need two arguments
aid <- a$public_app()[[1]]$id
a$copy_app(aid, project = pid, name = "copy-rename-test")
## check it is copied
a$app(project = pid)
Or you can copy directly from an app object
app = a$public_app(id = "admin/sbg-public-data/rna-seq-alignment-star")
app$copy_to(project = "tengfei/api-testing",
name = "copy of star")
This call returns information about the specified app, as raw CWL. The call differs from the call to GET details of an app by returning a JSON object that is the CWL.
The app should be one in a project that you can access; this could be an app that has been uploaded to the CGC by a project member, or a publicly available app that has been copied to the project.
To get a specific revision, pass revision
argument.
ap <- a$app(visibility = "public")[[1]]
a$project("demo")$app("index")
## get a specific revision
a$project("demo")$app("index", revision = 0)
TODO: convert it to a CWL object
Use the app_add function call from a Project object; two parameters are required
cwl.fl <- system.file("extdata", "bam_index.json", package = "sevenbridges")
a$project("demo")$app_add(short_name = "new_bam_index_app", filename = cwl.fl)
a$project("demo")$app_add(short_name = "new_bam_index_app", revision = 2, filename = cwl.fl)
Note: providing the same short_name will add a new revision
This is fun and is introduced in another vignette.
This call returns a list of tasks that you can access. You are able to filter by status
## all tasks
a$task()
## filter
a$task(status = "completed")
a$task(status = "running")
To list all tasks in a project
## better way
a$project("demo")$task()
## alternatively
pid <- a$project("demo")$id
a$task(project = pid)
To list all tasks with details just pass detail = TRUE
.
p$task(id = "your task id here", detail = TRUE)
p$task(detail = TRUE)
To list a batch task using parent
parameter, pass the batch parent task id.
p = a$project(id = "tengfei/demo")
p$task(id = "2e1ebed1-c53e-4373-870d-4732acacbbbb")
p$task(parent = "2e1ebed1-c53e-4373-870d-4732acacbbbb")
p$task(parent = "2e1ebed1-c53e-4373-870d-4732acacbbbb", status = "completed")
p$task(parent = "2e1ebed1-c53e-4373-870d-4732acacbbbb", status = "draft")
To create a draft, you need to call the task_add function from a Project object, and you need to pass the following arguments
## push an app first
fl.runif <- system.file("extdata", "runif.json", package = "sbgr")
a$project("demo")$app_add("runif_draft", fl.runif)
runif_id <- "tengfei/demo-project/runif_draft"
## create a draft task
a$project("demo")$task_add(name = "Draft runif 3",
description = "Description for runif 3",
app = runif_id,
inputs = list(min = 1, max = 10))
## confirm
a$project("demo")$task(status = "draft")
Call the update function from a Task object; you can update
## get the single task you want to update
tsk <- a$project("demo")$task("Draft runif 3")
tsk
tsk$update(name = "Draft runif update", description = "draft 2",
inputs = list(max = 100))
## alternative way to check all inputs
tsk$getInputs()
This call runs (executes) the specified task. Only tasks whose status is “DRAFT” may be run.
tsk$run()
## run update without information just return latest information
tsk$update()
To monitor a running task, call monitor
from a task object
tsk$monitor()
Get and set the default hook function for a task status. Currently, failed and completed tasks will break the monitoring.
Note: a hook function has to return TRUE (break monitoring) or FALSE (continue) in the end.
getTaskHook("completed")
getTaskHook("draft")
setTaskHook("draft", function(){message("never happens"); return(TRUE)})
getTaskHook("draft")
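The TRUE / FALSE contract for hook functions can be illustrated with a small local simulation. The function simulate_monitor below is a hypothetical stand-in written for this example: it walks through a fixed sequence of statuses instead of polling the API, and it is not the package's actual monitoring code.

```r
# Hypothetical stand-in for the monitoring loop (not the package's code):
# visit each status in turn, call the matching hook if there is one, and
# stop as soon as a hook returns TRUE or a non-logical value.
simulate_monitor <- function(statuses, hooks) {
  seen <- character(0)
  for (status in statuses) {
    seen <- c(seen, status)
    hook <- hooks[[status]]
    res <- if (is.null(hook)) FALSE else hook()
    if (!is.logical(res) || isTRUE(res)) break
  }
  seen
}

hooks <- list(
  running = function() FALSE,   # keep monitoring
  completed = function() TRUE   # stop monitoring
)

# "failed" is never reached because the "completed" hook ends the loop
seen <- simulate_monitor(
  c("queued", "running", "running", "completed", "failed"),
  hooks
)
print(seen)
```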
This call aborts the specified task. Only tasks whose status is “RUNNING” may be aborted.
## abort
tsk$abort()
## check
tsk$update()
Note that you can only delete draft tasks, not running tasks.
tsklst <- a$task(status = "draft")
## delete a single task
tsklst[[1]]$delete()
## confirm
a$task(status = "draft")
## delete a list of tasks
delete(tsklst)
tsk$download("~/Downloads")
To run a task in batch mode (check ?batch for more details), here is a mock run
## batch by items
(tsk <- p$task_add(name = "RNA DE report new batch 2",
description = "RNA DE analysis report",
app = rna.app$id,
batch = batch(input = "bamfiles"),
inputs = list(bamfiles = bamfiles.in,
design = design.in,
gtffile = gtf.in)))
## batch by metadata; input files have to have the metadata fields specified
(tsk <- p$task_add(name = "RNA DE report new batch 3",
description = "RNA DE analysis report",
app = rna.app$id,
batch = batch(input = "fastq",
c("metadata.sample_id", "metadata.library_id")),
inputs = list(bamfiles = bamfiles.in,
design = design.in,
gtffile = gtf.in)))
a = Auth(username = "tengfei", platform = "us")
a$add_volume(name = "tutorial_volume",
type = "s3",
bucket = "tengfei-demo",
prefix = "",
access_key_id = "AKIAJQENSIA4DJQNZO3A",
secret_access_key = "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",
sse_algorithm = "AES256",
access_mode = "RW")
## list all volumes
a$volume()
## get unique volume by id
a$volume(id = "tengfei/tengfei_demo")
## partial search by name
a$volume(name = "demo")
v = a$volume()
v[[1]]$detail()
a$volume(id = "tengfei/tengfei_demo")$delete()
This call imports a file from a volume, such as an S3 bucket, into your project.
v = a$volume(id = "tengfei/tutorial_volume")
res = v$import(location = "A-RNA-File.bam.bai",
project = "tengfei/s3tutorial",
name = "new.bam.bai",
overwrite = TRUE)
## get job status update
## state will be "COMPLETED" when it's finished, otherwise "PENDING"
v$get_import_job(res$id)
v
Important :
When test please update your file to a project.
res = v$export(file = "579fb1c9e4b08370afe7903a",
volume = "tengfei/tutorial_volume",
location = "", ## when "" use old name
sse_algorithm = "AES256")
## get job status update
## state will be "COMPLETED" when it's finished, otherwise "PENDING"
v$get_export_job(res$id)
v
How do you get public content for files and apps, so that you can get their ids and copy them to your project? In this package, we provide two easy function calls from the Auth object.
Once you search and get what you want, you can use their ids to do more operations, like copying to a project.
## list first 100 files
a$public_file()
## list by offset and limit
a$public_file(offset = 100, limit = 100)
## simply list everything!
a$public_file(complete = TRUE)
## get exact file by id
a$public_file(id = "5772b6f0507c175267448700")
## get exact file by name with exact = TRUE
a$public_file(name = "G20479.HCC1143.2.converted.pe_1_1Mreads.fastq", exact = TRUE)
## with exact = FALSE by default search by name pattern
a$public_file(name = "fastq")
a$public_file(name = "G20479.HCC1143.2.converted.pe_1_1Mreads.fastq")
Actually, the public files are hosted in the project called “admin/sbg-public-data”, so of course you can just use the file API to get the files you need.
For public apps we have similar API calls.
## list the first 100 apps
a$public_app()
## list by offset and limit
a$public_app(offset = 100, limit = 50)
## search by id
a$public_app(id = "admin/sbg-public-data/control-freec-8-1/12")
## search by name in ALL apps
a$public_app(name = "STAR", complete = TRUE)
## search by name with exact match
a$public_app(name = "Control-FREEC", exact = TRUE, complete = TRUE)
In the easy API, we return an object which contains the raw response from httr as a field. You can either call response() on that object or just get the field out of it.
Right now, users have to use lapply to do those operations themselves. It’s a simple implementation.
In this package, we implement delete and download for some objects like task, project, or file.
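As a rough illustration of applying one action to every element of a list with lapply, the sketch below uses small mock objects built from closures. The constructor make_task and its delete() and is_deleted() functions are hypothetical stand-ins for this example only; real Task objects come from the package and talk to the API.

```r
# Hypothetical mock object (not a sevenbridges Task): a list of closures
# that share one private `deleted` flag.
make_task <- function(name) {
  deleted <- FALSE
  list(
    name = name,
    delete = function() {
      deleted <<- TRUE
      invisible(TRUE)
    },
    is_deleted = function() deleted
  )
}

tasks <- lapply(c("draft 1", "draft 2", "draft 3"), make_task)

# Apply the same action to each object in the list
invisible(lapply(tasks, function(tsk) tsk$delete()))

deleted_flags <- vapply(tasks, function(tsk) tsk$is_deleted(), logical(1))
print(deleted_flags)
```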
Quick cheat sheet (in progress)
## Authentication
getToken()
a <- Auth(token = token)
a <- Auth(token = token,
url = "https://cgc-api.sbgenomics.com/v2/")
a <- Auth(platform = "us", username = "tengfei")
## list API
a$api()
## Rate limits
a$rate_limit()
## Users
a$user()
a$user("tengfei")
## billing
a$billing()
a$billing(id = , breakdown = TRUE)
a$invoice()
a$invoice(id = "fake_id")
## Project
### create new project
a$project_new(name = , billing_group_id = , description = )
### list all projects owned by you
a$project()
a$project(owner = "yintengfei")
### partial match
p <- a$project(name = , id = , exact = TRUE)
### delete
p$delete()
### update
p$update(name = , description = )
### members
p$member()
p$member_add(username = )
p$member(username = )$update(write = , copy = , execute = )
p$member(username = )$delete()
## file
### list all files in this project
p$file()
### list all public files
a$file(visibility = "public")
### copy
a$copyFile(c(fid, fid2), project = pid)
### delete
p$file(id = fid)$delete()
### download
p$file()[[1]]$download_url()
p$file(id = fid3)$download("~/Downloads/")
### download all
download(p$file())
### update a file
fl$update(name = , metadata = list(a = ,b = , ...))
### meta
fl$meta()
fl$setMeta()
fl$setMeta(..., overwrite = TRUE)
## App
a$app()
### apps in a project
p$app()
p$app(name, id, revision = )
a$copyApp(aid, project = pid, name = )
### add
p$app_add(short_name = , filename =)
## Task
a$task()
a$task(name = , id = )
a$task(status = )
p$task()
p$task(name = , id = )
p$task(status = )
tsk <- p$task(name = , id = )
tsk$update()
tsk$abort()
tsk$run()
tsk$download()
tsk$delete()
tsk$getInputs()
tsk$monitor()
getTaskHook()
setTaskHook(status = , fun =)