The Application Programming Interface (API) can be used to integrate CAVATICA with other applications, and to automate most procedures on it, such as uploading files, querying metadata, and executing analyses. The API uses the REST architectural style to read and write information about projects on CAVATICA.
You can also use our Python and R client libraries to integrate the CAVATICA API with your own applications.
https://cavatica-api.sbgenomics.com/v2
API paths
The paths are structured into the following endpoints, which cover different categories of activity on the Platform:
General API information
Format
API requests are made over HTTP, and information is received and sent in JSON format. For this reason, you should set both the Accept and the Content-Type headers of the request to application/json.
Responses also include Platform-specific error codes, in addition to standard HTTP codes. Information about each code is available on the page API status codes.
Generic query parameters
All API calls take the optional query parameter fields. This can be set to any of the top-level items listed in the response schema to restrict the response to that information only. For example, GET /v2/projects/john_doe/project1?fields=id,name will return the information about the resource project1, restricted to the fields id and name.
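As a rough illustration of the fields parameter, the sketch below builds the request URL from the example above. The helper name project_url is hypothetical and only the Python standard library is used; no request is sent.

```python
from urllib.parse import urlencode

BASE_URL = "https://cavatica-api.sbgenomics.com/v2"


def project_url(project_id, fields=None):
    """Return the URL for a project, optionally restricted to some fields.

    `project_url` is an illustrative helper, not part of any client library.
    """
    url = f"{BASE_URL}/projects/{project_id}"
    if fields:
        # Keep the comma literal so the query matches the documented form.
        url += "?" + urlencode({"fields": ",".join(fields)}, safe=",")
    return url


print(project_url("john_doe/project1", ["id", "name"]))
```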
Identifying projects, users, apps, files, tasks and inputs
Project short names
Projects on CAVATICA have both given names, which you see in the visual interface (for example, in the Projects drop-down menu), and short names, which are human-readable IDs derived from the given names. To refer to a project in an API call, use its short name.
Project short names are based on the name you give to a project when you create it. The short name is derived from the project name by:
- Formatting the name in lower case
- Omitting characters that are not letters, numbers, spaces or underscores
- Replacing spaces with hyphens
- Replacing underscores with hyphens
- Adding _1 to any name that is already assigned to one of your projects.
For example, if I name my project 'RFranklin's experiments', it is automatically assigned the short name 'rfranklins-experiments'.
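The derivation rules listed above can be sketched in a few lines of Python. This is only an approximation of what the Platform does: the function name derive_short_name and the existing argument are hypothetical, and the sketch assumes that "letters" means ASCII letters.

```python
import re


def derive_short_name(name, existing=()):
    """Approximate the documented short-name rules (illustrative only)."""
    short = name.lower()
    # Drop anything that is not a letter, digit, space or underscore.
    # Assumption: only ASCII letters count as letters.
    short = re.sub(r"[^a-z0-9 _]", "", short)
    # Spaces and underscores both become hyphens.
    short = re.sub(r"[ _]", "-", short)
    # A name already in use gets the documented "_1" suffix.
    if short in existing:
        short += "_1"
    return short


print(derive_short_name("RFranklin's experiments"))
```

The Platform remains the authority on the actual short name, so it is safer to read it back from the API than to rely on a local derivation.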
You can optionally override an auto-assigned short name with one of your choice when you create a project. However, once the project has been created, its short name is immutable. To set your own short name, start creating a project using the drop-down menu at the top of the screen, then click the pencil icon in the Create a project pop-out window.
Users
CAVATICA users are referred to in the API by their usernames. These are chosen by the user when signing up for the Platform. Usernames are unique and immutable. They are also case sensitive, so make sure you have the right username capitalisation when using the API.
Uniqueness of project names
Every project is uniquely identified by {project_owner_username}/{shortname}.
Apps
Apps (tools and workflows) in projects can be accessed using the API. Like projects, apps have both given names, which are assigned by the users who create them, and short names. An app's short name is derived by the same process as a project's short name.
Each app is identified by the project that contains it and its short name, using the format {project_owner}/{project}/{app_short_name}/{revision_number}. For instance, RFranklin/my-project/bamtools-merge-2-4-0/0 identifies an app.
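To make the four-part structure concrete, here is a minimal sketch that splits an app ID into its components. AppId and parse_app_id are hypothetical names introduced for illustration, and the sketch assumes the revision number is always an integer.

```python
from typing import NamedTuple


class AppId(NamedTuple):
    project_owner: str
    project: str
    app_short_name: str
    revision: int


def parse_app_id(app_id):
    """Split '{owner}/{project}/{app}/{revision}' into named parts."""
    parts = app_id.split("/")
    if len(parts) != 4:
        raise ValueError(f"expected 4 slash-separated parts, got {len(parts)}")
    owner, project, app, revision = parts
    # Assumption: the revision component is a non-negative integer.
    return AppId(owner, project, app, int(revision))


app = parse_app_id("RFranklin/my-project/bamtools-merge-2-4-0/0")
# The first two parts together form the project identifier.
print(f"{app.project_owner}/{app.project}", app.app_short_name, app.revision)
```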
Tasks
Tasks are referred to in the API calls by IDs. These are hexadecimal strings (UUIDs) assigned to tasks. You can retrieve them by making the API call to list tasks.
Tasks have the following statuses: DRAFT, RUNNING, QUEUED, ABORTED, COMPLETED or FAILED.
Files
Files are referred to in API calls by IDs. These are hexadecimal strings assigned to files. You can retrieve them by making the API call to list files.
Note that file IDs are dependent on the project the file is stored in. If you copy a file to a different project, it will have a new ID in this project.
In calls that return CWL descriptions of tasks, such as the call to GET task details, files are identified by their path objects. The file path is identical to the file ID.
Inputs
Task inputs are specified as dictionaries. They pair apps to be executed in the task with the objects that will be inputted to them.
The format for an input is:
{app_id}: {object}
The {app_id} is defined above. The value of {object} is obtained as follows:
If the object to be inputted to the task is not a file (but an integer, boolean, etc.), then simply enter that value as {object}.
If the object to be inputted to the task is a file, then {object} is a dictionary, with the format:
{
  "class": "File",
  "path": "file_id",
  "name": "file_name.ext"
}
When multiple files are used as inputs, enter a list of {object}s, like this:
[
  {
    "class": "File",
    "path": "file_id",
    "name": "file_name.ext"
  },
  {
    "class": "File",
    "path": "file_id",
    "name": "file_name.ext"
  }
]
The following are all examples of inputs:
- An input integer:
"Offset": 2
- An input file for the known indels:
{
  "cuffdiff_zip": {
    "class": "File",
    "path": "567890abc9b0307bc0414164",
    "name": "example_human_known_indels.vcf"
  }
}
- File inputs for a Whole Exome Sequencing workflow, in the form of FASTQ reads:
"Reads_FASTQ": [
  {
    "class": "File",
    "path": "567890abc1e5339df0414123",
    "name": "WES_human_Illumina.pe_1.fastq"
  },
  {
    "class": "File",
    "path": "567890abc4f3066bc3750174",
    "name": "WES_human_Illumina.pe_2.fastq"
  }
]
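The input shapes above can be assembled programmatically. The following sketch builds an inputs dictionary and serializes it to JSON; the helper file_input is hypothetical, and combining these particular inputs in one dictionary is purely illustrative rather than a real app's input schema.

```python
import json


def file_input(file_id, name):
    """Build the documented dictionary for a single file input."""
    return {"class": "File", "path": file_id, "name": name}


# Illustrative only: a real task takes whatever inputs its app defines.
inputs = {
    # Non-file values are entered directly.
    "Offset": 2,
    # Multiple files are passed as a list of file dictionaries.
    "Reads_FASTQ": [
        file_input("567890abc1e5339df0414123", "WES_human_Illumina.pe_1.fastq"),
        file_input("567890abc4f3066bc3750174", "WES_human_Illumina.pe_2.fastq"),
    ],
}

print(json.dumps(inputs, indent=2))
```

Serializing with json.dumps guarantees the commas and quoting are valid, which is easy to get wrong when writing the list by hand.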
Task inputs
For more examples of task inputs, use the call to get task inputs on some of the tasks you initiate from the CAVATICA visual interface.
To find which inputs an app takes and their format, review the app's page on the CAVATICA visual interface, for example the page for Whole Exome Sequencing GATK 2.3.9.-lite.
Authentication
You will need an authentication token from the Developer Dashboard to uniquely identify yourself to the Platform.
All API requests must have the HTTP header X-SBG-Auth-Token, which you should set to your authentication token. The only call exempt from this is the '/' call to list all request paths.
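A minimal sketch of an authenticated request, using only the Python standard library, might look like the following. The helper names and the CAVATICA_AUTH_TOKEN environment variable are assumptions made for illustration, as is reading the "content" header mentioned earlier as Content-Type. The request object is constructed but not sent.

```python
import os
import urllib.request

BASE_URL = "https://cavatica-api.sbgenomics.com/v2"


def api_headers(token):
    """Headers for a JSON request authenticated with the given token."""
    return {
        "X-SBG-Auth-Token": token,
        "Accept": "application/json",
        "Content-Type": "application/json",
    }


def build_request(path, token):
    """Create (but do not send) a GET request for an API path."""
    return urllib.request.Request(
        BASE_URL + path, headers=api_headers(token), method="GET"
    )


# Hypothetical variable name; keeping the token out of source code is the point.
token = os.environ.get("CAVATICA_AUTH_TOKEN", "<your-token>")
request = build_request("/projects", token)
print(request.get_method(), request.full_url)
```

Passing the resulting object to urllib.request.urlopen would perform the call.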
Rate limits
All API calls are rate-limited, which means that you can only perform a limited number of requests within a given time window. Rate limit information is returned in the following HTTP headers:
- The header X-RateLimit-Limit represents the rate limit. Currently this is 1000 requests per five minutes.
- The header X-RateLimit-Remaining represents the number of calls you have left before hitting the limit.
- The header X-RateLimit-Reset represents the time, as a Unix timestamp, when the limit will be reset.
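One way a client could use these headers is to work out how long to pause once the allowance is exhausted. The sketch below assumes the headers are available as a plain dictionary with exactly the names shown above; rate_limit_status is a hypothetical helper, and waiting until the reset time is one possible policy rather than a documented requirement.

```python
import time


def rate_limit_status(headers, now=None):
    """Summarize rate-limit headers and suggest how long to wait.

    `headers` is assumed to be a plain dict keyed by the exact header names.
    """
    now = time.time() if now is None else now
    limit = int(headers["X-RateLimit-Limit"])
    remaining = int(headers["X-RateLimit-Remaining"])
    reset_at = int(headers["X-RateLimit-Reset"])
    # Only wait when no calls are left; never return a negative wait.
    wait = max(0.0, reset_at - now) if remaining == 0 else 0.0
    return {"limit": limit, "remaining": remaining, "wait_seconds": wait}


example = {
    "X-RateLimit-Limit": "1000",
    "X-RateLimit-Remaining": "0",
    "X-RateLimit-Reset": "1700000300",
}
print(rate_limit_status(example, now=1700000000))
```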
Response pagination
All API calls take the pagination query parameters limit and offset to control the number of items returned in a response. These are useful when a call returns information about a resource with many items, such as a list of many files in a project.
Filtering
In addition to controlling the number of items returned using the pagination query parameters, if you are requesting information about files using the call to GET /files, you can filter the items returned by filename, metadata, or originating task.
Specify the number of items to return in a response
You can control how many items are returned by an API call using the query parameter limit. If you do not specify a value for limit in a call, at most 50 items are returned by default. The maximum value for the query parameter limit is 100.
Example 1:
Suppose you have 70 files in the project my-project, and you issue the call to GET /files as follows:
GET /v2/files?project=my-project HTTP/1.1
Host: cavatica-api.sbgenomics.com
X-SBG-Auth-Token: 3259c50e1ac5426ea8f1273259740f74
Since no value for limit was specified, this call will return details of 50 of the files, along with a URL to return the next 20.
Example 2:
Again, suppose you have a project my-project with 70 files in it. The following call will return details of all 70 files:
GET /v2/files?project=my-project&limit=70 HTTP/1.1
Host: cavatica-api.sbgenomics.com
X-SBG-Auth-Token: 3259c50e1ac5426ea8f1273259740f74
Specify the starting point for items to return in a response
You can control the point from which items are returned in an API call using the query parameter offset. If you do not specify a value for offset, the default starting point is the first item in the specified resource.
Example 1:
Suppose you have a project called my-project containing 70 files, and you want to return their details, starting with the 30th file. To do this, issue the call to GET /files with the query parameter offset specified as follows:
GET /v2/files?project=my-project&offset=30 HTTP/1.1
Host: cavatica-api.sbgenomics.com
X-SBG-Auth-Token: 3259c50e1ac5426ea8f1273259740f74
Calls made with the offset query parameter additionally return the header X-Total-Matching-Query, which gives the total number of results.
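Query strings like the ones in these examples are easiest to get right when built by a URL library, which joins parameters with & after a single leading ?. The sketch below is one way to do that; files_query is a hypothetical helper, the upper bound of 100 comes from the text above, and the lower bounds (limit of at least 1, non-negative offset) are assumptions.

```python
from urllib.parse import urlencode

MAX_LIMIT = 100  # documented maximum for the limit parameter


def files_query(project, limit=None, offset=None):
    """Build the path and query string for a GET /files call."""
    params = {"project": project}
    if limit is not None:
        # Assumption: a limit below 1 is not meaningful.
        if not 1 <= limit <= MAX_LIMIT:
            raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
        params["limit"] = limit
    if offset is not None:
        # Assumption: offsets are non-negative.
        if offset < 0:
            raise ValueError("offset must not be negative")
        params["offset"] = offset
    return "/v2/files?" + urlencode(params)


print(files_query("my-project", limit=70))
print(files_query("my-project", offset=30))
```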
Example 2:
An example of a call made using both pagination parameters is as follows:
GET /v2/projects?limit=2&offset=2 HTTP/1.1
Host: cavatica-api.sbgenomics.com
X-SBG-Auth-Token: 3259c50e1ac5426ea8f1273259740f74
This returns the following body in JSON:
{
  "href": "https://cavatica-api.sbgenomics.com/v2/projects/",
  "items": [
    {
      "href": "https://cavatica-api.sbgenomics.com/v2/projects/john_doe/project1",
      "id": "john_doe/project1",
      "name": "project1"
    },
    {
      "href": "https://cavatica-api.sbgenomics.com/v2/projects/john_doe/project2",
      "id": "john_doe/project2",
      "name": "Project 2"
    }
  ],
  "links": [
    {
      "href": "http://cavatica-api.sbgenomics.com/v2/projects/?offset=4&limit=2",
      "rel": "next",
      "method": "GET"
    }
  ]
}
The headers returned include X-Total-Matching-Query, which lists the total number of results.
The body of the response includes the array links, which indicates how to get the next or previous set of results.
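A client can walk through all pages by repeatedly following the link whose rel is "next" until no such link is returned. The sketch below shows that loop under a few assumptions: iter_items is a hypothetical helper, fetch stands for any caller-supplied function that takes a URL and returns the decoded JSON body, and an in-memory stand-in replaces the real API so the example runs without network access.

```python
def iter_items(first_url, fetch):
    """Yield every item across pages by following 'next' links.

    `fetch` is a caller-supplied function (hypothetical here) that takes a
    URL and returns the decoded JSON response body as a dict.
    """
    url = first_url
    while url:
        page = fetch(url)
        yield from page.get("items", [])
        # Pick the link whose rel is "next"; stop when there is none.
        url = next(
            (
                link["href"]
                for link in page.get("links", [])
                if link.get("rel") == "next"
            ),
            None,
        )


# In-memory stand-in for the API, keyed by made-up page URLs.
FAKE_PAGES = {
    "page-1": {
        "items": [{"id": "john_doe/project1"}],
        "links": [{"href": "page-2", "rel": "next", "method": "GET"}],
    },
    "page-2": {"items": [{"id": "john_doe/project2"}], "links": []},
}

for item in iter_items("page-1", FAKE_PAGES.__getitem__):
    print(item["id"])
```

Keeping the HTTP call behind the fetch argument lets the same loop work with any HTTP library and makes it straightforward to test.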