Task queueing on Cavatica

There are several cases in which a task can be temporarily queued:

  1. The task has just been submitted and is awaiting execution.
  2. The maximum number of parallel instances for your account has been reached.
  3. Some of the cloud infrastructure resources required for task execution are not available.

In cases 2 and 3 above, the task status changes back from QUEUED to RUNNING once the required parallel instances or cloud infrastructure resources become available. A task can move between these two statuses several times during execution.

When a task is queued because the user account has reached its maximum allowed number of parallel instances, the time it takes for the task to return from QUEUED to RUNNING depends on several factors, such as:

  • size of input files,
  • time it takes for the tool or workflow to execute,
  • availability of instances, e.g. whether the required instance type is available immediately.

To ensure that all users can run their tasks on Cavatica, individual users have the following parallel instance limits:

  • Individual CHOP users can use up to 125 parallel instances.
  • Individual non-CHOP users can use up to 80 parallel instances.

This limit exists because Amazon Web Services (AWS) caps the total number of parallel instances that Cavatica can use. Tasks that require more parallel instances than the maximum allowed number may take longer to execute, but the limit ensures that instances remain available for all Cavatica users. Note that task queueing does not incur extra charges.

The limit is cumulative: it is the maximum number of parallel instances per user across all tasks in all projects created by that user. To understand how the limit works, consider the following example:

  1. CHOP user rfranklin has two projects on Cavatica, named WGS and WES.
  2. In WES, rfranklin is currently running a batch task that is using 86 parallel instances.
  3. In WGS, rfranklin starts another batch task that requires 52 parallel instances. Because the limit of 125 parallel instances applies per CHOP user, the task in WGS can initially use only 39 instances (125 minus the 86 instances used by the task in WES). The remaining 13 instances are allocated as either of the two running tasks releases them. In practice, this means that the first task in the queue starts when a running task completes.
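
The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration of the limit described above, not Cavatica's actual scheduler; the function name allocate and the constant names are hypothetical.

```python
# Per-user limits taken from the text above; the names are illustrative only.
CHOP_LIMIT = 125
NON_CHOP_LIMIT = 80


def allocate(limit, in_use, requested):
    """Return (granted, queued) instance counts for a newly started task.

    limit     -- maximum parallel instances allowed for the user
    in_use    -- instances already used by the user's running tasks
    requested -- instances the new task needs
    """
    if min(limit, in_use, requested) < 0:
        raise ValueError("instance counts must be non-negative")
    available = max(limit - in_use, 0)   # room left under the limit
    granted = min(available, requested)  # what the task can use right away
    return granted, requested - granted  # the rest waits for released instances


# rfranklin: 86 instances in use in WES, new task in WGS needs 52.
granted, queued = allocate(CHOP_LIMIT, in_use=86, requested=52)
print(granted, queued)  # 39 13
```

Under these assumptions, 39 instances are available immediately and the other 13 wait until running tasks release instances.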

Users who are added to a project also run their tasks within the project creator’s parallel instance limit. In the example above, if rfranklin then adds user mwilkins to one of the projects, WES or WGS, and mwilkins tries to run a task, this task will be queued because rfranklin is already using the maximum allowed number of parallel instances.
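
The following Python sketch models this rule under simplifying assumptions: instance usage is counted against the creator of the project, regardless of who submits the task. The class CreatorQuota, the function run_task, and the dictionaries are hypothetical names used for illustration; they are not part of any Cavatica API.

```python
from dataclasses import dataclass


@dataclass
class CreatorQuota:
    """Tracks parallel instance usage for one project creator."""
    limit: int
    in_use: int = 0

    def request(self, wanted: int) -> int:
        """Grant as many instances as fit under the limit; return the count."""
        granted = min(max(self.limit - self.in_use, 0), wanted)
        self.in_use += granted
        return granted


# Both projects were created by rfranklin, a CHOP user (limit of 125).
project_creator = {"WES": "rfranklin", "WGS": "rfranklin"}
quotas = {"rfranklin": CreatorQuota(limit=125)}


def run_task(project: str, submitted_by: str, wanted: int) -> int:
    """Return the instances granted right away; the rest stays queued.

    submitted_by is intentionally not used for the lookup: usage counts
    against the project creator's limit, not the submitter's.
    """
    return quotas[project_creator[project]].request(wanted)


wes_granted = run_task("WES", "rfranklin", 86)    # 86 granted
wgs_granted = run_task("WGS", "rfranklin", 52)    # only 39 fit under the limit
member_granted = run_task("WGS", "mwilkins", 10)  # limit reached, task is queued
print(wes_granted, wgs_granted, member_granted)   # 86 39 0
```

In this model, the task submitted by mwilkins receives no instances until the tasks started by rfranklin release some.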

For more general information about different task states, please refer to the list of task statuses.