Workflow definition language

Workbench Workflow Definition Language (WWDL) uses JSON and follows the schema outlined in this guide. The workflow is parsed at runtime by the WorkflowRunner class, prior to execution.

Workflow structure

Workflows are constructed as a sequence of stages. Stages always run one after another. Steps within a stage, however, can run independently of each other, in a loop, or in batches (fan-out/fan-in).

Stages with a "when" condition will only run if the condition evaluates to True. The condition must follow Jinja syntax.

[
    {
        "name": "<stage-name>",
        "when": "{{<stage-condition>}}",
        "steps": [ <array-of-step-definitions> ]
    },
    {...}
]
| Attribute | Required | Type | Description |
| --- | --- | --- | --- |
| name | Yes | String | Plain-language description of the workflow stage purpose. |
| when | No | Boolean | Jinja template that must evaluate to True for the stage to execute. |
| steps | Yes | Array | A list of steps in the stage that will be conditionally executed. |
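
The stage attributes above can be checked before a workflow is run. The following is a minimal sketch using only the Python standard library; `validate_stages` is a hypothetical helper written for illustration and is not part of WorkflowRunner.

```python
import json


def validate_stages(workflow_json):
    """Parse a WWDL document and check the required stage attributes."""
    stages = json.loads(workflow_json)
    if not isinstance(stages, list):
        raise ValueError("a workflow must be a JSON array of stages")
    for position, stage in enumerate(stages):
        if not isinstance(stage, dict):
            raise ValueError(f"stage {position}: must be a JSON object")
        if not isinstance(stage.get("name"), str):
            raise ValueError(f"stage {position}: 'name' is required and must be a string")
        if not isinstance(stage.get("steps"), list):
            raise ValueError(f"stage {position}: 'steps' is required and must be an array")
    return stages


# Illustrative workflow: the stage names and the 'approved' variable are made up.
example = """
[
    {"name": "Prepare inputs", "steps": []},
    {"name": "Publish", "when": "{{ approved }}", "steps": []}
]
"""
stages = validate_stages(example)
print([stage["name"] for stage in stages])
```

The sketch only checks structure. Evaluating the "when" template is left to the runner.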

Steps

At present there are two types of step definitions: class-based and template-based. Both definition types possess the following common attributes:

{
    "name": "<step-name>",
    "when": "{{<step-condition>}}",
    "depends_on": [ <array-of-step-names> ],
    "output": "<output-variable-name>",
    "retries": <retry-limit>,
    "on_failure": "<step-failure-action>"
}
| Attribute | Required | Type | Description |
| --- | --- | --- | --- |
| name | Yes | String | Plain-language description of the step's purpose. Other steps refer to this name in depends_on. |
| when | No | Boolean | Jinja template that must evaluate to True for the step to execute. |
| depends_on | No | Array or String | Step(s) that must complete before this one runs: either a single step name (string) or several (array). Steps cannot depend on steps in another stage, because stages always run in sequence. |
| output | No | String | Key (variable name) under which the step's result is stored in the workflow context dictionary. If a step produces multiple outputs, they are stored as a tuple under this name. |
| retries | No | Integer | Maximum number of unsuccessful attempts before the step is terminated. Defaults to 0. |
| on_failure | No | String | If set to 'fail', the workflow run terminates once retries is exceeded. Any other value lets the step fail silently, and the workflow proceeds to the next step. Defaults to 'fail'. |
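
The interaction between retries and on_failure can be sketched as follows. This is an illustration, not the runner's actual code. It assumes that retries counts extra attempts after the first one (so retries=0 means a single attempt), and the value "continue" is a made-up stand-in for any on_failure value other than 'fail'.

```python
def run_with_retries(action, retries=0, on_failure="fail"):
    """Call action() until it succeeds or the retry limit is exhausted.

    Assumption: 'retries' counts extra attempts after the first one.
    Returns a (succeeded, result) pair.
    """
    last_error = None
    for _attempt in range(retries + 1):
        try:
            return True, action()
        except Exception as error:  # any failure counts as an unsuccessful attempt
            last_error = error
    if on_failure == "fail":
        # Terminate the whole workflow run.
        raise RuntimeError("step failed; terminating workflow run") from last_error
    # Any other value: the step fails silently and the workflow moves on.
    return False, None


calls = {"count": 0}


def flaky():
    """Hypothetical step body that fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


print(run_with_retries(flaky, retries=2))
```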

Class-based steps

Class-based steps (as the name implies) instantiate a class object and then (optionally) call a method of that class. Class-based steps have the following additional attributes:

{
    "class": "<workbench-class>",
    "instance_params": { <class-constructor-keyword-arguments> },
    "method": "<method-name>",
    "call_params": { <class-method-keyword-arguments> }
}
| Attribute | Required | Type | Description |
| --- | --- | --- | --- |
| class | Yes | String | The name of the Workbench class to instantiate. Must use dot notation to point to the relevant Workbench sub-package, for example "bindings.DocumentVersion". |
| instance_params | No | Dictionary | Key-value pairs passed as keyword arguments to the class constructor. When referencing a context variable, prefix the variable key / name with "$". |
| method | No | String | The name of the class method to call. Note: if no method is given and no "output" variable name is set, the step returns nothing. |
| call_params | No | Dictionary | Key-value pairs passed as keyword arguments to the method call. When referencing a context variable, prefix the variable key / name with "$". |
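
A class-based step amounts to "import a class, construct it, optionally call a method, store the result". The sketch below shows that flow under stated assumptions: `run_class_step`, `resolve_params` and the `root_package` argument are hypothetical names, only plain `$name` references are handled, and the demo uses the standard-library class `textwrap.TextWrapper` in place of a Workbench class.

```python
import importlib


def resolve_params(params, context):
    """Replace "$name" strings with context values (plain names only in this sketch)."""
    resolved = {}
    for key, value in (params or {}).items():
        if isinstance(value, str) and value.startswith("$"):
            resolved[key] = context[value[1:]]
        else:
            resolved[key] = value
    return resolved


def run_class_step(step, context, root_package=None):
    """Instantiate step["class"], optionally call step["method"], store the output."""
    module_path, _, class_name = step["class"].rpartition(".")
    if root_package:  # e.g. a Workbench package that holds the sub-packages
        module_path = f"{root_package}.{module_path}"
    cls = getattr(importlib.import_module(module_path), class_name)
    instance = cls(**resolve_params(step.get("instance_params"), context))
    result = instance  # with no method, the instance itself is the result
    if "method" in step:
        method = getattr(instance, step["method"])
        result = method(**resolve_params(step.get("call_params"), context))
    if "output" in step:
        context[step["output"]] = result
    return result


context = {"width": 10, "message": "workflow definition language"}
step = {
    "name": "Wrap message",
    "class": "textwrap.TextWrapper",
    "instance_params": {"width": "$width"},
    "method": "fill",
    "call_params": {"text": "$message"},
    "output": "wrapped",
}
run_class_step(step, context)
print(context["wrapped"])
```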

Variable Reference Syntax

Context variables in instance_params and call_params support both dot notation and bracket notation:

| Syntax | Example | Use Case |
| --- | --- | --- |
| Dot notation | $user.profile.name | Standard attribute/key access |
| Bracket notation | $metadata[field-1] | Keys with special characters (hyphens, spaces) |
| Quoted bracket | $metadata['Part-2'] | Explicit string keys (single or double quotes) |
| Variable bracket | $metadata[$classifier_id] | Dynamic key from another context variable |
| Mixed | $session.properties[Part-2].code | Combining styles |

{
    "call_params": {
        "simple_ref": "$connection.session",
        "bracket_ref": "$metadata_spec[Part-2]",
        "quoted_ref": "$metadata_spec['Part-2']",
        "dynamic_ref": "$metadata_spec[$classifier_id]",
        "mixed_ref": "$session.metadata[field-name].value"
    }
}

Dynamic Keys

Use [$variable] syntax when the key itself comes from another context variable. For example, if classifier_id contains "Part-2", then $metadata[$classifier_id] resolves to $metadata["Part-2"].
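
To make the reference syntax concrete, here is a minimal resolver sketch covering dot, bracket, quoted-bracket and variable-bracket access. It is an illustration only: `resolve_reference` and `lookup` are hypothetical names, and the real parser may treat edge cases (nested brackets, escaped quotes) differently.

```python
import re

# One path segment after the leading "$name".
_SEGMENT = re.compile(
    r"""\.(?P<attr>[A-Za-z_]\w*)                      # .name
      | \[\$(?P<var>[A-Za-z_]\w*)\]                   # [$other_variable]
      | \[(?P<quote>['"])(?P<quoted>.*?)(?P=quote)\]  # ['key'] or ["key"]
      | \[(?P<bare>[^\]]+)\]                          # [bare-key]
    """,
    re.VERBOSE,
)


def lookup(container, key):
    """Read a key from a dictionary, or an attribute from any other object."""
    if isinstance(container, dict):
        return container[key]
    return getattr(container, key)


def resolve_reference(reference, context):
    """Resolve a "$variable..." reference against the context dictionary."""
    head = re.match(r"\$([A-Za-z_]\w*)", reference)
    if head is None:
        raise ValueError(f"not a context reference: {reference!r}")
    value = context[head.group(1)]
    position = head.end()
    while position < len(reference):
        segment = _SEGMENT.match(reference, position)
        if segment is None:
            raise ValueError(f"cannot parse {reference!r} at offset {position}")
        if segment.group("attr") is not None:
            key = segment.group("attr")
        elif segment.group("var") is not None:
            key = context[segment.group("var")]  # dynamic key from the context
        elif segment.group("quoted") is not None:
            key = segment.group("quoted")
        else:
            key = segment.group("bare")
        value = lookup(value, key)
        position = segment.end()
    return value


context = {
    "classifier_id": "Part-2",
    "metadata": {"field-1": "alpha", "Part-2": {"code": "P2"}},
}
print(resolve_reference("$metadata[field-1]", context))
print(resolve_reference("$metadata[$classifier_id].code", context))
```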

Example: Generating a signed URL

This example uses the Workbench Azure Blob Storage client (which must be instantiated with organization, workspace and session arguments) to generate a signed URL to a blob.

{
    "name": "Get signed URL",
    "class": "clients.AzureBlobStorageClient",
    "depends_on": [
        "Concatenate blob_name"
    ],
    "method": "get_signed_url",
    "instance_params": {
        "organization": "$organization",
        "workspace": "$workspace",
        "session_id": "$session_id"
    },
    "call_params": {
        "blob_name": "$blob_name"
    },
    "output": "signed_url"
}

Template-based steps

Template-based steps execute a Jinja template. Templates can range from simple string concatenation to method invocation. Template-based steps have the following additional attribute:

{
    "template": "{{<step-template>}}"
}
| Attribute | Required | Type | Description |
| --- | --- | --- | --- |
| template | Yes | String | The Jinja template to evaluate. |

Example: Initializing a session

This step will run only if a doc_version context variable (referencing an instance of the DocumentVersion class) is not already set. It will run after an earlier step titled 'Connect to session'.

{
    "name": "Get session documents",
    "when": "{{doc_version is not defined}}",
    "template": "{{session.initialize()}}",
    "depends_on": [
        "Connect to session"
    ]
}
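
The ordering implied by step dependencies within a stage can be sketched with the standard-library `graphlib` module (Python 3.9+). This is one possible approach, not necessarily how WorkflowRunner schedules steps; `execution_order` is a hypothetical helper, and the dependency key is assumed to be spelled depends_on as in the attribute table.

```python
from graphlib import TopologicalSorter


def execution_order(steps):
    """Return step names in an order that respects depends_on within one stage."""
    graph = {}
    for step in steps:
        depends_on = step.get("depends_on", [])
        if isinstance(depends_on, str):  # a single step reference is allowed
            depends_on = [depends_on]
        graph[step["name"]] = set(depends_on)
    unknown = {name for deps in graph.values() for name in deps} - graph.keys()
    if unknown:
        # Steps cannot depend on steps outside their own stage.
        raise ValueError(f"unknown step(s) in depends_on: {sorted(unknown)}")
    return list(TopologicalSorter(graph).static_order())


steps = [
    {"name": "Get session documents", "depends_on": ["Connect to session"]},
    {"name": "Connect to session"},
    {"name": "Summarize", "depends_on": "Get session documents"},  # illustrative step
]
print(execution_order(steps))
```

Steps with no dependency between them may appear in any relative order, which matches the idea that they can run independently.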

Iterative Stages

Stages can iterate over collections using the for_each attribute. This enables processing multiple items (such as rows in a spreadsheet) with a single stage definition.

Basic Iteration

{
  "name": "Process each sheet",
  "for_each": [
    { "over": "$document_version.sheets", "as": "current_sheet" }
  ],
  "steps": [
    {
      "name": "Analyze sheet",
      "class": "processors.analysers.SheetAnalyzer",
      "method": "run",
      "call_params": {
        "sheet": "$current_sheet"
      },
      "output": "analysis_result"
    }
  ]
}

Nested Iteration

Multiple iteration levels can be specified. The array order determines nesting: the first entry is the outermost loop, and each later entry can reference the aliases defined before it.

{
  "name": "Process all rows",
  "for_each": [
    { "over": "$document_version.sheets", "as": "current_sheet" },
    { "over": "$current_sheet.rows", "as": "current_row" }
  ],
  "steps": [
    {
      "name": "Process row",
      "class": "processors.RowProcessor",
      "method": "run",
      "call_params": {
        "sheet": "$current_sheet",
        "row": "$current_row"
      },
      "output": "row_result"
    }
  ]
}
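
Nested for_each entries can be pictured as nested loops in which each level sees the aliases of the levels outside it. The sketch below expands a for_each array into index tuples and alias bindings. `expand_iterations` and `resolve_dotted` are hypothetical helpers, the data is modelled as plain dictionaries, and only dot notation is resolved.

```python
def resolve_dotted(reference, scope):
    """Resolve "$name.key.key" against nested dictionaries (dot notation only)."""
    name, *keys = reference.lstrip("$").split(".")
    value = scope[name]
    for key in keys:
        value = value[key]
    return value


def expand_iterations(for_each, context):
    """Yield (indices, bindings) for every combination, outermost loop first."""

    def walk(level, indices, bindings):
        if level == len(for_each):
            yield indices, dict(bindings)
            return
        spec = for_each[level]
        # Inner levels can see the aliases bound by outer levels.
        items = resolve_dotted(spec["over"], {**context, **bindings})
        for index, item in enumerate(items):
            yield from walk(level + 1, indices + (index,), {**bindings, spec["as"]: item})

    yield from walk(0, (), {})


context = {"document_version": {"sheets": [{"rows": ["a", "b"]}, {"rows": ["c"]}]}}
for_each = [
    {"over": "$document_version.sheets", "as": "current_sheet"},
    {"over": "$current_sheet.rows", "as": "current_row"},
]
for indices, bindings in expand_iterations(for_each, context):
    print(indices, bindings["current_row"])
```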

Iteration Attributes

| Attribute | Required | Type | Description |
| --- | --- | --- | --- |
| over | Yes | String | Reference to an iterable using $variable notation. |
| as | Yes | String | Variable name for the current item in each iteration. |
| when | No | String | Jinja condition to filter items. Items where the condition is falsy are skipped. |

Iteration Context

Steps in an iterative stage can access an $iteration object that carries metadata about the current iteration:

| Property | Type | Description |
| --- | --- | --- |
| indices | Tuple | Current position at each iteration level, e.g., (0, 2) |
| depth | Integer | Number of iteration levels |
| aliases | Array | Variable names for current items at each level |

Example usage in a template:

{
  "name": "Log position",
  "template": "Processing item at position {{ iteration.indices }}"
}

Conditional Iteration

Filter items using when conditions at the iteration level:

{
  "for_each": [
    {
      "over": "$document_version.sheets",
      "as": "current_sheet",
      "when": "{{ current_sheet.row_count > 0 }}"
    }
  ]
}

Iteration-level when conditions can reference:

  • Global context variables
  • Outer loop aliases (for nested iterations)

Output Collection

Step outputs from iterative stages are collected into dictionaries keyed by index tuples:

# After stage completion, context contains:
context["row_result"] = {
    (0, 0): <result for sheet 0, row 0>,
    (0, 1): <result for sheet 0, row 1>,
    (1, 0): <result for sheet 1, row 0>,
    # ...
}

Subsequent stages can access these collected results using the output variable name.
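
As an illustration of consuming these collected results, the sketch below regroups a two-level result dictionary by its outer index. The result values are placeholders and `results_for_outer_index` is a hypothetical helper.

```python
# Shape described above: results keyed by (sheet_index, row_index).
context = {"row_result": {(0, 0): "r00", (0, 1): "r01", (1, 0): "r10"}}


def results_for_outer_index(collected, outer_index):
    """Return the results of one outer-loop item, ordered by inner index."""
    return [
        collected[indices]
        for indices in sorted(collected)
        if indices[0] == outer_index
    ]


print(results_for_outer_index(context["row_result"], 0))
```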

Parallel Execution

Iterations execute in parallel, up to the runner's max_workers setting. Step dependencies (depends_on) are respected within each iteration independently: an iteration's Step B waits for that iteration's Step A, not for Step A from other iterations.
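
The per-iteration dependency rule can be sketched with a thread pool. This assumes thread-based parallelism, which the guide does not confirm, and uses trivial arithmetic as stand-ins for Step A and Step B. `run_iteration` and `run_iterations` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def run_iteration(indices, item):
    """Run one iteration: Step B waits only for this iteration's Step A."""
    step_a = item * 2    # stand-in for Step A
    step_b = step_a + 1  # stand-in for Step B, which depends on Step A
    return indices, step_b


def run_iterations(items, max_workers=4):
    """Run all iterations in parallel and collect outputs keyed by index tuple."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(run_iteration, (index,), item)
            for index, item in enumerate(items)
        ]
        for future in futures:
            indices, value = future.result()
            results[indices] = value
    return results


print(run_iterations([1, 2, 3]))
```

Each iteration runs its own steps in order, so no iteration ever blocks on a step from another iteration.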