Workflow definition language

Workbench Workflow Definition Language (WWDL) uses JSON and follows the schema outlined in this guide. The workflow is parsed at runtime by the WorkflowRunner class, prior to execution.

Workflow structure

Workflows are constructed as a sequence of stages. Stages always run in sequence; steps within a stage can run independently of each other, in a loop, or in batches (fan-out/fan-in).

Stages with a "when" condition will only run if the condition evaluates to True. The condition must follow Jinja syntax.

[
    {
        "name": "<stage-name>",
        "when": "{{<stage-condition>}}",
        "steps": [ <array-of-step-definitions> ]
    },
    {...}
]

Attribute	Required	Type	Description
`name`	Yes	String	Plain-language description of the workflow stage purpose.
`when`	No	Boolean	Jinja template that must evaluate to True for the stage to execute.
`steps`	Yes	Array	A list of steps in the stage that will be conditionally executed.
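
As a rough illustration of the stage schema above, the following Python sketch loads a workflow document with the standard json module and checks each stage for the required attributes. The helper name validate_stages and the sample workflow are hypothetical, and this is not a description of how WorkflowRunner validates its input.

```python
import json

# Required stage attributes, taken from the table above.
REQUIRED_STAGE_KEYS = ("name", "steps")


def validate_stages(document: str) -> list[str]:
    """Return a list of human-readable problems found in a workflow document."""
    stages = json.loads(document)
    if not isinstance(stages, list):
        return ["top-level value must be an array of stages"]

    problems = []
    for position, stage in enumerate(stages):
        if not isinstance(stage, dict):
            problems.append(f"stage {position}: must be an object")
            continue
        for key in REQUIRED_STAGE_KEYS:
            if key not in stage:
                problems.append(f"stage {position}: missing required attribute '{key}'")
        if "steps" in stage and not isinstance(stage["steps"], list):
            problems.append(f"stage {position}: 'steps' must be an array")
    return problems


# Hypothetical two-stage workflow; the second stage has no "name".
SAMPLE = """
[
    {"name": "Prepare inputs", "steps": []},
    {"when": "{{ run_cleanup }}", "steps": []}
]
"""

problems = validate_stages(SAMPLE)
print(problems)  # ["stage 1: missing required attribute 'name'"]
```

Because json.loads follows strict JSON, a definition containing a trailing comma is rejected with a JSONDecodeError before any of these checks run.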

Steps

At present there are two types of step definitions: class-based and template-based. Both definition types possess the following common attributes:

{
    "name": "<step-name>",
    "when": "{{<step-condition>}}",
    "depends_on": [ <array-of-step-names> ],
    "output": "<output-variable-name>",
    "retries": <retry-limit>,
    "on_failure": "<step-failure-action>"
}

Attribute	Required	Type	Description
`name`	Yes	String	Plain-language description of the step's purpose. Other steps use this name to refer to the step in `depends_on`.
`when`	No	Boolean	Jinja template that must evaluate to True for the step to execute.
`depends_on`	No	Array or String	Step(s) that must complete before this one. Can be a single step reference (string) or multiple (array). Steps cannot depend on steps in another stage, as stages always run in sequence.
`output`	No	String	A key / variable name under which the step's result is stored in the workflow context dictionary. If a step produces multiple outputs, they are stored as a tuple under this name.
`retries`	No	Integer	Maximum number of unsuccessful attempts before terminating the step. Defaults to 0.
`on_failure`	No	String	If set to 'fail', the workflow run is terminated once `retries` is exceeded. With any other value, the step fails silently and the workflow proceeds to the next step. Defaults to 'fail'.
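
The interaction between `retries` and `on_failure` can be sketched in Python as follows. This is a minimal illustration, not the runner's implementation: the function name run_with_retries is hypothetical, and it assumes that `retries` counts extra attempts after the first one, so a value of 0 means a single attempt.

```python
def run_with_retries(action, retries=0, on_failure="fail"):
    """Call action() until it succeeds or the retry limit is exhausted.

    Assumption: `retries` counts extra attempts after the first one.
    """
    attempts_allowed = 1 + retries
    last_error = None
    for _ in range(attempts_allowed):
        try:
            return action()
        except Exception as error:  # a real runner would likely be more selective
            last_error = error

    if on_failure == "fail":
        # Terminate the workflow run by propagating an error.
        raise RuntimeError(f"step failed after {attempts_allowed} attempt(s)") from last_error
    # Any other value: swallow the failure so the workflow can continue.
    return None


# Hypothetical step that fails twice before succeeding.
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ValueError("transient error")
    return "ok"


result = run_with_retries(flaky, retries=2)
print(result, calls["count"])  # ok 3
```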

Class-based steps

Class-based steps (as the name implies) instantiate a class object and then (optionally) call a method of that class. Class-based steps have the following additional attributes:

{
    "class": "<workbench-class>",
    "instance_params": { <workbench-class-keyword-arguments> },
    "method": "<class-method-name>",
    "call_params": { <class-method-keyword-arguments> }
}

Attribute	Required	Type	Description
`class`	Yes	String	The name of the Workbench class to instantiate. Must use dot notation to direct to the necessary Workbench sub-package, for example "bindings.DocumentVersion".
`instance_params`	No	Dictionary	A dictionary of key-value pairs denoting the keyword arguments to be passed to the class constructor. When referencing context variables, the variable key / name must be preceded by a "$".
`method`	No	String	The name of the class method to be called. Note: if no method is invoked and no `"output"` variable name is set then step execution will return nothing.
`call_params`	No	Dictionary	A dictionary of key-value pairs denoting the keyword arguments to be passed when calling the method. When referencing context variables, the variable key / name must be preceded by a "$".
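
The Python sketch below shows one way a class-based step could be executed: resolve the dotted class path, instantiate the class with `instance_params`, optionally call `method` with `call_params`, and store the result under `output`. The function names are hypothetical, the standard-library class collections.Counter stands in for a Workbench class, and only top-level "$name" references are handled here. The way Workbench actually locates classes in its sub-packages is not shown in this guide.

```python
import importlib


def resolve_params(params, context):
    """Swap "$name" strings for context values (top-level names only in this sketch)."""
    return {
        key: context[value[1:]] if isinstance(value, str) and value.startswith("$") else value
        for key, value in params.items()
    }


def run_class_step(step, context):
    """Instantiate the step's class, optionally call a method, store the output."""
    module_path, _, class_name = step["class"].rpartition(".")
    cls = getattr(importlib.import_module(module_path), class_name)
    instance = cls(**resolve_params(step.get("instance_params", {}), context))

    result = instance  # assumption: with no method, the instance itself is the result
    if "method" in step:
        method = getattr(instance, step["method"])
        result = method(**resolve_params(step.get("call_params", {}), context))

    if "output" in step:
        context[step["output"]] = result
    return result


# Stand-in example: collections.Counter plays the role of a Workbench class.
context = {"limit": 1}
step = {
    "name": "Count votes",
    "class": "collections.Counter",
    "instance_params": {"yes": 3, "no": 1},
    "method": "most_common",
    "call_params": {"n": "$limit"},
    "output": "top_vote",
}
run_class_step(step, context)
print(context["top_vote"])  # [('yes', 3)]
```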

Variable Reference Syntax

Context variables in instance_params and call_params support both dot notation and bracket notation:

Syntax	Example	Use Case
Dot notation	`$user.profile.name`	Standard attribute/key access
Bracket notation	`$metadata[field-1]`	Keys with special characters (hyphens, spaces)
Quoted bracket	`$metadata['Part-2']`	Explicit string keys (single or double quotes)
Variable bracket	`$metadata[$classifier_id]`	Dynamic key from another context variable
Mixed	`$session.properties[Part-2].code`	Combining styles

{
    "call_params": {
        "simple_ref": "$connection.session",
        "bracket_ref": "$metadata_spec[Part-2]",
        "quoted_ref": "$metadata_spec['Part-2']",
        "dynamic_ref": "$metadata_spec[$classifier_id]",
        "mixed_ref": "$session.metadata[field-name].value"
    }
}

Dynamic Keys

Use [$variable] syntax when the key itself comes from another context variable. For example, if classifier_id contains "Part-2", then $metadata[$classifier_id] resolves to $metadata["Part-2"].
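
To make the reference styles above concrete, here is a minimal Python sketch of a resolver that handles dot, bracket, quoted bracket, variable bracket and mixed references. It is an illustration under stated assumptions, not the Workbench implementation: resolve_reference and its helpers are hypothetical names, mappings are read by key and all other objects by attribute, and only a plain variable name is supported inside [$...].

```python
import re
from types import SimpleNamespace

# One path segment after the leading "$name":
# .attr, [$variable], ['key'] or ["key"], or [bare-key].
_SEGMENT = re.compile(
    r"""
      \.(?P<attr>[A-Za-z_]\w*)
    | \[\$(?P<var>[A-Za-z_]\w*)\]
    | \[(?P<quote>['"])(?P<quoted>.*?)(?P=quote)\]
    | \[(?P<bare>[^\]]+)\]
    """,
    re.VERBOSE,
)


def _lookup(container, key):
    """Use item access for mappings and attribute access for everything else."""
    if isinstance(container, dict):
        return container[key]
    return getattr(container, key)


def resolve_reference(ref, context):
    """Resolve a "$..." reference against the context dictionary."""
    head = re.match(r"\$([A-Za-z_]\w*)", ref)
    if head is None:
        raise ValueError(f"not a context reference: {ref!r}")
    value = context[head.group(1)]
    pos = head.end()
    while pos < len(ref):
        segment = _SEGMENT.match(ref, pos)
        if segment is None:
            raise ValueError(f"cannot parse {ref!r} at position {pos}")
        if segment.group("attr") is not None:
            key = segment.group("attr")
        elif segment.group("var") is not None:
            key = context[segment.group("var")]  # dynamic key from another variable
        elif segment.group("quoted") is not None:
            key = segment.group("quoted")
        else:
            key = segment.group("bare")
        value = _lookup(value, key)
        pos = segment.end()
    return value


# Hypothetical context mixing objects and dictionaries.
context = {
    "user": SimpleNamespace(profile=SimpleNamespace(name="Ada")),
    "metadata": {"field-1": 7, "Part-2": {"code": "P2"}},
    "classifier_id": "Part-2",
}
print(resolve_reference("$user.profile.name", context))         # Ada
print(resolve_reference("$metadata[field-1]", context))         # 7
print(resolve_reference("$metadata['Part-2'].code", context))   # P2
print(resolve_reference("$metadata[$classifier_id]", context))  # {'code': 'P2'}
```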

Example: Generating a signed URL

This example uses the Workbench Azure Blob Storage client (which must be instantiated with organization, workspace and session arguments) to generate a signed URL to a blob.

{
    "name": "Get signed URL",
    "class": "clients.AzureBlobStorageClient",
    "depends_on": [
        "Concatenate blob_name"
    ],
    "method": "get_signed_url",
    "instance_params": {
        "organization": "$organization",
        "workspace": "$workspace",
        "session_id": "$session_id"
    },
    "call_params": {
        "blob_name": "$blob_name"
    },
    "output": "signed_url"
}

Template-based steps

Template-based steps execute a Jinja template. This template can range in complexity anywhere from string concatenation through to method invocation. Template-based steps have the following additional attributes:

{
    "template": "{{<step-template>}}"
}

Attribute	Required	Type	Description
`template`	Yes	String	The Jinja template to evaluate.

Example: Initializing a session

This step will run only if a doc_version context variable (referencing an instance of the DocumentVersion class) is not already set. It will run after an earlier step titled 'Connect to session'.

{
    "name": "Get session documents",
    "when": "{{doc_version is not defined}}",
    "template": "{{session.initialize()}}",
    "depends_on": [
        "Connect to session"
    ]
}
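
A dependency such as the one declared in the example above can be turned into an execution order with a topological sort. The Python sketch below uses the standard-library graphlib module. It is only an illustration: execution_order is a hypothetical helper, and the guide does not say which algorithm the runner uses.

```python
from graphlib import TopologicalSorter


def execution_order(steps):
    """Return step names in an order that satisfies every depends_on entry."""
    names = {step["name"] for step in steps}
    sorter = TopologicalSorter()
    for step in steps:
        depends_on = step.get("depends_on", [])
        if isinstance(depends_on, str):  # a single reference may be given as a string
            depends_on = [depends_on]
        unknown = set(depends_on) - names
        if unknown:
            # Steps can only depend on steps in the same stage.
            raise ValueError(f"{step['name']!r} depends on unknown step(s): {sorted(unknown)}")
        sorter.add(step["name"], *depends_on)
    return list(sorter.static_order())  # raises graphlib.CycleError on circular dependencies


# Step names borrowed from the examples in this guide.
steps = [
    {"name": "Get session documents", "depends_on": ["Connect to session"]},
    {"name": "Connect to session"},
    {"name": "Get signed URL", "depends_on": "Concatenate blob_name"},
    {"name": "Concatenate blob_name"},
]
order = execution_order(steps)
print(order)
```

Steps with no dependency between them may appear in any relative order, which is what allows them to run independently of each other.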

Iterative Stages

Stages can iterate over collections using the for_each attribute. This enables processing multiple items (such as rows in a spreadsheet) with a single stage definition.

Basic Iteration

{
  "name": "Process each sheet",
  "for_each": [
    { "over": "$document_version.sheets", "as": "current_sheet" }
  ],
  "steps": [
    {
      "name": "Analyze sheet",
      "class": "processors.analysers.SheetAnalyzer",
      "method": "run",
      "call_params": {
        "sheet": "$current_sheet"
      },
      "output": "analysis_result"
    }
  ]
}

Nested Iteration

Multiple iteration levels can be specified. The order of the array determines the nesting: the first entry is the outermost loop.

{
  "name": "Process all rows",
  "for_each": [
    { "over": "$document_version.sheets", "as": "current_sheet" },
    { "over": "$current_sheet.rows", "as": "current_row" }
  ],
  "steps": [
    {
      "name": "Process row",
      "class": "processors.RowProcessor",
      "method": "run",
      "call_params": {
        "sheet": "$current_sheet",
        "row": "$current_row"
      },
      "output": "row_result"
    }
  ]
}
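
The nesting behaviour can be sketched in Python as a recursive expansion in which each inner "over" reference is resolved against a scope that already contains the outer aliases. The function names and sample data are hypothetical, and only simple dot references are handled in this sketch.

```python
def resolve(ref, scope):
    """Resolve a simple "$name.attr.attr" reference against a dict scope."""
    name, *path = ref.lstrip("$").split(".")
    value = scope[name]
    for part in path:
        value = value[part] if isinstance(value, dict) else getattr(value, part)
    return value


def expand(for_each, scope, indices=()):
    """Yield (indices, scope) for every combination, outermost loop first."""
    if not for_each:
        yield indices, scope
        return
    level, *inner = for_each
    for index, item in enumerate(resolve(level["over"], scope)):
        # The alias becomes visible to inner levels and to the stage's steps.
        child_scope = {**scope, level["as"]: item}
        yield from expand(inner, child_scope, indices + (index,))


# Hypothetical document with two sheets.
context = {
    "document_version": {
        "sheets": [
            {"rows": ["a", "b"]},
            {"rows": ["c"]},
        ]
    }
}
for_each = [
    {"over": "$document_version.sheets", "as": "current_sheet"},
    {"over": "$current_sheet.rows", "as": "current_row"},
]
visited = [(indices, scope["current_row"]) for indices, scope in expand(for_each, context)]
print(visited)  # [((0, 0), 'a'), ((0, 1), 'b'), ((1, 0), 'c')]
```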

Iteration Attributes

Attribute	Required	Type	Description
`over`	Yes	String	Reference to an iterable using `$variable` notation.
`as`	Yes	String	Variable name for the current item in each iteration.
`when`	No	String	Jinja condition used to filter items. Items for which the condition is falsy are skipped.

Iteration Context

Within iterative steps, an $iteration object is available with metadata:

Property	Type	Description
`indices`	Tuple	Current position at each iteration level, e.g., `(0, 2)`
`depth`	Integer	Number of iteration levels
`aliases`	Array	Variable names for current items at each level

Example usage in a template:

{
  "name": "Log position",
  "template": "Processing item at position {{ iteration.indices }}"
}
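
As a small illustration of the iteration object, the Python sketch below models the three properties and renders the same message as the template above. The Iteration class is hypothetical, str.format stands in for Jinja rendering, and deriving depth from the number of indices is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Iteration:
    indices: tuple  # position at each level, outermost first
    aliases: list   # alias declared by "as" at each level

    @property
    def depth(self) -> int:
        # Assumption: one index per iteration level.
        return len(self.indices)


# Third row of the first sheet in a two-level iteration.
iteration = Iteration(indices=(0, 2), aliases=["current_sheet", "current_row"])
message = "Processing item at position {iteration.indices}".format(iteration=iteration)
print(message)          # Processing item at position (0, 2)
print(iteration.depth)  # 2
```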

Conditional Iteration

Filter items using when conditions at the iteration level:

{
  "for_each": [
    {
      "over": "$document_version.sheets",
      "as": "current_sheet",
      "when": "{{ current_sheet.row_count > 0 }}"
    }
  ]
}

A when condition can reference global context variables and, in nested iterations, the aliases of outer loops.

Output Collection

Step outputs from iterative stages are collected into dictionaries keyed by index tuples:

# After stage completion, context contains:
context["row_result"] = {
    (0, 0): <result for sheet 0, row 0>,
    (0, 1): <result for sheet 0, row 1>,
    (1, 0): <result for sheet 1, row 0>,
    # ...
}

Subsequent stages can access these collected results using the output variable name.
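
For example, a later stage might regroup the collected results by their outer index. The Python sketch below assumes the dictionary layout shown above. The result values and the helper name group_by_outer_index are made up for illustration.

```python
from collections import defaultdict

# Made-up results, keyed by (sheet index, row index) as described above.
context = {"row_result": {(0, 0): 10, (0, 1): 5, (1, 0): 7}}


def group_by_outer_index(collected):
    """Group results by the first (outermost) index, keeping inner order."""
    grouped = defaultdict(list)
    for indices in sorted(collected):
        grouped[indices[0]].append(collected[indices])
    return dict(grouped)


per_sheet = group_by_outer_index(context["row_result"])
print(per_sheet)  # {0: [10, 5], 1: [7]}
```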

Parallel Execution

Iterations execute in parallel using the runner's max_workers setting. Step dependencies (depends_on) are respected within each iteration independently: an iteration's Step B waits for that iteration's Step A, not for Step A from other iterations.
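
The Python sketch below illustrates this behaviour with a thread pool: each iteration runs its own steps in order, while separate iterations run concurrently. It is a simplified model under stated assumptions. The function names are hypothetical, steps are plain callables listed in dependency order, and the guide does not say whether the runner uses threads or processes.

```python
from concurrent.futures import ThreadPoolExecutor


def run_iteration(indices, item, steps):
    """Run one iteration's steps in order; each step sees that iteration's earlier results."""
    scope = {"item": item}
    for name, func in steps:
        scope[name] = func(scope)
    return indices, scope


def run_parallel(items, steps, max_workers=4):
    """Run every iteration in a worker pool and collect results by index tuple."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(run_iteration, (index,), item, steps)
            for index, item in enumerate(items)
        ]
        return dict(future.result() for future in futures)


# step_b depends on step_a from the same iteration only.
steps = [
    ("step_a", lambda scope: scope["item"] * 2),
    ("step_b", lambda scope: scope["step_a"] + 1),
]
results = run_parallel([1, 2, 3], steps, max_workers=2)
print(results[(2,)]["step_b"])  # 7
```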