Basic workflows
Defining workflows
Workbench offers basic constructor classes for chaining together processing units and testing or debugging simple workflows:
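from workbench.workflows import Workflow, Stage, Job

# BasicMetadataExtraction and StandardAnalysis are job classes that must already
# be defined or imported; this example does not show where they come from.
workflow = Workflow("Workflow Example")
# Stages and jobs are registered under their names and looked up by name later.
workflow.add_stage(Stage("Extract"))
workflow.stages["Extract"].add_job(BasicMetadataExtraction("Extract metadata"))
workflow.add_stage(Stage("Analyze"))
workflow.stages["Analyze"].add_job(StandardAnalysis("Execute IM standard"))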
Running workflows
Which execution method to use depends on whether you need to run the entire workflow, a single stage, or an individual job.
# The three calls below are alternatives; use the one that matches the scope you need.
for doc_version in session.document_versions:
    # Run all workflow stages
    workflow.run_all_stages_sequentially(doc_version, session)
    # Run all jobs in a specific stage
    workflow.stages['Extract'].run_all_jobs_sequentially(doc_version, session)
    # Run a specific job
    workflow.stages['Analyze'].jobs['Execute IM standard'].run(doc_version, session)
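To make the three levels of execution concrete, here is a minimal stand-in model. It is not Workbench's actual implementation. It assumes that stages and jobs are kept in name-keyed dictionaries and that running "sequentially" simply means iterating in insertion order. The classes SketchWorkflow, SketchStage, and SketchJob, the plain-dict session, and the "doc-v…" strings are hypothetical names invented for this illustration.

```python
# Hypothetical stand-in classes: they only mimic the call shapes used above
# and are NOT the real workbench.workflows implementation.


class SketchJob:
    def __init__(self, name):
        self.name = name

    def run(self, doc_version, session, **kwargs):
        # A real job would process the document; here we only record the call.
        session.setdefault("log", []).append((self.name, doc_version))


class SketchStage:
    def __init__(self, name):
        self.name = name
        self.jobs = {}  # job name -> job; dicts preserve insertion order

    def add_job(self, job):
        self.jobs[job.name] = job

    def run_all_jobs_sequentially(self, doc_version, session):
        for job in self.jobs.values():
            job.run(doc_version, session)


class SketchWorkflow:
    def __init__(self, name):
        self.name = name
        self.stages = {}  # stage name -> stage

    def add_stage(self, stage):
        self.stages[stage.name] = stage

    def run_all_stages_sequentially(self, doc_version, session):
        for stage in self.stages.values():
            stage.run_all_jobs_sequentially(doc_version, session)


def build_sketch():
    wf = SketchWorkflow("Workflow Example")
    wf.add_stage(SketchStage("Extract"))
    wf.stages["Extract"].add_job(SketchJob("Extract metadata"))
    wf.add_stage(SketchStage("Analyze"))
    wf.stages["Analyze"].add_job(SketchJob("Execute IM standard"))
    return wf


sketch_session = {}  # a plain dict stands in for the session object
sketch = build_sketch()

# Whole workflow, one stage, one job: each call narrows the scope.
sketch.run_all_stages_sequentially("doc-v1", sketch_session)
sketch.stages["Extract"].run_all_jobs_sequentially("doc-v2", sketch_session)
sketch.stages["Analyze"].jobs["Execute IM standard"].run("doc-v3", sketch_session)
```

After these three calls, sketch_session["log"] holds one entry per job execution, which makes it easy to see which jobs each call reached.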
Example: Running a job task for a single document
When running a job on its own, you can pass keyword arguments to its run() method. This is not possible when running all jobs in a stage or all stages in a workflow, because each job may expect different keyword arguments.
classifiers = {}
for classifier in session.classifiers.values():
    if classifier['id'] == 'ISO1':
        classifiers['ISO1'] = classifier

doc_version = session.document_versions[1]
# The job can be run using the job.run(*args, **kwargs) method or called from the workflow object.
workflow.stages['Analyze'].run_job('Execute IM standard', doc_version, session, classifiers=classifiers)
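The sketch below illustrates why keyword arguments only make sense for a single job. It is a hypothetical illustration, not Workbench code: MetadataJob, ClassifierJob, and run_job_by_name are invented names, and it assumes each job's run() declares its own keyword parameters.

```python
# Hypothetical jobs with different keyword parameters (illustration only).


class MetadataJob:
    def run(self, doc_version, session, fields=None):
        return {"doc": doc_version, "fields": fields or ["title"]}


class ClassifierJob:
    def run(self, doc_version, session, classifiers=None):
        return {"doc": doc_version, "classifiers": sorted(classifiers or {})}


jobs = {
    "Extract metadata": MetadataJob(),
    "Execute IM standard": ClassifierJob(),
}


def run_job_by_name(name, doc_version, session, **kwargs):
    # Forward the keyword arguments to exactly one job, which understands them.
    return jobs[name].run(doc_version, session, **kwargs)


result = run_job_by_name(
    "Execute IM standard", "doc-v1", {}, classifiers={"ISO1": {"id": "ISO1"}}
)

# Passing the same keywords to every job fails as soon as one job does not
# accept them, which is why a run-everything call cannot take them.
try:
    for job in jobs.values():
        job.run("doc-v1", {}, classifiers={"ISO1": {"id": "ISO1"}})
    broadcast_failed = False
except TypeError:
    broadcast_failed = True
```

In this sketch the single-job call succeeds, while the loop over all jobs raises a TypeError on the job that does not declare a classifiers parameter.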