Error Handling in Workflows
Robust error handling is essential for production workflows. Mastra provides several mechanisms for handling errors, so your workflows can recover from failures or degrade gracefully when recovery is not possible.
Overview
Error handling in Mastra workflows can be implemented using:
- Step Retries - Automatically retry failed steps
- Conditional Branching - Create alternative paths based on step success or failure
- Error Monitoring - Watch workflows for errors and handle them programmatically
- Result Status Checks - Check the status of previous steps in subsequent steps
Step Retries
Mastra provides a built-in retry mechanism for steps that fail due to transient errors. This is particularly useful for steps that interact with external services or resources that might experience temporary unavailability.
Basic Retry Configuration
You can configure retries at the workflow level or for individual steps:
// Workflow-level retry configuration
const workflow = new Workflow({
  name: 'my-workflow',
  retryConfig: {
    attempts: 3, // Number of retry attempts
    delay: 1000, // Delay between retries in milliseconds
  },
});
// Step-level retry configuration (overrides workflow-level)
const apiStep = new Step({
  id: 'callApi',
  execute: async () => {
    // API call that might fail
  },
  retryConfig: {
    attempts: 5, // This step will retry up to 5 times
    delay: 2000, // With a 2-second delay between retries
  },
});
For more details about step retries, see the Step Retries reference.
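To make the two settings concrete, here is a minimal sketch of retry-with-delay behavior in plain TypeScript. It is not Mastra's implementation: `withRetry`, `RetryOptions`, and `sleep` are hypothetical names, and the sketch assumes that `attempts` counts the retries allowed after the first try.

```typescript
// Hypothetical helper illustrating retry-with-delay; not part of Mastra.
interface RetryOptions {
  attempts: number; // assumption: retries allowed after the first try
  delay: number; // milliseconds to wait between tries
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

async function withRetry<T>(
  operation: () => Promise<T>,
  { attempts, delay }: RetryOptions,
): Promise<T> {
  let lastError: unknown;
  // One initial try plus up to `attempts` retries.
  for (let attempt = 0; attempt <= attempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      // Wait only if another try is still allowed.
      if (attempt < attempts) {
        await sleep(delay);
      }
    }
  }
  // Every try failed: surface the most recent error.
  throw lastError;
}
```

A fixed delay is the simplest policy. For services that throttle callers, an increasing delay between tries may be a better fit.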
Conditional Branching
You can create alternative workflow paths based on the success or failure of previous steps using conditional logic:
// Create a workflow with conditional branching
const workflow = new Workflow({
  name: 'error-handling-workflow',
});
workflow
  .step(fetchDataStep)
  .then(processDataStep, {
    // Only execute processDataStep if fetchDataStep was successful
    when: ({ context }) => {
      return context.steps.fetchDataStep?.status === 'success';
    },
  })
  .then(fallbackStep, {
    // Execute fallbackStep if fetchDataStep failed
    when: ({ context }) => {
      return context.steps.fetchDataStep?.status === 'failed';
    },
  })
  .commit();
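When several branches test the status of the same step, the predicates can be generated rather than repeated. The sketch below is one way to do that. `whenStepStatus` and `WhenArgs` are hypothetical names, and the context shape is an assumption inferred from the example above, not a Mastra type.

```typescript
// Assumed minimal shape of the argument passed to `when`,
// inferred from the branching example; not an official Mastra type.
interface WhenArgs {
  context: {
    steps: Record<string, { status: string } | undefined>;
  };
}

// Hypothetical factory: builds a `when` predicate that passes only if
// the named step finished with the expected status.
function whenStepStatus(stepId: string, expected: string) {
  return ({ context }: WhenArgs): boolean =>
    context.steps[stepId]?.status === expected;
}

// Intended use inside a workflow definition (illustrative only):
//   .then(processDataStep, { when: whenStepStatus('fetchDataStep', 'success') })
//   .then(fallbackStep, { when: whenStepStatus('fetchDataStep', 'failed') })
```

A step that has not run yet has no entry in `context.steps`, so both predicates return false for it.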
Error Monitoring
You can monitor workflows for errors using the watch method:
const { start, watch } = workflow.createRun();
watch(async ({ context, activePaths }) => {
  // Check for any failed steps
  const failedSteps = Object.entries(context.steps)
    .filter(([_, step]) => step.status === 'failed')
    .map(([stepId]) => stepId);
  if (failedSteps.length > 0) {
    console.error(`Workflow has failed steps: ${failedSteps.join(', ')}`);
    // Take remedial action, such as alerting or logging
  }
});
await start();
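If the watch callback fires more than once during a run (an assumption about how often Mastra invokes it), a check like the one above would report the same failed step on every invocation. The sketch below reports each failure once. `createFailureTracker` and `WatchedContext` are hypothetical names, and the context shape is reduced to the fields the example above reads.

```typescript
// Assumed shape of the watched context, limited to what is read here.
interface WatchedContext {
  steps: Record<string, { status: string }>;
}

// Hypothetical helper: returns a function that, given the latest context,
// yields only the failed step IDs it has not reported before.
function createFailureTracker(): (context: WatchedContext) => string[] {
  const alreadyReported = new Set<string>();
  return (context) => {
    const newlyFailed = Object.entries(context.steps)
      .filter(
        ([stepId, step]) =>
          step.status === "failed" && !alreadyReported.has(stepId),
      )
      .map(([stepId]) => stepId);
    newlyFailed.forEach((stepId) => alreadyReported.add(stepId));
    return newlyFailed;
  };
}
```

Create one tracker per run, call it inside the watch callback with the received context, and alert only when the returned array is non-empty.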
Handling Errors in Steps
Within a step’s execution function, you can handle errors programmatically:
const robustStep = new Step({
  id: 'robustStep',
  execute: async ({ context }) => {
    try {
      // Attempt the primary operation
      const result = await someRiskyOperation();
      return { success: true, data: result };
    } catch (error) {
      // Log the error
      console.error('Operation failed:', error);
      // Return a graceful fallback result instead of throwing
      return {
        success: false,
        // A caught value is not guaranteed to be an Error instance
        error: error instanceof Error ? error.message : String(error),
        fallbackData: 'Default value',
      };
    }
  },
});
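The same pattern can be factored into a reusable wrapper so that every step returns a result of the same shape. This is a sketch under stated assumptions: `StepResult` and `runWithFallback` are illustrative names, not Mastra types or functions.

```typescript
// Hypothetical result shape: a discriminated union keyed on `success`.
type StepResult<T> =
  | { success: true; data: T }
  | { success: false; error: string; fallbackData: T };

// Hypothetical wrapper: runs the operation and converts a thrown error
// into a structured failure result carrying the supplied fallback.
async function runWithFallback<T>(
  operation: () => Promise<T>,
  fallbackData: T,
): Promise<StepResult<T>> {
  try {
    return { success: true, data: await operation() };
  } catch (error) {
    // Caught values are `unknown`, so narrow before reading `.message`.
    const message = error instanceof Error ? error.message : String(error);
    return { success: false, error: message, fallbackData };
  }
}
```

Because the union is keyed on `success`, a downstream step that checks `result.success` gets type-safe access to either `data` or `fallbackData`. Note that a step that swallows its error this way reports success to the workflow, so retries and failure branches will not trigger for it.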
Checking Previous Step Results
You can make decisions based on the results of previous steps:
const finalStep = new Step({
  id: 'finalStep',
  execute: async ({ context }) => {
    // Check results of previous steps
    const step1Success = context.steps.step1?.status === 'success';
    const step2Success = context.steps.step2?.status === 'success';
    if (step1Success && step2Success) {
      // All steps succeeded
      return { status: 'complete', result: 'All operations succeeded' };
    } else if (step1Success) {
      // Only step1 succeeded
      return { status: 'partial', result: 'Partial completion' };
    } else {
      // Critical failure
      return { status: 'failed', result: 'Critical steps failed' };
    }
  },
});
Best Practices for Error Handling
- Use retries for transient failures: Configure retry policies for steps that might experience temporary issues.
- Provide fallback paths: Design workflows with alternative paths for when critical steps fail.
- Be specific about error scenarios: Use different handling strategies for different types of errors.
- Log errors comprehensively: Include context information when logging errors to aid in debugging.
- Return meaningful data on failure: When a step fails, return structured data about the failure to help downstream steps make decisions.
- Consider idempotency: Ensure steps can be safely retried without causing duplicate side effects.
- Monitor workflow execution: Use the watch method to actively monitor workflow execution and detect errors early.
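The idempotency point is the easiest to get wrong once retries are enabled, so here is a minimal sketch of one approach: keying each side effect on an idempotency key. `IdempotencyGuard` is a hypothetical class, and the in-memory `Map` stands in for what would need to be a durable store in production.

```typescript
// Hypothetical in-memory guard; a real system would use a durable store.
class IdempotencyGuard<T> {
  private readonly results = new Map<string, Promise<T>>();

  // Runs `effect` at most once per key. Repeat calls with the same key
  // receive the stored result instead of triggering the effect again.
  run(key: string, effect: () => Promise<T>): Promise<T> {
    const existing = this.results.get(key);
    if (existing) {
      return existing;
    }
    const pending = effect().catch((error) => {
      // Forget failed attempts so that a retry can run the effect again.
      this.results.delete(key);
      throw error;
    });
    this.results.set(key, pending);
    return pending;
  }
}
```

With a key derived from the business operation (for example, an order ID), a retried step that already succeeded returns the stored result, while a step that failed is free to try again.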
Advanced Error Handling
For more complex error handling scenarios, consider:
- Implementing circuit breakers: If a step fails repeatedly, stop retrying and use a fallback strategy
- Adding timeout handling: Set time limits for steps to prevent workflows from hanging indefinitely
- Creating dedicated error recovery workflows: For critical workflows, create separate recovery workflows that can be triggered when the main workflow fails
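The first two ideas can be sketched in plain TypeScript. Neither helper below is part of Mastra: `withTimeout` and `CircuitBreaker` are hypothetical names, and the breaker is deliberately simplified (it counts consecutive failures and never resets on its own, whereas a production breaker would usually reopen after a cool-down period).

```typescript
// Hypothetical helper: rejects if `operation` does not settle within `ms`.
function withTimeout<T>(operation: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  // Whichever settles first wins; the timer is always cleared afterwards.
  return Promise.race([operation, timeout]).finally(() => clearTimeout(timer));
}

// Hypothetical, simplified circuit breaker: after `failureThreshold`
// consecutive failures it stops calling the operation and goes straight
// to the fallback.
class CircuitBreaker {
  private consecutiveFailures = 0;
  private readonly failureThreshold: number;

  constructor(failureThreshold: number) {
    this.failureThreshold = failureThreshold;
  }

  get isOpen(): boolean {
    return this.consecutiveFailures >= this.failureThreshold;
  }

  async call<T>(operation: () => Promise<T>, fallback: () => T): Promise<T> {
    if (this.isOpen) {
      return fallback();
    }
    try {
      const result = await operation();
      this.consecutiveFailures = 0; // any success closes the streak
      return result;
    } catch {
      this.consecutiveFailures++;
      return fallback();
    }
  }
}
```

The two compose naturally: passing `() => withTimeout(callService(), 5000)` as the breaker's operation (where `callService` is a placeholder for your own function) makes a hung call count as a failure. Note that `withTimeout` stops waiting but does not cancel the underlying work.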