Quick Start
How It Works
| Component | Purpose |
|---|---|
| max_retries | Maximum number of retry attempts |
| retry_delay | Initial delay between retries (seconds) |
| max_retry_delay | Cap on exponential backoff delay |
| retry_count | Current retry attempt counter |
Retry Configuration
Task-Level Settings
Each task can configure its own retry behavior:
| Field | Type | Default | Description |
|---|---|---|---|
| max_retries | int | 3 | Maximum retry attempts |
| retry_delay | float | 1.0 | Initial delay in seconds |
| max_retry_delay | float | 300.0 | Maximum delay cap (5 minutes) |
| retry_count | int | 0 | Current retry counter (read-only) |
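As a rough illustration of the fields and defaults in the table, the settings could be modeled as below. The class name `RetrySettings` and the validation rules are assumptions made for this sketch, not the library's actual API; only the field names and default values come from the table.

```python
from dataclasses import dataclass, field


@dataclass
class RetrySettings:  # hypothetical name, used only for illustration
    max_retries: int = 3            # maximum retry attempts
    retry_delay: float = 1.0        # initial delay in seconds
    max_retry_delay: float = 300.0  # delay cap in seconds (5 minutes)
    # Read-only from the caller's point of view: excluded from __init__
    # so that only the orchestrator would update it.
    retry_count: int = field(default=0, init=False)

    def __post_init__(self) -> None:
        # Assumed sanity checks; the real system may validate differently.
        if self.max_retries < 0:
            raise ValueError("max_retries must be >= 0")
        if self.retry_delay < 0 or self.max_retry_delay < self.retry_delay:
            raise ValueError("delays must satisfy 0 <= retry_delay <= max_retry_delay")


defaults = RetrySettings()
custom = RetrySettings(max_retries=5, retry_delay=2.0, max_retry_delay=60.0)
```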
Exponential Backoff Formula
Retry delays follow exponential backoff with a maximum cap: delay = min(retry_delay * 2^(n-1), max_retry_delay), where n is the retry number.
For example, with retry_delay=2.0 and max_retry_delay=60:
- Retry 1: min(2 * 2^0, 60) = 2 seconds
- Retry 2: min(2 * 2^1, 60) = 4 seconds
- Retry 3: min(2 * 2^2, 60) = 8 seconds
- Retry 4: min(2 * 2^3, 60) = 16 seconds
- Retry 5: min(2 * 2^4, 60) = 32 seconds
- Retry 6: min(2 * 2^5, 60) = 60 seconds (capped)
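The worked example above can be reproduced with a few lines of Python. The function name `backoff_delay` is an assumption for this sketch; the arithmetic follows the per-retry calculations listed above.

```python
def backoff_delay(retry_number: int, retry_delay: float, max_retry_delay: float) -> float:
    """Delay in seconds before the given retry (1-based), capped at max_retry_delay."""
    return min(retry_delay * 2 ** (retry_number - 1), max_retry_delay)


# Same inputs as the worked example: retry_delay=2.0, max_retry_delay=60
schedule = [backoff_delay(n, 2.0, 60.0) for n in range(1, 7)]
print(schedule)  # [2.0, 4.0, 8.0, 16.0, 32.0, 60.0]
```

The sixth value would be 64 seconds without the cap, which is why it is clamped to 60.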
Failed Task Propagation
When a task fails after exhausting retries, dependent tasks are automatically skipped:
Failure Propagation Flow
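One way to picture the skipping of dependents is a breadth-first walk over the dependency graph, starting from the task that failed. This is a minimal sketch under assumed names (`propagate_failure`, the `dependents` mapping, and the task names are all hypothetical); the orchestrator's real data structures are not shown in this document.

```python
from collections import deque


def propagate_failure(failed: str, dependents: dict[str, list[str]]) -> set[str]:
    """Return every task that directly or transitively depends on `failed`."""
    skipped: set[str] = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, ()):
            if child not in skipped:
                skipped.add(child)
                queue.append(child)
    return skipped


# Hypothetical graph: task name -> tasks that depend on it
dependents = {
    "fetch": ["parse"],
    "parse": ["summarize", "index"],
    "audit": ["report"],  # independent chain
}

skipped = propagate_failure("fetch", dependents)
print(sorted(skipped))  # ['index', 'parse', 'summarize']
```

In this sketch the `audit` and `report` tasks are untouched, because nothing in their chain depends on `fetch`.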
Retry Decision Logic
The orchestrator follows this logic for each task execution:
Common Patterns
Fast Retry for Transient Errors
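For transient errors such as brief network blips, a short initial delay with a low cap keeps the task responsive. The values below are illustrative assumptions, and a plain dict stands in for whatever task configuration object the system actually uses.

```python
# Hypothetical settings for a task that calls a flaky but fast-recovering service
fast_retry = {"max_retries": 5, "retry_delay": 0.5, "max_retry_delay": 10.0}

# Delay before each retry, using the exponential backoff formula with a cap
fast_delays = [
    min(fast_retry["retry_delay"] * 2 ** n, fast_retry["max_retry_delay"])
    for n in range(fast_retry["max_retries"])
]
print(fast_delays)  # [0.5, 1.0, 2.0, 4.0, 8.0]
```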
Slow Retry for Rate Limits
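When the failure is a rate limit, retrying quickly tends to make things worse, so a longer initial delay is usually the better fit. The numbers here are assumptions chosen for illustration, again using a plain dict in place of the real task configuration.

```python
# Hypothetical settings for a task that calls a rate-limited API
slow_retry = {"max_retries": 5, "retry_delay": 30.0, "max_retry_delay": 300.0}

# Delay before each retry, using the exponential backoff formula with a cap
slow_delays = [
    min(slow_retry["retry_delay"] * 2 ** n, slow_retry["max_retry_delay"])
    for n in range(slow_retry["max_retries"])
]
print(slow_delays)  # [30.0, 60.0, 120.0, 240.0, 300.0]
```

The last delay would be 480 seconds uncapped; the 300-second cap keeps it at five minutes.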
No Retries for One-Shot Operations
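For operations that must not run twice (sending a notification, charging a card), retries can be turned off. The sketch below assumes that `max_retries=0` means "one attempt, no retries"; the loop and the `send_once` function are hypothetical stand-ins for the orchestrator and a real task.

```python
calls = []


def send_once() -> None:
    """Hypothetical one-shot operation that fails on its only attempt."""
    calls.append("sent")
    raise RuntimeError("downstream rejected the request")


max_retries = 0  # assumed to disable retries entirely

# One initial attempt plus max_retries retries
for attempt in range(1 + max_retries):
    try:
        send_once()
        break
    except RuntimeError:
        pass  # no retry budget left, so the task would be marked failed

print(len(calls))  # 1
```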
Chain with Failure Isolation
Migration from Hardcoded Delays
Before: All tasks used 1-second hardcoded delays
Best Practices
Match Retry Policy to Operation Type
Different operations need different retry strategies:
Design for Failure Propagation
Structure task dependencies to handle failures gracefully:
Monitor Retry Patterns
Track retry behavior to tune policies:
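A simple starting point is to summarize `retry_count` across finished tasks. The record layout and task names below are assumptions for this sketch; how the real system exposes completed tasks is not described here.

```python
from collections import Counter

# Hypothetical records, as they might be collected after a workflow run
finished_tasks = [
    {"name": "fetch", "retry_count": 0, "max_retries": 3},
    {"name": "parse", "retry_count": 2, "max_retries": 3},
    {"name": "call_llm", "retry_count": 3, "max_retries": 3},
    {"name": "store", "retry_count": 0, "max_retries": 3},
]

# How many tasks needed 0, 1, 2, ... retries
histogram = Counter(t["retry_count"] for t in finished_tasks)
# Tasks that retried at all, and tasks that used their whole retry budget
retried = [t["name"] for t in finished_tasks if t["retry_count"] > 0]
hit_limit = [t["name"] for t in finished_tasks if t["retry_count"] >= t["max_retries"]]

print(dict(histogram))  # {0: 2, 2: 1, 3: 1}
print(retried)          # ['parse', 'call_llm']
print(hit_limit)        # ['call_llm']
```

Tasks that regularly use their full budget are candidates for a larger `max_retries` or a longer `retry_delay`; tasks that never retry may not need a retry budget at all.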
Set Reasonable Maximum Delays
Avoid extremely long delays that block workflows:
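To judge whether a cap is reasonable, it helps to add up the worst-case time a task could spend waiting between retries. The helper name `worst_case_wait` is an assumption for this sketch; it sums the capped backoff delays and ignores the time taken by the attempts themselves.

```python
def worst_case_wait(max_retries: int, retry_delay: float, max_retry_delay: float) -> float:
    """Total seconds spent sleeping if every retry is used."""
    return sum(min(retry_delay * 2 ** n, max_retry_delay) for n in range(max_retries))


print(worst_case_wait(6, 2.0, 60.0))     # 122.0
print(worst_case_wait(10, 1.0, 300.0))   # 811.0
print(worst_case_wait(10, 1.0, 3600.0))  # 1023.0
```

With ten retries from a 1-second start, the 300-second cap trims the worst case from about 17 minutes to about 13.5; with more retries the gap widens quickly, since every capped retry adds the full cap to the total.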
Related
Structured LLM Errors
Handle LLM failure classification
Workflow Errors
Workflow-level error handling

