Quick Start
Works by default
Circuit breaker protection is automatically enabled for every tool call with zero configuration needed.
Detect open circuit
When a tool fails 5 times consecutively, subsequent calls return an error dictionary instead of calling the tool.
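A caller can check for the open-circuit case before using a tool result. The sketch below assumes the error dictionary carries a `circuit_open` key set to `True`, as the best practices on this page suggest. The helper names and the `retry_after` parameter are hypothetical and not part of the library.

```python
from typing import Any


def is_circuit_open(result: Any) -> bool:
    """Return True when a tool result looks like a circuit-open error dict."""
    # Assumption: the error dict exposes a boolean "circuit_open" key.
    return isinstance(result, dict) and result.get("circuit_open") is True


def describe_result(result: Any, retry_after: float = 60.0) -> str:
    """Turn a tool result into text that is safe to show to a user."""
    if is_circuit_open(result):
        # retry_after is a hypothetical hint; 60 s matches the default recovery_timeout.
        return f"Tool temporarily unavailable; retry in about {retry_after:.0f} seconds."
    return str(result)
```

Checking the type first keeps the helper safe for tools that return strings, lists, or other non-dict values.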
How It Works
| State | Behavior |
|---|---|
| CLOSED | Normal operation - all calls pass through |
| OPEN | Tool calls blocked - returns error dict |
| HALF_OPEN | Recovery mode - limited probe calls allowed |
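The transitions between these states can be illustrated with a small, self-contained state machine. This is a sketch under stated assumptions, not the library's implementation: the class name, the injectable `clock`, and the shape of the returned error dict are all hypothetical, and HALF_OPEN here lets every call through as a probe rather than limiting their number.

```python
import time
from enum import Enum
from typing import Any, Callable


class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half_open"


class SketchCircuitBreaker:
    """Illustrative state machine only; not the library's implementation."""

    def __init__(self, failure_threshold: int = 5, recovery_timeout: float = 60.0,
                 success_threshold: int = 2,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.success_threshold = success_threshold
        self.clock = clock          # injectable so tests can control time
        self.state = State.CLOSED
        self.failures = 0           # consecutive failures
        self.successes = 0          # consecutive successes while HALF_OPEN
        self.opened_at = 0.0

    def call(self, tool: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        if self.state is State.OPEN:
            if self.clock() - self.opened_at < self.recovery_timeout:
                # Still cooling down: block the call and return an error dict.
                return {"circuit_open": True, "error": "circuit breaker is open"}
            # Cooldown elapsed: switch to HALF_OPEN and allow probe calls.
            self.state = State.HALF_OPEN
            self.successes = 0
        try:
            result = tool(*args, **kwargs)
        except Exception as exc:
            self._on_failure()
            # Assumed graceful degradation: report the error instead of raising.
            return {"circuit_open": False, "error": str(exc)}
        self._on_success()
        return result

    def _on_failure(self) -> None:
        self.failures += 1
        # A failed probe, or too many consecutive failures, opens the circuit.
        if self.state is State.HALF_OPEN or self.failures >= self.failure_threshold:
            self.state = State.OPEN
            self.opened_at = self.clock()
            self.failures = 0

    def _on_success(self) -> None:
        if self.state is State.HALF_OPEN:
            self.successes += 1
            if self.successes >= self.success_threshold:
                self.state = State.CLOSED
        self.failures = 0
```

Passing a fake `clock` makes the recovery path testable without sleeping for the full timeout.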
Configuration Options
| Option | Type | Default | Description |
|---|---|---|---|
| failure_threshold | int | 5 | Consecutive failures before the circuit opens |
| recovery_timeout | float | 60.0 | Seconds before a half-open probe is allowed |
| success_threshold | int | 2 | Successes in half-open needed to close the circuit |
| timeout | float | 30.0 | Per-call timeout |
| monitor_window | float | 300.0 | Failure-rate window |
| enable_health_check | bool | True | Enable periodic health checks |
| health_check_interval | float | 30.0 | Health-check interval |
| graceful_degradation | bool | True | Return an error dict instead of raising |
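To make the options and their defaults concrete, the following sketch mirrors the table as a Python dataclass and shows one way per-tool overrides could be expressed. The class name, the validation rules, the `web_search` tool name, and the `config_for` helper are assumptions for illustration; the library's actual configuration object may look different.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class BreakerConfigSketch:
    """Hypothetical container mirroring the documented options and defaults."""
    failure_threshold: int = 5
    recovery_timeout: float = 60.0
    success_threshold: int = 2
    timeout: float = 30.0
    monitor_window: float = 300.0
    enable_health_check: bool = True
    health_check_interval: float = 30.0
    graceful_degradation: bool = True

    def __post_init__(self) -> None:
        # Assumed sanity checks; the documentation does not state valid ranges.
        if self.failure_threshold < 1 or self.success_threshold < 1:
            raise ValueError("thresholds must be at least 1")
        durations = (self.recovery_timeout, self.timeout,
                     self.monitor_window, self.health_check_interval)
        if min(durations) <= 0:
            raise ValueError("durations must be positive")


DEFAULTS = BreakerConfigSketch()

# Hypothetical override: a flaky tool trips sooner and rests longer.
PER_TOOL = {
    "web_search": replace(DEFAULTS, failure_threshold=3, recovery_timeout=120.0),
}


def config_for(tool_name: str) -> BreakerConfigSketch:
    """Return the tool-specific config, falling back to the defaults."""
    return PER_TOOL.get(tool_name, DEFAULTS)
```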
What Does NOT Trip the Breaker
Circuit breakers ignore certain error types to avoid false positives:
- Approval denied errors - User permission issues don’t indicate tool problems
- Permission denied errors - Access control failures aren’t tool failures
- Approval process errors - User workflow issues shouldn’t trigger circuit breaking
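The exclusion rule above amounts to classifying an error before counting it as a failure. A minimal sketch, assuming the ignored categories are represented as exception classes: `ApprovalDeniedError` and `ApprovalProcessError` are hypothetical names, while `PermissionError` is the Python built-in, used here only as a stand-in for permission failures.

```python
class ApprovalDeniedError(Exception):
    """Hypothetical: the user declined to approve the tool call."""


class ApprovalProcessError(Exception):
    """Hypothetical: the approval workflow itself failed."""


# Errors that reflect user or access-control decisions, not tool health.
IGNORED_ERRORS = (ApprovalDeniedError, PermissionError, ApprovalProcessError)


def counts_as_failure(exc: BaseException) -> bool:
    """Return True only for errors that should count toward opening the circuit."""
    return not isinstance(exc, IGNORED_ERRORS)
```

With this check in place, a run of denied approvals never opens the circuit, while timeouts and crashes still do.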
Common Patterns
- Observability
- Custom Config Per Tool
- Global Reset
Monitor circuit breaker health and statistics for debugging.
Best Practices
Don't disable in production
Circuit breakers prevent cascading failures and protect system stability. Keep them enabled in production environments to ensure reliable agent operation.
Monitor circuit breaker stats
Track circuit breaker statistics in your monitoring systems. Frequent openings indicate underlying tool reliability issues that need attention.
Reset between test runs
Call reset_all_circuit_breakers() in test teardown to ensure a clean state. This prevents failures recorded in one test from affecting subsequent tests.
Surface circuit_open to users
When handling circuit_open: true responses, provide clear user feedback about temporary tool unavailability and suggest retry timeframes or alternative approaches.
Related
Model Failover
Automatic LLM provider switching
Error Handling
Comprehensive error handling strategies

