Failover now integrates with LLM Error Classification through the new
FailoverDecision struct, which coordinates profile rotation with typed error handling.
Quick Start
How failover activates during retries
Failover now drives LLM retries through direct integration with the retry mechanism:
- On every LLM call, the system first gets the current profile via get_next_profile() and applies its api_key, base_url, and model settings
- On success, mark_success(profile) is called to track the working provider
- On failure, mark_failure(profile, error, is_rate_limit=...) marks the provider as failed, then get_next_profile() fetches the next available provider
- Profile switching overrides non-retryable classification: one extra attempt is always granted after switching providers
- The LLM automatically updates request parameters (api_key, base_url, model) when switching between profiles
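The retry flow described above can be sketched roughly as follows. This is a toy model, not the library's implementation: `Profile`, `SketchManager`, `call_with_failover`, and `fake_send` are hypothetical stand-ins, and only the method names `get_next_profile`, `mark_success`, and `mark_failure` come from the text.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Profile:
    # Hypothetical stand-in for AuthProfile; only the fields used here.
    name: str
    api_key: str
    base_url: str
    model: str


class SketchManager:
    """Toy stand-in for FailoverManager (assumed behavior, not the real class)."""

    def __init__(self, profiles: List[Profile]):
        self.profiles = profiles
        self.failed = set()
        self.last_good: Optional[str] = None

    def get_next_profile(self) -> Optional[Profile]:
        # Return the first profile that has not been marked as failed.
        for p in self.profiles:
            if p.name not in self.failed:
                return p
        return None

    def mark_success(self, profile: Profile) -> None:
        self.last_good = profile.name

    def mark_failure(self, profile: Profile, error: str, is_rate_limit: bool = False) -> None:
        self.failed.add(profile.name)


def call_with_failover(manager: SketchManager,
                       send: Callable[[Profile, str], str],
                       prompt: str) -> str:
    profile = manager.get_next_profile()
    while profile is not None:
        try:
            # The profile's api_key, base_url, and model are applied to the request.
            reply = send(profile, prompt)
        except RuntimeError as exc:
            manager.mark_failure(profile, str(exc), is_rate_limit="429" in str(exc))
            profile = manager.get_next_profile()  # switch providers, then try again
            continue
        manager.mark_success(profile)
        return reply
    raise RuntimeError("all profiles failed")


def fake_send(profile: Profile, prompt: str) -> str:
    # Simulated provider call: the primary profile is always rate limited.
    if profile.name == "primary":
        raise RuntimeError("429 rate limited")
    return f"{profile.model}: {prompt}"


manager = SketchManager([
    Profile("primary", "key-a", "https://primary.invalid/v1", "model-a"),
    Profile("backup", "key-b", "https://backup.invalid/v1", "model-b"),
])
result = call_with_failover(manager, fake_send, "hello")
```

In this sketch the first call fails on the primary profile, the manager records the failure, and the request is retried with the backup profile's settings.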
How It Works
| Component | Role |
|---|---|
| AuthProfile | Credentials for a single provider |
| FailoverManager | Orchestrates failover logic |
| FailoverConfig | Retry and backoff settings |
| ProviderStatus | Tracks provider health |
Configuration Options
- FailoverManager: Manager class reference
- AuthProfile: Provider credential profile
| Option | Type | Default | Description |
|---|---|---|---|
| max_retries | int | 3 | Maximum retry attempts |
| retry_delay | float | 1.0 | Initial retry delay |
| exponential_backoff | bool | True | Use exponential backoff |
| max_retry_delay | float | 60.0 | Maximum retry delay |
| cooldown_on_rate_limit | float | 60.0 | Rate limit cooldown (seconds) |
| cooldown_on_error | float | 30.0 | Error cooldown (seconds) |
| rotate_on_success | bool | False | Rotate profiles on success |
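To make the interaction between retry_delay, exponential_backoff, and max_retry_delay concrete, here is a minimal sketch of how a delay schedule could be derived from these options. The doubling schedule is an assumption (the exact growth factor is not stated here), and `BackoffSettings` and `delay_for_attempt` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class BackoffSettings:
    # Mirrors the defaults listed above; the class itself is hypothetical.
    max_retries: int = 3
    retry_delay: float = 1.0
    exponential_backoff: bool = True
    max_retry_delay: float = 60.0


def delay_for_attempt(cfg: BackoffSettings, attempt: int) -> float:
    """Delay before retry number `attempt` (0-based), assuming a doubling schedule."""
    if not cfg.exponential_backoff:
        # Fixed delay, still capped by the maximum.
        return min(cfg.retry_delay, cfg.max_retry_delay)
    # Assumed growth: retry_delay * 2^attempt, capped at max_retry_delay.
    return min(cfg.retry_delay * (2 ** attempt), cfg.max_retry_delay)


cfg = BackoffSettings()
delays = [delay_for_attempt(cfg, n) for n in range(cfg.max_retries)]
```

Under these assumptions the default settings produce delays of 1, 2, and 4 seconds, and the cap keeps later attempts from waiting longer than max_retry_delay.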
Auth Profiles
Configure credentials for each provider:
| Field | Type | Description |
|---|---|---|
| name | str | Unique profile identifier |
| provider | str | Provider: openai, anthropic, etc. |
| api_key | str | API key (masked in logs) |
| base_url | str | Custom API endpoint |
| model | str | Default model for this profile |
| priority | int | Failover priority (lower = higher priority) |
| rate_limit_rpm | int | Requests per minute limit |
| rate_limit_tpm | int | Tokens per minute limit |
| metadata | dict | Additional provider-specific config |
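As an illustration of how the priority field orders profiles, the sketch below models a profile with the fields listed above and sorts by priority. `ProfileSketch` is a hypothetical stand-in rather than the real AuthProfile class; the default values and the masking style are assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ProfileSketch:
    # Hypothetical stand-in using the field names listed above.
    name: str
    provider: str
    api_key: str
    base_url: Optional[str] = None
    model: Optional[str] = None
    priority: int = 0  # assumed default
    rate_limit_rpm: Optional[int] = None
    rate_limit_tpm: Optional[int] = None
    metadata: Dict[str, str] = field(default_factory=dict)

    def masked_key(self) -> str:
        # Assumed masking style: hide everything except the last four characters.
        return "*" * max(len(self.api_key) - 4, 0) + self.api_key[-4:]


def failover_order(profiles: List[ProfileSketch]) -> List[str]:
    # A lower priority value means the profile is tried earlier.
    return [p.name for p in sorted(profiles, key=lambda p: p.priority)]


profiles = [
    ProfileSketch("anthropic-backup", "anthropic", "sk-example-5678", priority=1),
    ProfileSketch("openai-main", "openai", "sk-example-1234", priority=0),
]
order = failover_order(profiles)
```

Here `openai-main` is tried first because its priority value is lower, regardless of the order in which the profiles were declared.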
Common Patterns
- Multi-Provider
- Cost Optimization
- Regional Failover
Failover Callbacks
React to failover events:
Provider Status
Monitor provider health:
Best Practices
Configure multiple providers
Configure at least 2-3 providers so that requests can continue even when one provider has a major outage.
Use exponential backoff
Enable exponential_backoff=True to avoid hammering providers during issues. This helps you stay within rate limits.
Set appropriate priorities
Order providers by cost and reliability. Put cheaper/faster providers first, with premium providers as fallback.
Monitor failover events
Use the on_failover callback to track when failovers occur. This helps identify provider issues early.
Integrate with error classification
Pair failover with LLM Error Classification so FailoverDecision coordinates profile rotation with typed errors.
Keep API keys out of source
Load keys from environment variables or a secrets manager; never commit credentials to version control.
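One way to follow this advice is sketched below. The helper `load_api_keys`, the profile names, and the variable names are hypothetical choices for illustration; the sketch takes the environment as a parameter so it can be exercised with a plain dictionary.

```python
from typing import Dict, Mapping


def load_api_keys(env: Mapping[str, str], names: Dict[str, str]) -> Dict[str, str]:
    """Map profile name -> API key, reading each key from the given environment."""
    keys: Dict[str, str] = {}
    missing = []
    for profile_name, var in names.items():
        value = env.get(var)
        if value:
            keys[profile_name] = value
        else:
            missing.append(var)
    if missing:
        # Fail early and name the variables, without ever printing key values.
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return keys


# Hypothetical mapping of profile names to environment variable names.
ENV_VARS = {"openai-main": "OPENAI_API_KEY", "anthropic-backup": "ANTHROPIC_API_KEY"}

# Demo environment; real code would pass os.environ instead.
demo_env = {"OPENAI_API_KEY": "sk-demo-1", "ANTHROPIC_API_KEY": "sk-demo-2"}
keys = load_api_keys(demo_env, ENV_VARS)
```

In real use, passing `os.environ` as the first argument reads the keys from the process environment, and a missing variable raises an error at startup instead of at the first failover.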
Related
- LLM Error Classification: Typed errors that drive failover decisions
- Providers: Supported LLM providers

