Model Failover automatically switches between LLM providers when one fails, so your agents can keep running through API outages and rate limits.

Quick Start

1. Configure Auth Profiles

from praisonaiagents import AuthProfile

# Create profiles for different providers
openai = AuthProfile(
    name="openai",
    provider="openai",
    api_key="sk-...",
    priority=1
)

anthropic = AuthProfile(
    name="anthropic", 
    provider="anthropic",
    api_key="sk-ant-...",
    priority=2
)

2. Set Up the Failover Manager

from praisonaiagents import FailoverConfig, FailoverManager

config = FailoverConfig(
    max_retries=3,
    retry_delay=1.0,
    exponential_backoff=True
)

manager = FailoverManager(config)
manager.add_profile(openai)
manager.add_profile(anthropic)

3. Use with Agent

from praisonaiagents import Agent

agent = Agent(
    name="assistant",
    failover=manager
)

# Automatically fails over on errors
response = agent.start("Hello!")

How It Works

| Component | Role |
| --- | --- |
| AuthProfile | Credentials for a single provider |
| FailoverManager | Orchestrates failover logic |
| FailoverConfig | Retry and backoff settings |
| ProviderStatus | Tracks provider health |
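
To make the division of roles concrete, here is a minimal stand-alone sketch of a priority-ordered failover loop. It does not use praisonaiagents and is not the library's implementation; `Profile`, `call_with_failover`, and `fake_call` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Profile:
    """Hypothetical stand-in for AuthProfile; only the fields this sketch needs."""
    name: str
    priority: int


def call_with_failover(profiles: List[Profile], call: Callable[[Profile], str]) -> str:
    """Try each profile in priority order and return the first successful result."""
    errors = []
    for profile in sorted(profiles, key=lambda p: p.priority):
        try:
            return call(profile)
        except Exception as exc:  # a real manager would filter by error type
            errors.append(f"{profile.name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


def fake_call(profile: Profile) -> str:
    """Simulated provider call in which the primary provider is down."""
    if profile.name == "openai":
        raise TimeoutError("simulated outage")
    return f"answered by {profile.name}"


profiles = [Profile("anthropic", 2), Profile("openai", 1)]
result = call_with_failover(profiles, fake_call)
print(result)  # answered by anthropic
```

The loop sorts by `priority` so that lower values are tried first, which matches the "1 = highest" convention used by the profiles on this page.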

Configuration Options

from praisonaiagents import FailoverConfig

config = FailoverConfig(
    max_retries=3,              # Max retry attempts
    retry_delay=1.0,            # Initial delay (seconds)
    exponential_backoff=True,   # Enable exponential backoff
    max_retry_delay=60.0,       # Max delay between retries
    failover_on_rate_limit=True,# Failover on 429 errors
    failover_on_timeout=True,   # Failover on timeouts
    failover_on_error=True,     # Failover on other errors
)
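
| Option | Type | Default | Description |
| --- | --- | --- | --- |
| max_retries | int | 3 | Maximum retry attempts |
| retry_delay | float | 1.0 | Initial retry delay (seconds) |
| exponential_backoff | bool | True | Use exponential backoff |
| max_retry_delay | float | 60.0 | Maximum delay between retries (seconds) |
| failover_on_rate_limit | bool | True | Fail over on HTTP 429 (rate limit) errors |
| failover_on_timeout | bool | True | Fail over on timeouts |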
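
The interaction between `retry_delay`, `exponential_backoff`, and `max_retry_delay` is easiest to see as a delay schedule. The sketch below assumes the delay doubles on each attempt and is capped at `max_retry_delay`; the doubling factor and the helper name `backoff_delays` are assumptions, since this page does not state the exact formula the library uses.

```python
from typing import List


def backoff_delays(
    max_retries: int = 3,
    retry_delay: float = 1.0,
    exponential_backoff: bool = True,
    max_retry_delay: float = 60.0,
) -> List[float]:
    """Return the wait (in seconds) before each retry attempt."""
    delays = []
    for attempt in range(max_retries):
        # Assumed formula: double the initial delay on every attempt.
        delay = retry_delay * (2 ** attempt) if exponential_backoff else retry_delay
        delays.append(min(delay, max_retry_delay))  # never exceed the cap
    return delays


default_schedule = backoff_delays()
long_schedule = backoff_delays(max_retries=8)
print(default_schedule)  # [1.0, 2.0, 4.0]
print(long_schedule)     # [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 60.0, 60.0]
```

With the defaults shown in the table, the cap only matters once more than six retries are allowed.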

Auth Profiles

Configure credentials for each provider:
from praisonaiagents import AuthProfile

profile = AuthProfile(
    name="openai-primary",
    provider="openai",
    api_key="sk-...",
    base_url=None,           # Custom endpoint (optional)
    priority=1,              # Lower = higher priority
    weight=1.0,              # For load balancing
    rate_limit=100,          # Requests per minute
    metadata={}              # Custom metadata
)

| Field | Type | Description |
| --- | --- | --- |
| name | str | Unique profile identifier |
| provider | str | Provider: openai, anthropic, etc. |
| api_key | str | API key (masked in logs) |
| base_url | str | Custom API endpoint |
| priority | int | Failover priority (1 = highest) |
| weight | float | Load balancing weight |
| rate_limit | int | Rate limit (requests/min) |
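
Two of these fields describe behaviour rather than plain data: `priority` determines the order in which profiles are tried, and `api_key` is masked when logged. The following stand-alone sketch illustrates both ideas with plain dictionaries. The helper `mask_key` and the choice to show the last four characters are assumptions for illustration, not the library's actual masking rule.

```python
def mask_key(api_key: str, visible: int = 4) -> str:
    """Replace all but the last `visible` characters with asterisks."""
    if len(api_key) <= visible:
        return "*" * len(api_key)
    return "*" * (len(api_key) - visible) + api_key[-visible:]


# Placeholder keys, not real credentials.
profiles = [
    {"name": "groq", "priority": 3, "api_key": "gsk-example-0003"},
    {"name": "openai", "priority": 1, "api_key": "sk-example-0001"},
    {"name": "anthropic", "priority": 2, "api_key": "sk-ant-example-0002"},
]

# Lower priority value = tried first.
ordered = sorted(profiles, key=lambda p: p["priority"])
for profile in ordered:
    print(profile["priority"], profile["name"], mask_key(profile["api_key"]))
```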

Common Patterns

from praisonaiagents import AuthProfile, FailoverManager

manager = FailoverManager()

# Add multiple providers
manager.add_profile(AuthProfile(
    name="openai",
    provider="openai",
    api_key="sk-...",
    priority=1
))

manager.add_profile(AuthProfile(
    name="anthropic",
    provider="anthropic", 
    api_key="sk-ant-...",
    priority=2
))

manager.add_profile(AuthProfile(
    name="groq",
    provider="groq",
    api_key="gsk-...",
    priority=3
))
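
In practice you will probably not want API keys hard-coded as in the placeholders above. One option is to read them from environment variables and register only the providers that are actually configured. The sketch below builds the keyword arguments for each profile; the variable names in `ENV_VARS` and the helper `profile_kwargs_from_env` are assumptions for illustration, not something praisonaiagents defines.

```python
import os
from typing import Dict, List, Mapping, Optional

# Provider name -> environment variable holding its key (assumed names).
ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "groq": "GROQ_API_KEY",
}


def profile_kwargs_from_env(env: Optional[Mapping[str, str]] = None) -> List[Dict[str, object]]:
    """Build keyword arguments for each provider whose key is set."""
    env = os.environ if env is None else env
    result: List[Dict[str, object]] = []
    priority = 1
    for provider, var in ENV_VARS.items():
        key = env.get(var)
        if not key:
            continue  # skip providers without a configured key
        result.append(
            {"name": provider, "provider": provider, "api_key": key, "priority": priority}
        )
        priority += 1
    return result


# Demo with an explicit mapping instead of the real environment.
demo_env = {"OPENAI_API_KEY": "sk-demo", "GROQ_API_KEY": "gsk-demo"}
kwargs_list = profile_kwargs_from_env(demo_env)
for kwargs in kwargs_list:
    print(kwargs["priority"], kwargs["name"])
```

Each dictionary uses only the field names shown in the Auth Profiles section, so it could be passed on as `AuthProfile(**kwargs)` and then to `manager.add_profile(...)`.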

Failover Callbacks

React to failover events:
from praisonaiagents import FailoverManager, FailoverConfig

def on_failover(from_profile, to_profile, error):
    print(f"Failing over from {from_profile} to {to_profile}")
    print(f"Reason: {error}")
    # Log to monitoring system
    
config = FailoverConfig(
    on_failover=on_failover
)

manager = FailoverManager(config)
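
For the "log to monitoring system" step, a callback might use the standard `logging` module and keep a simple per-provider count. This is a stand-alone sketch that keeps the `(from_profile, to_profile, error)` signature shown above; it assumes the profile arguments can be converted to strings, and `failover_counts` is a hypothetical name. The calls at the bottom simulate failover events by hand rather than going through a manager.

```python
import logging
from collections import Counter

logger = logging.getLogger("failover")

# How many times each profile has been failed away from.
failover_counts: Counter = Counter()


def on_failover(from_profile, to_profile, error):
    """Record the event and emit a warning with the error type and message."""
    failover_counts[str(from_profile)] += 1
    logger.warning(
        "failover %s -> %s (%s: %s)",
        from_profile,
        to_profile,
        type(error).__name__,
        error,
    )


# Simulated events, called directly for demonstration.
on_failover("openai", "anthropic", TimeoutError("request timed out"))
on_failover("openai", "anthropic", RuntimeError("HTTP 429"))
print(failover_counts.most_common(1))  # [('openai', 2)]
```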

Provider Status

Monitor provider health:
from praisonaiagents import FailoverManager

manager = FailoverManager()

# Get status of all providers
status = manager.status()
for name, info in status.items():
    print(f"{name}: {info['status']}")
    print(f"  Failures: {info['failure_count']}")
    print(f"  Last success: {info['last_success']}")

# Reset a provider after recovery
manager.mark_success("openai")

# Reset all providers
manager.reset_all()
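
To show what health tracking of this kind can look like, here is a minimal stand-alone tracker that counts consecutive failures and takes a provider out of rotation for a cooldown period. It is a sketch under assumptions, not the library's ProviderStatus: `HealthTracker`, `failure_threshold`, `cooldown`, and `is_available` are hypothetical names, and only `failure_count`, `last_success`, and `mark_success` echo names from the example above.

```python
import time
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Health:
    failure_count: int = 0
    last_success: Optional[float] = None
    last_failure: Optional[float] = None


class HealthTracker:
    """Tracks consecutive failures per provider (illustrative only)."""

    def __init__(self, failure_threshold: int = 3, cooldown: float = 30.0):
        self.failure_threshold = failure_threshold
        self.cooldown = cooldown
        self._health: Dict[str, Health] = {}

    def mark_failure(self, name: str, now: Optional[float] = None) -> None:
        health = self._health.setdefault(name, Health())
        health.failure_count += 1
        health.last_failure = time.monotonic() if now is None else now

    def mark_success(self, name: str, now: Optional[float] = None) -> None:
        health = self._health.setdefault(name, Health())
        health.failure_count = 0  # a success clears the failure streak
        health.last_success = time.monotonic() if now is None else now

    def is_available(self, name: str, now: Optional[float] = None) -> bool:
        health = self._health.get(name)
        if health is None or health.last_failure is None:
            return True
        if health.failure_count < self.failure_threshold:
            return True
        now = time.monotonic() if now is None else now
        # Unavailable until the cooldown since the last failure has elapsed.
        return now - health.last_failure >= self.cooldown


tracker = HealthTracker(failure_threshold=2, cooldown=30.0)
tracker.mark_failure("openai", now=0.0)
tracker.mark_failure("openai", now=1.0)
print(tracker.is_available("openai", now=10.0))  # False
print(tracker.is_available("openai", now=31.0))  # True
```

The explicit `now` arguments exist only to make the demonstration deterministic; in normal use the tracker falls back to `time.monotonic()`.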

Best Practices

Configure at least 2-3 providers. An outage or rate limit at one provider then leaves others to fall back on.
Set exponential_backoff=True so that retries slow down instead of hammering a provider that is already struggling. This also helps you stay within rate limits.
Order providers by cost and reliability using priority (lower values are tried first). Put cheaper or faster providers first, with premium providers as fallback.
Use the on_failover callback to record when failovers occur. Tracking these events helps you spot provider issues early.