The runtime-resolution subsystem maps agent and model references to concrete runtime instances at turn-time, enabling dynamic model switching and custom routing logic.
Handoffs no longer use this subsystem. As of PR #2178, handoffs always delegate directly to agent.chat() / agent.achat(). The runtime-resolution layer is used by other parts of the SDK for agent-level runtime configuration. If you came here for handoff execution behaviour, see Agent Handoffs.

Quick Start

1. Swap an agent model mid-conversation

Update the target model at any time — the next invocation automatically picks up the change because agent.chat() reads the live llm value.
from praisonaiagents import Agent

researcher = Agent(
    name="Researcher",
    instructions="Research the topic and summarise it",
    llm="gpt-4o-mini",
)

writer = Agent(
    name="Writer",
    instructions="Write a polished article",
    llm="gpt-4o-mini",
    handoffs=[researcher],
)

# Swap the researcher to a different model at any point
researcher.llm = "claude-3-sonnet"

# The next invocation automatically uses claude-3-sonnet for the researcher
writer.start("Research and write about ocean currents")

2. Inspect the runtime cache

Use get_runtime_cache and clear_runtime_cache to debug or force a fresh resolution.
from praisonaiagents.runtime import get_runtime_cache, clear_runtime_cache

# See what runtimes are cached across sessions
cache = get_runtime_cache()
for session_id, entries in cache.items():
    for cache_key, (runtime, cached_at) in entries.items():
        print(f"{cache_key}: {runtime.provider}/{runtime.model_ref}")

# Force fresh resolution for a specific session
clear_runtime_cache(session_id="session_123")

# Clear all cached runtimes
clear_runtime_cache()

3. Custom resolver (advanced)

Override the built-in resolver to control how models map to runtimes.
from praisonaiagents import Agent
from praisonaiagents.runtime import (
    set_global_resolver,
    SessionContext,
)
from praisonaiagents.runtime.resolve import (
    RuntimeResolver,
    AgentRuntimeProtocol,
    LLMRuntimeWrapper,
)
from praisonaiagents.llm.llm import LLM

class MyResolver(RuntimeResolver):
    def supports_model(self, model_ref: str) -> bool:
        return model_ref.startswith(("gpt-", "claude-", "my-model-"))

    def resolve(
        self,
        agent_id: str,
        model_ref: str,
        session_ctx: SessionContext,
        **kwargs,
    ) -> AgentRuntimeProtocol:
        # Route "my-model-*" to a custom endpoint
        if model_ref.startswith("my-model-"):
            llm = LLM(model="gpt-4o-mini", api_base="https://my.endpoint/v1")
        else:
            llm = LLM(model=model_ref)
        return LLMRuntimeWrapper(llm=llm, model_ref=model_ref, agent_id=agent_id)

set_global_resolver(MyResolver())

agent = Agent(name="MyAgent", instructions="Help users", llm="my-model-fast")
agent.start("Hello!")

How It Works

The subsystem reads the agent’s current llm (or model) attribute at invocation time, not at construction time.
Handoffs always execute via agent.chat() / agent.achat() directly — they do not call into this subsystem. Runtime resolution is used for agent-level model configuration and caching, not for handoff execution.
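
The invocation-time lookup can be illustrated with a small, library-free sketch. `ToyAgent` and `fake_completion` are hypothetical stand-ins, not part of praisonaiagents. The only point is that the model reference is read inside `chat()`, so reassigning `llm` takes effect on the next call.

```python
from dataclasses import dataclass


def fake_completion(model_ref: str, prompt: str) -> str:
    """Hypothetical stand-in for a real model call; it only echoes its inputs."""
    return f"[{model_ref}] {prompt}"


@dataclass
class ToyAgent:
    """Hypothetical agent that looks up its model on every call."""
    name: str
    llm: str

    def chat(self, prompt: str) -> str:
        # self.llm is read here, at invocation time, not captured at construction.
        return fake_completion(self.llm, prompt)


agent = ToyAgent(name="Researcher", llm="gpt-4o-mini")
first = agent.chat("hello")

agent.llm = "claude-3-sonnet"  # swap the model between calls
second = agent.chat("hello")

print(first)
print(second)
```

Had the constructor copied the model into a pre-built client, the second call would still have used the original model.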

Configuration Options

SessionContext

Passed to resolve_runtime to scope caching and track depth.
| Field | Type | Default | Description |
| --- | --- | --- | --- |
| session_id | str | | Required. Used as the first segment of the cache key |
| timestamp | float | time.time() if <= 0 | Session start time |
| parent_agent_id | Optional[str] | None | Name of the agent that triggered resolution |
| handoff_depth | int | 0 | Current nesting depth |
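
As a rough illustration of these fields, the sketch below models the context as a dataclass. `SessionContextSketch` is a hypothetical name, and both the declared default of `timestamp` (assumed to be 0.0) and the use of `__post_init__` are assumptions. The library's real class may be implemented differently.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class SessionContextSketch:
    """Illustrative stand-in for SessionContext; not the library's class."""
    session_id: str                         # required, first cache-key segment
    timestamp: float = 0.0                  # assumed sentinel: <= 0 means "now"
    parent_agent_id: Optional[str] = None   # agent that triggered resolution
    handoff_depth: int = 0                  # current nesting depth

    def __post_init__(self) -> None:
        # Mirror the documented default: fall back to time.time() if <= 0.
        if self.timestamp <= 0:
            self.timestamp = time.time()


# A top-level context, then a nested one that shares the session.
ctx = SessionContextSketch(session_id="session_123")
child = SessionContextSketch(
    session_id=ctx.session_id,
    timestamp=ctx.timestamp,
    parent_agent_id="Coordinator",
    handoff_depth=ctx.handoff_depth + 1,
)
```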

Cache constants

| Constant | Value (seconds) | Meaning |
| --- | --- | --- |
| _cache_ttl_seconds | 300 | Each cached runtime lives for 5 minutes |
| _cleanup_interval | 600 | Background cleanup daemon runs every 10 minutes |
Cache keys use the format "{session_id}:{agent_id}:{model_ref}" — caches are session-isolated so different conversations never share runtimes.
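
The key format and session isolation can be sketched with plain dictionaries. The nested shape (session id, then cache key, then a `(runtime, cached_at)` tuple) is inferred from the cache-inspection example in Quick Start. The helpers `make_cache_key`, `put`, and `clear` are hypothetical and exist only for this illustration.

```python
import time
from typing import Any, Dict, Optional, Tuple

# Assumed shape: session_id -> cache_key -> (runtime, cached_at)
Cache = Dict[str, Dict[str, Tuple[Any, float]]]


def make_cache_key(session_id: str, agent_id: str, model_ref: str) -> str:
    """Build a key in the documented "{session_id}:{agent_id}:{model_ref}" format."""
    return f"{session_id}:{agent_id}:{model_ref}"


def put(cache: Cache, session_id: str, agent_id: str, model_ref: str, runtime: Any) -> str:
    """Store a runtime under its session bucket and return the key used."""
    key = make_cache_key(session_id, agent_id, model_ref)
    cache.setdefault(session_id, {})[key] = (runtime, time.time())
    return key


def clear(cache: Cache, session_id: Optional[str] = None) -> None:
    """Drop one session's entries, or everything when no session is given."""
    if session_id is None:
        cache.clear()
    else:
        cache.pop(session_id, None)


cache: Cache = {}
key_a = put(cache, "session_a", "Analyst", "gpt-4o", runtime=object())
key_b = put(cache, "session_b", "Analyst", "gpt-4o", runtime=object())

# Same agent and model, different sessions: two distinct entries.
clear(cache, session_id="session_a")
```

Because the session id is part of the key, clearing one session leaves the other session's runtime in place.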

Common Patterns

Mid-conversation model swap

from praisonaiagents import Agent

analyst = Agent(name="Analyst", instructions="Analyse data", llm="gpt-4o-mini")
coordinator = Agent(name="Coordinator", instructions="Coordinate", handoffs=[analyst])

# First few turns use gpt-4o-mini
coordinator.start("Quick summary of Q1 sales")

# Upgrade to a more capable model for a detailed report
analyst.llm = "gpt-4o"
coordinator.start("Full analysis of Q1 vs Q2 with trend forecasting")

Force cache refresh

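from praisonaiagents import Agent
from praisonaiagents.runtime import clear_runtime_cache

agent = Agent(name="Assistant", instructions="Help users", llm="gpt-4o-mini")

# After rotating API keys or changing model config, evict every cached runtime
clear_runtime_cache()

# The next call resolves a fresh runtime
agent.start("Continue with the updated model settings")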

Introspect resolved runtimes

from praisonaiagents.runtime import get_runtime_cache

cache = get_runtime_cache()
total = sum(len(entries) for entries in cache.values())
print(f"Active runtimes: {total} across {len(cache)} sessions")

Best Practices

Model re-resolution happens at each invocation boundary. Changing agent.llm is effective immediately for the next call — no restart needed.
The 5-minute TTL means old runtimes may linger after you rotate API keys. Call clear_runtime_cache() to evict all entries and force fresh connections.
Return False from supports_model for models you do not handle. The built-in DefaultRuntimeResolver acts as the final fallback, so returning False simply delegates back to it.
If you need to change how a handoff target executes, configure the target agent itself (instructions, tools, llm). The target agent’s full chat() pipeline runs on every handoff — this subsystem is not in that path.
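
The fallback behaviour described above for supports_model can be pictured as a two-step dispatch. This is a simplified model under stated assumptions, not the library's dispatch code: `ToyResolver`, `ToyDefaultResolver`, and `pick_resolver` are hypothetical, and the real resolvers take more arguments.

```python
from typing import Optional, Tuple


class ToyResolver:
    """Hypothetical resolver that claims models by prefix."""

    def __init__(self, name: str, prefixes: Tuple[str, ...]) -> None:
        self.name = name
        self.prefixes = prefixes

    def supports_model(self, model_ref: str) -> bool:
        return model_ref.startswith(self.prefixes)

    def resolve(self, model_ref: str) -> str:
        # A real resolver would return a runtime object; a label is enough here.
        return f"{self.name}:{model_ref}"


class ToyDefaultResolver(ToyResolver):
    """Hypothetical final fallback that accepts every model."""

    def supports_model(self, model_ref: str) -> bool:
        return True


def pick_resolver(
    custom: Optional[ToyResolver], default: ToyResolver, model_ref: str
) -> ToyResolver:
    # Ask the custom resolver first; a False answer delegates to the default.
    if custom is not None and custom.supports_model(model_ref):
        return custom
    return default


custom = ToyResolver("custom", prefixes=("my-model-",))
default = ToyDefaultResolver("default", prefixes=())

handled = pick_resolver(custom, default, "my-model-fast").resolve("my-model-fast")
delegated = pick_resolver(custom, default, "gpt-4o").resolve("gpt-4o")
```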

Public API

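The names below are importable from praisonaiagents.runtime. RuntimeResolver, AgentRuntimeProtocol, and LLMRuntimeWrapper are imported from praisonaiagents.runtime.resolve, as in the custom resolver example in Quick Start.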
from praisonaiagents.runtime import (
    resolve_runtime,
    SessionContext,
    RuntimeProtocol,
    get_runtime_cache,
    clear_runtime_cache,
    set_global_resolver,
)

resolve_runtime

def resolve_runtime(
    agent_id: str,
    model_ref: str,
    session_ctx: SessionContext,
    **kwargs,
) -> AgentRuntimeProtocol:
    ...
The main entry point. Checks the TTL cache first; creates and caches a new runtime if none exists or the entry expired.
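
The check-then-create flow can be sketched as follows. `resolve_with_ttl` and its injectable clock are hypothetical and written only for this illustration. The constant mirrors the documented 300-second TTL, and treating an entry as expired once its age reaches the TTL is an assumption about the boundary.

```python
import time
from typing import Any, Callable, Dict, List, Tuple

CACHE_TTL_SECONDS = 300  # mirrors the documented 5-minute TTL

_cache: Dict[str, Tuple[Any, float]] = {}


def resolve_with_ttl(
    cache_key: str,
    factory: Callable[[], Any],
    now: Callable[[], float] = time.time,
) -> Any:
    """Return a cached runtime if it is still fresh, else build and cache one."""
    entry = _cache.get(cache_key)
    if entry is not None:
        runtime, cached_at = entry
        if now() - cached_at < CACHE_TTL_SECONDS:
            return runtime              # cache hit
    runtime = factory()                 # miss or expired entry
    _cache[cache_key] = (runtime, now())
    return runtime


# Drive the function with a fake clock so expiry is deterministic.
clock = [1000.0]
built: List[int] = []


def build_runtime() -> str:
    built.append(len(built))
    return f"runtime-{len(built)}"


key = "session_123:Analyst:gpt-4o"
r1 = resolve_with_ttl(key, build_runtime, now=lambda: clock[0])
clock[0] += 60     # one minute later: still cached
r2 = resolve_with_ttl(key, build_runtime, now=lambda: clock[0])
clock[0] += 300    # past the TTL: rebuilt
r3 = resolve_with_ttl(key, build_runtime, now=lambda: clock[0])
```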

RuntimeProtocol / AgentRuntimeProtocol

Protocols that custom runtimes must satisfy. Only relevant when building a custom resolver.
from typing import Any, Protocol

class RuntimeProtocol(Protocol):
    def execute(self, prompt: str, **kwargs) -> Any: ...
    async def aexecute(self, prompt: str, **kwargs) -> Any: ...
    @property
    def model_ref(self) -> str: ...
    @property
    def provider(self) -> str: ...

class AgentRuntimeProtocol(RuntimeProtocol):
    @property
    def supports_streaming(self) -> bool: ...
    @property
    def supports_tools(self) -> bool: ...
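
A minimal runtime with this shape might look like the sketch below. `EchoRuntime` is hypothetical, and `RuntimeLike` is a local copy of the members listed above so the example runs without the library installed. Whether the real protocols are decorated with `runtime_checkable` is not stated here, so the `isinstance` check applies to the local copy only.

```python
import asyncio
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class RuntimeLike(Protocol):
    """Local copy of the RuntimeProtocol members, for a standalone demo."""

    def execute(self, prompt: str, **kwargs: Any) -> Any: ...
    async def aexecute(self, prompt: str, **kwargs: Any) -> Any: ...
    @property
    def model_ref(self) -> str: ...
    @property
    def provider(self) -> str: ...


class EchoRuntime:
    """Hypothetical runtime that echoes the prompt instead of calling a model."""

    def __init__(self, model_ref: str) -> None:
        self._model_ref = model_ref

    def execute(self, prompt: str, **kwargs: Any) -> str:
        return f"echo: {prompt}"

    async def aexecute(self, prompt: str, **kwargs: Any) -> str:
        # Reuse the sync path; a real runtime would await its client here.
        return self.execute(prompt, **kwargs)

    @property
    def model_ref(self) -> str:
        return self._model_ref

    @property
    def provider(self) -> str:
        return "echo"

    @property
    def supports_streaming(self) -> bool:
        return False

    @property
    def supports_tools(self) -> bool:
        return False


runtime = EchoRuntime("my-model-fast")
sync_result = runtime.execute("hi")
async_result = asyncio.run(runtime.aexecute("hi"))
conforms = isinstance(runtime, RuntimeLike)  # structural check, no inheritance
```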

Agent Handoffs

Agent-to-agent delegation — handoffs always use agent.chat()

HandoffConfig reference

Secure tool boundaries during handoff

Filter context passed during handoff
