Runtime Resolution

The runtime-resolution subsystem maps agent and model references to concrete runtime instances at turn-time, enabling dynamic model switching and custom routing logic.

Handoffs no longer use this subsystem. As of PR #2178, handoffs always delegate directly to agent.chat() / agent.achat(). The runtime-resolution layer is used by other parts of the SDK for agent-level runtime configuration. If you came here for handoff execution behaviour, see Agent Handoffs.

Quick Start

Swap an agent model mid-conversation

Update the target agent's llm at any time. The next invocation picks up the change automatically because agent.chat() reads the live llm value.

from praisonaiagents import Agent

researcher = Agent(
    name="Researcher",
    instructions="Research the topic and summarise it",
    llm="gpt-4o-mini",
)

writer = Agent(
    name="Writer",
    instructions="Write a polished article",
    llm="gpt-4o-mini",
    handoffs=[researcher],
)

# Swap the researcher to a different model at any point
researcher.llm = "claude-3-sonnet"

# The next invocation automatically uses claude-3-sonnet for the researcher
writer.start("Research and write about ocean currents")

Inspect the runtime cache

Use get_runtime_cache and clear_runtime_cache to debug or force a fresh resolution.

from praisonaiagents.runtime import get_runtime_cache, clear_runtime_cache

# See what runtimes are cached across sessions
cache = get_runtime_cache()
for session_id, entries in cache.items():
    for cache_key, (runtime, cached_at) in entries.items():
        print(f"{cache_key}: {runtime.provider}/{runtime.model_ref}")

# Force fresh resolution for a specific session
clear_runtime_cache(session_id="session_123")

# Clear all cached runtimes
clear_runtime_cache()

Custom resolver (advanced)

Override the built-in resolver to control how models map to runtimes.

from praisonaiagents import Agent
from praisonaiagents.runtime import (
    set_global_resolver,
    SessionContext,
)
from praisonaiagents.runtime.resolve import (
    RuntimeResolver,
    AgentRuntimeProtocol,
    LLMRuntimeWrapper,
)
from praisonaiagents.llm.llm import LLM

class MyResolver(RuntimeResolver):
    def supports_model(self, model_ref: str) -> bool:
        return model_ref.startswith(("gpt-", "claude-", "my-model-"))

    def resolve(
        self,
        agent_id: str,
        model_ref: str,
        session_ctx: SessionContext,
        **kwargs,
    ) -> AgentRuntimeProtocol:
        # Route "my-model-*" to a custom endpoint
        if model_ref.startswith("my-model-"):
            llm = LLM(model="gpt-4o-mini", api_base="https://my.endpoint/v1")
        else:
            llm = LLM(model=model_ref)
        return LLMRuntimeWrapper(llm=llm, model_ref=model_ref, agent_id=agent_id)

set_global_resolver(MyResolver())

agent = Agent(name="MyAgent", instructions="Help users", llm="my-model-fast")
agent.start("Hello!")

How It Works

The subsystem reads the agent’s current llm (or model) attribute at invocation time, not at construction time.

Handoffs always execute via agent.chat() / agent.achat() directly — they do not call into this subsystem. Runtime resolution is used for agent-level model configuration and caching, not for handoff execution.
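
The difference between reading llm at invocation time and at construction time can be illustrated with a toy class. This is a minimal sketch rather than the SDK's code: `ToyAgent` and its `chat` method are hypothetical stand-ins that only show why a swap between turns takes effect on the next call.

```python
class ToyAgent:
    """Hypothetical stand-in for an agent; not the SDK's Agent class."""

    def __init__(self, name: str, llm: str) -> None:
        self.name = name
        self.llm = llm
        # A construction-time snapshot like this one goes stale after a swap.
        self._llm_at_construction = llm

    def chat(self, prompt: str) -> str:
        # Read the live attribute on every call, so swaps apply to the next turn.
        return f"[{self.llm}] {prompt}"


agent = ToyAgent("Researcher", llm="gpt-4o-mini")
first = agent.chat("hello")

agent.llm = "gpt-4o"  # swap between turns
second = agent.chat("hello")
```

Here `first` is produced with the original model name and `second` with the swapped one, while the construction-time snapshot still holds the old value.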

Configuration Options

`SessionContext`

Passed to resolve_runtime to scope caching and track depth.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `session_id` | `str` | (required) | Used as the first segment of the cache key |
| `timestamp` | `float` | `time.time()` when the supplied value is `<= 0` | Session start time |
| `parent_agent_id` | `Optional[str]` | `None` | Name of the agent that triggered resolution |
| `handoff_depth` | `int` | `0` | Current nesting depth |
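
The fields above could be modelled roughly as follows. This is a sketch under stated assumptions: `SessionContextSketch` is a hypothetical mirror of the documented fields, not the SDK class, and the `__post_init__` rule is one reading of the `timestamp` default described in the table.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class SessionContextSketch:
    """Hypothetical mirror of the documented fields; not the SDK class."""

    session_id: str
    timestamp: float = 0.0
    parent_agent_id: Optional[str] = None
    handoff_depth: int = 0

    def __post_init__(self) -> None:
        # Assumed behaviour: a non-positive timestamp is replaced with "now".
        if self.timestamp <= 0:
            self.timestamp = time.time()


# A root context, and a nested one that records who triggered resolution.
root = SessionContextSketch(session_id="session_123")
child = SessionContextSketch(
    session_id=root.session_id,
    timestamp=root.timestamp,
    parent_agent_id="Coordinator",
    handoff_depth=root.handoff_depth + 1,
)
```

Reusing the same `session_id` for the nested context keeps both resolutions in the same cache scope.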

Cache constants

| Constant | Value | Meaning |
| --- | --- | --- |
| `_cache_ttl_seconds` | `300` | Each cached runtime lives for 5 minutes |
| `_cleanup_interval` | `600` | Background cleanup daemon runs every 10 minutes |

Cache keys use the format `"{session_id}:{agent_id}:{model_ref}"`. Because the session ID is the first segment, caches are session-isolated and different conversations never share runtimes.
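
To make the key format and the session isolation concrete, here is a small sketch. The helper names `make_cache_key` and `put` are hypothetical, and the nested layout (session ID, then cache key, then a `(runtime, cached_at)` tuple) is assumed from the shape iterated in the Quick Start example, not taken from the SDK source.

```python
from typing import Any, Dict, Tuple

# Assumed layout: session_id -> {cache_key -> (runtime, cached_at)}
Cache = Dict[str, Dict[str, Tuple[Any, float]]]


def make_cache_key(session_id: str, agent_id: str, model_ref: str) -> str:
    """Build a key in the documented "{session_id}:{agent_id}:{model_ref}" format."""
    return f"{session_id}:{agent_id}:{model_ref}"


def put(cache: Cache, session_id: str, agent_id: str, model_ref: str,
        runtime: Any, now: float) -> str:
    """Store a runtime under its session bucket and return the key used."""
    key = make_cache_key(session_id, agent_id, model_ref)
    cache.setdefault(session_id, {})[key] = (runtime, now)
    return key


cache: Cache = {}
key_a = put(cache, "session_a", "Analyst", "gpt-4o", runtime="runtime-A", now=0.0)
key_b = put(cache, "session_b", "Analyst", "gpt-4o", runtime="runtime-B", now=0.0)
```

The same agent and model in two sessions produce two distinct keys and two separate entries, which is what keeps conversations from sharing a runtime.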

Common Patterns

Mid-conversation model swap

from praisonaiagents import Agent

analyst = Agent(name="Analyst", instructions="Analyse data", llm="gpt-4o-mini")
coordinator = Agent(name="Coordinator", instructions="Coordinate", handoffs=[analyst])

# First few turns use gpt-4o-mini
coordinator.start("Quick summary of Q1 sales")

# Upgrade to a more capable model for a detailed report
analyst.llm = "gpt-4o"
coordinator.start("Full analysis of Q1 vs Q2 with trend forecasting")

Force cache refresh

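from praisonaiagents import Agent
from praisonaiagents.runtime import clear_runtime_cache

agent = Agent(name="MyAgent", instructions="Help users", llm="gpt-4o-mini")

# After rotating API keys or changing model config
clear_runtime_cache()

agent.start("Continue with the updated model settings")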

Introspect resolved runtimes

from praisonaiagents.runtime import get_runtime_cache

cache = get_runtime_cache()
total = sum(len(entries) for entries in cache.values())
print(f"Active runtimes: {total} across {len(cache)} sessions")

Best Practices

Change llm before starting a new turn

Model re-resolution happens at each invocation boundary. Changing agent.llm is effective immediately for the next call — no restart needed.

Use clear_runtime_cache after credential rotation

The 5-minute TTL means old runtimes may linger after you rotate API keys. Call clear_runtime_cache() to evict all entries and force fresh connections.

Implement supports_model narrowly in custom resolvers

Return False from supports_model for models you do not handle. The built-in DefaultRuntimeResolver acts as the final fallback, so returning False simply delegates back to it.
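
The fallback behaviour can be sketched as a simple dispatch loop. This is an illustration only: `PrefixResolver`, `FallbackResolver`, and `pick_resolver` are hypothetical names, and how the SDK actually chooses between a custom resolver and `DefaultRuntimeResolver` is assumed here, not confirmed.

```python
from typing import List


class PrefixResolver:
    """Hypothetical resolver that claims only models with a given prefix."""

    def __init__(self, prefix: str, label: str) -> None:
        self.prefix = prefix
        self.label = label

    def supports_model(self, model_ref: str) -> bool:
        # Narrow check: anything else is left to the fallback.
        return model_ref.startswith(self.prefix)

    def resolve(self, model_ref: str) -> str:
        return f"{self.label}:{model_ref}"


class FallbackResolver:
    """Hypothetical stand-in for a default resolver that accepts everything."""

    def supports_model(self, model_ref: str) -> bool:
        return True

    def resolve(self, model_ref: str) -> str:
        return f"default:{model_ref}"


def pick_resolver(model_ref: str, custom: List[PrefixResolver],
                  fallback: FallbackResolver):
    """Return the first custom resolver that claims the model, else the fallback."""
    for resolver in custom:
        if resolver.supports_model(model_ref):
            return resolver
    return fallback


custom = [PrefixResolver("my-model-", "custom")]
fallback = FallbackResolver()
handled = pick_resolver("my-model-fast", custom, fallback).resolve("my-model-fast")
delegated = pick_resolver("gpt-4o", custom, fallback).resolve("gpt-4o")
```

A resolver that returns True too broadly would shadow the fallback for models it cannot actually serve, which is why the narrow check matters.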

Handoffs execute via agent.chat() — not this subsystem

If you need to change how a handoff target executes, configure the target agent itself (instructions, tools, llm). The target agent’s full chat() pipeline runs on every handoff — this subsystem is not in that path.

Public API

All names are importable from praisonaiagents.runtime:

from praisonaiagents.runtime import (
    resolve_runtime,
    SessionContext,
    RuntimeProtocol,
    get_runtime_cache,
    clear_runtime_cache,
    set_global_resolver,
)

`resolve_runtime`

def resolve_runtime(
    agent_id: str,
    model_ref: str,
    session_ctx: SessionContext,
    **kwargs,
) -> AgentRuntimeProtocol:
    ...

The main entry point. It checks the TTL cache first, then creates and caches a new runtime if no entry exists or the existing entry has expired.

`RuntimeProtocol` / `AgentRuntimeProtocol`

Protocols that custom runtimes must satisfy. Only relevant when building a custom resolver.

from typing import Any, Protocol

class RuntimeProtocol(Protocol):
    def execute(self, prompt: str, **kwargs) -> Any: ...
    async def aexecute(self, prompt: str, **kwargs) -> Any: ...
    @property
    def model_ref(self) -> str: ...
    @property
    def provider(self) -> str: ...

class AgentRuntimeProtocol(RuntimeProtocol):
    @property
    def supports_streaming(self) -> bool: ...
    @property
    def supports_tools(self) -> bool: ...
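
As an illustration of what satisfying the base protocol involves, the sketch below defines a toy runtime and checks it structurally. `RuntimeProtocolSketch` is a local copy of the members listed above, made `runtime_checkable` for the demonstration, and `EchoRuntime` is a hypothetical class; neither is part of the SDK.

```python
import asyncio
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class RuntimeProtocolSketch(Protocol):
    """Local copy of the documented members, for demonstration only."""

    def execute(self, prompt: str, **kwargs: Any) -> Any: ...
    async def aexecute(self, prompt: str, **kwargs: Any) -> Any: ...
    @property
    def model_ref(self) -> str: ...
    @property
    def provider(self) -> str: ...


class EchoRuntime:
    """Hypothetical runtime that echoes the prompt instead of calling a model."""

    def __init__(self, model_ref: str) -> None:
        self._model_ref = model_ref

    def execute(self, prompt: str, **kwargs: Any) -> str:
        return f"echo: {prompt}"

    async def aexecute(self, prompt: str, **kwargs: Any) -> str:
        # Reuse the synchronous path; a real runtime would await I/O here.
        return self.execute(prompt, **kwargs)

    @property
    def model_ref(self) -> str:
        return self._model_ref

    @property
    def provider(self) -> str:
        return "echo"


runtime = EchoRuntime("my-model-fast")
conforms = isinstance(runtime, RuntimeProtocolSketch)
sync_result = runtime.execute("hi")
async_result = asyncio.run(runtime.aexecute("hi"))
```

The `isinstance` check only confirms that the members exist; it does not validate signatures or return types, so a type checker remains the stronger guarantee.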

Related

Agent Handoffs

HandoffConfig reference

Secure tool boundaries during handoff

Filter context passed during handoff
