Gateway agents use chat-optimized defaults to ensure fast response times in interactive chat applications.

Quick Start

1. Fast by default

Create a minimal `bot.yaml` with fast defaults:

```yaml
agents:
  assistant:
    instructions: "You are a helpful AI assistant."
    model: gpt-4o-mini
```

Response latency: ~1.6 seconds on short prompts.
2. Opt in to reflection

Add reflection for higher-quality responses:

```yaml
agents:
  assistant:
    instructions: "You are a helpful AI assistant."
    model: gpt-4o-mini
    reflection: true   # opt-in: enables self-critique
```

Response latency: ~12.3 seconds (~8x slower, for better quality).

How It Works

Gateway agents optimize for chat channel performance with different flow patterns:
| Mode | Reflection | Latency | Quality |
|------|------------|---------|---------|
| Fast (default) | `false` | ~1.6s | Direct response |
| Quality (opt-in) | `true` | ~12.3s | Self-critiqued response |
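The two flow patterns can be sketched as follows. This is an illustrative sketch only: `llm_call` and `respond` are hypothetical stand-ins for the real model invocation, not part of any SDK.

```python
# Sketch of the two flow patterns: direct response vs. self-critique.
def llm_call(prompt: str) -> str:
    # Placeholder: a real implementation would call the model API here.
    return f"answer({prompt})"

def respond(prompt: str, reflection: bool = False) -> str:
    draft = llm_call(prompt)  # fast path: a single call (~1.6s)
    if not reflection:
        return draft
    # quality path: the extra critique/revise calls explain the ~8x latency
    critique = llm_call(f"critique: {draft}")
    return llm_call(f"revise {draft} using {critique}")
```

Each reflection pass multiplies the number of model calls per message, which is why the quality mode's latency grows roughly linearly with the number of critique rounds.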

Configuration Options

Gateway agents loaded from YAML use chat-optimized defaults:

| YAML key | Gateway default | SDK default | Why different |
|----------|-----------------|-------------|---------------|
| `reflection` | `false` | `false` | Chat channels need sub-second replies |
| `tool_choice` | `null` (auto) | `null` (auto) | Let LLM decide when to call tools |
| `allow_delegation` | `false` | `false` | Prevents routing unless opted in |
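A loader for these defaults might layer each agent's YAML block over the gateway defaults, so explicit keys always win. This is a minimal sketch under assumed names: `GATEWAY_DEFAULTS` and `resolve_agent_config` are illustrative, not the real API, and dict literals stand in for the parsed YAML.

```python
# Assumed defaults table, mirroring the Configuration Options above.
GATEWAY_DEFAULTS = {
    "reflection": False,        # chat channels need fast replies
    "tool_choice": None,        # let the LLM decide when to call tools
    "allow_delegation": False,  # no routing unless opted in
}

def resolve_agent_config(agent_block: dict) -> dict:
    """Explicit YAML keys win; everything else falls back to defaults."""
    return {**GATEWAY_DEFAULTS, **agent_block}

fast = resolve_agent_config({"model": "gpt-4o-mini"})
quality = resolve_agent_config({"model": "gpt-4o-mini", "reflection": True})
```

The merge order (`{**defaults, **agent_block}`) is what makes reflection opt-in: omitting the key yields the fast default, while setting it in YAML overrides it.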

Common Patterns

Plain Chat Assistant

```yaml
agents:
  assistant:
    instructions: "You are a helpful AI assistant."
    model: gpt-4o-mini
    # reflection defaults to false - fast responses
```

High-Quality Q&A Bot

```yaml
agents:
  expert:
    instructions: "You are an expert consultant. Think carefully before answering."
    model: gpt-4o
    reflection: true  # opt-in for quality over speed
```

Mixed Agent Configuration

```yaml
agents:
  quickhelp:
    instructions: "Handle basic questions quickly."
    model: gpt-4o-mini
    # reflection=false (default) - fast responses

  research:
    instructions: "Conduct thorough research and analysis."
    model: gpt-4o
    reflection: true  # enable self-critique for research
```
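With a mixed configuration, the gateway needs to pick an agent per incoming message. A toy dispatch sketch, where the dict literals and `pick_agent` are assumptions standing in for the parsed config and the real routing logic:

```python
# Parsed form of the mixed config above (illustrative, not the real loader).
AGENTS = {
    "quickhelp": {"model": "gpt-4o-mini", "reflection": False},
    "research":  {"model": "gpt-4o",      "reflection": True},
}

def pick_agent(message: str) -> str:
    """Toy router: send research-style requests to the reflective agent."""
    slow_keywords = ("research", "analysis", "analyze")
    if any(k in message.lower() for k in slow_keywords):
        return "research"
    return "quickhelp"
```

Keyword routing is just a placeholder; the point is that fast and reflective agents coexist in one config and only the messages that need depth pay the reflection latency.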

Best Practices

Gateway agents default to `reflection: false` because chat applications prioritize response speed. Enable reflection only for agents handling complex analysis or research tasks.
Each reflection cycle requires additional API calls. For gpt-4o-mini, this increases latency from ~1.6s to ~12.3s. Test your specific use case to validate that the quality improvement justifies the speed tradeoff.
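The back-of-envelope arithmetic behind the quoted figures:

```python
# Latency figures quoted in this doc for gpt-4o-mini.
fast_s = 1.6
reflective_s = 12.3

slowdown = reflective_s / fast_s   # ~7.7, rounded to "8x" in the text
extra_s = reflective_s - fast_s    # ~10.7s added per reply
```

In an interactive channel, ~10.7 extra seconds per reply is usually only acceptable when the user explicitly asked for a thorough answer.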
Configure reflection per agent based on their role. Quick support agents should stay fast, while research or analysis agents benefit from reflection.
Gateway defaults optimize for interactive chat. For batch processing, long-running tasks, or non-interactive workflows, use the Python SDK directly where different performance characteristics apply.

Related Pages

- **Gateway**: Gateway configuration and deployment
- **Reflection**: Understanding reflection and self-critique
- **Bot OS**: Bot operating system concepts
- **Messaging Bots**: Chat platform integrations