## Quick Start
**Fast by default:** response latency is ~1.6 seconds on short prompts.

Create a minimal `bot.yaml` with fast defaults:
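A minimal `bot.yaml` might look like the sketch below. Only the `reflection` key is documented on this page; `name`, `model`, and `instructions` are illustrative assumptions, not confirmed field names:

```yaml
# Hypothetical minimal bot.yaml; keys other than `reflection`
# are assumptions for illustration.
name: support-bot
model: gpt-4o-mini
instructions: Answer customer questions concisely.
# Omitted keys fall back to the fast gateway defaults,
# e.g. reflection: false
```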
## How It Works
Gateway agents optimize for chat channel performance with different flow patterns:

| Mode | Reflection | Latency | Quality |
|---|---|---|---|
| Fast (default) | false | ~1.6s | Direct response |
| Quality (opt-in) | true | ~12.3s | Self-critiqued response |
## Configuration Options
Gateway agents loaded from YAML use chat-optimized defaults:

| YAML key | Gateway default | SDK default | Why different |
|---|---|---|---|
| `reflection` | `false` | `true` | Chat channels need sub-second replies |
| `tool_choice` | `null` (auto) | `null` (auto) | Let LLM decide when to call tools |
| `allow_delegation` | `false` | `false` | Prevents routing unless opted in |
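Overriding these defaults might look like the sketch below. The keys `reflection`, `tool_choice`, and `allow_delegation` come from the table above; the surrounding fields are assumptions:

```yaml
# Hypothetical agent entry overriding the gateway defaults.
name: research-agent      # assumed key
model: gpt-4o-mini        # assumed key
reflection: true          # opt in to self-critique (~12.3s vs ~1.6s)
tool_choice: null         # keep automatic tool selection (the default)
allow_delegation: true    # opt in to routing between agents
```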
## Common Patterns
### Plain Chat Assistant
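A plain chat assistant keeps the fast gateway defaults. A sketch, in which keys other than `reflection` are assumed:

```yaml
name: chat-assistant
model: gpt-4o-mini
instructions: Reply helpfully and briefly.
reflection: false   # the gateway default; shown here for clarity
```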
### High-Quality Q&A Bot
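A quality-focused bot opts in to reflection, accepting the latency cost described above. `reflection` and `max_reflect` appear elsewhere on this page; the other keys are assumptions:

```yaml
name: qa-bot
model: gpt-4o-mini
instructions: Give thorough, well-reasoned answers.
reflection: true    # opt in to self-critique (~12.3s on short prompts)
max_reflect: 3      # cap on extra reflection round-trips per message
```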
### Mixed Agent Configuration
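Reflection can be set per agent, so a single configuration can mix fast and quality agents. The `agents` list structure below is an assumption; `reflection` and `max_reflect` are documented on this page:

```yaml
agents:                 # assumed list structure for multi-agent files
  - name: triage
    model: gpt-4o-mini
    reflection: false   # stay fast for quick support replies
  - name: analyst
    model: gpt-4o-mini
    reflection: true    # quality matters more than latency here
    max_reflect: 3
```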
## Best Practices
**Default is fast — only enable reflection when quality matters more than latency**
Gateway agents default to `reflection: false` because chat applications prioritize response speed. Enable reflection only for agents handling complex analysis or research tasks.

**Measure before opting in: reflection adds between 1 and `max_reflect` (default 3) extra LLM round-trips per message**
Each reflection cycle requires additional API calls. For `gpt-4o-mini`, this increases latency from ~1.6s to ~12.3s. Test your specific use case to validate that the quality improvement justifies the speed tradeoff.

**Use reflection selectively per agent, not globally**
Configure reflection per agent based on their role. Quick support agents should stay fast, while research or analysis agents benefit from reflection.
**For background / long-form tasks use the Python SDK directly — it has different defaults suited to that use case**
Gateway defaults optimize for interactive chat. For batch processing, long-running tasks, or non-interactive workflows, use the Python SDK directly where different performance characteristics apply.
## Related
- **Gateway**: Gateway configuration and deployment
- **Reflection**: Understanding reflection and self-critique
- **Bot OS**: Bot operating system concepts
- **Messaging Bots**: Chat platform integrations

