Quick Start
How It Works
When a client reconnects, the gateway checks whether the session has queued inbox messages or an in-flight execution. If so, it sends a status frame, marks the session as executing before spawning the queue task (preventing duplicate workers), and processes pending messages in FIFO order.
Graceful Drain on Shutdown
WebSocketGateway.stop() calls _drain_active_sessions before closing WebSocket clients. The gateway waits up to drain_timeout seconds (default 10.0) for sessions that are executing or have a non-empty inbox.
| Phase | Behaviour |
|---|---|
| Within timeout | Sessions that finish are persisted via the configured session store |
| After timeout | Remaining sessions are force-persisted with pending work; a SESSION_END event is emitted |
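The two phases in the table can be sketched with `asyncio.wait` and a timeout. This is a minimal illustration, not the gateway's actual `_drain_active_sessions`: the `drain_sessions` helper, the `tasks` mapping, and the decision to cancel leftover tasks are all assumptions made for the example.

```python
import asyncio


async def drain_sessions(tasks, drain_timeout=10.0):
    """Wait up to drain_timeout seconds for session tasks to finish.

    `tasks` is a hypothetical mapping of session id -> asyncio.Task.
    Returns (finished_ids, forced_ids).
    """
    if not tasks:
        return [], []
    # asyncio.wait returns the sets of completed and still-pending tasks.
    done, pending = await asyncio.wait(tasks.values(), timeout=drain_timeout)
    finished = [sid for sid, task in tasks.items() if task in done]
    forced = [sid for sid, task in tasks.items() if task in pending]
    # Assumption: leftover tasks are cancelled here; a real gateway would
    # force-persist their pending work and emit SESSION_END at this point.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    return finished, forced


async def _demo():
    tasks = {
        "fast": asyncio.create_task(asyncio.sleep(0.01)),
        "slow": asyncio.create_task(asyncio.sleep(5)),
    }
    return await drain_sessions(tasks, drain_timeout=0.2)


def run_demo():
    return asyncio.run(_demo())
```

In the demo, the short task lands in the "within timeout" group and the long one in the "after timeout" group.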
The SESSION_END payload carries the had_pending_work and was_executing flags.
Configure a session store (_session_store) so drained sessions survive restarts. Without one, force-closed sessions are only logged.
Reconnect and Auto-Resume
On resume, clients receive a status frame. The gateway calls mark_executing(True) before launching asyncio.create_task(_run_session_queue(...)), so a race between reconnect and shutdown cannot spawn duplicate queue processors.
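The ordering matters because `asyncio.create_task` only schedules the coroutine; it does not run until the caller yields. A minimal sketch of the pattern, assuming a hypothetical `Session` class and `resume_if_needed` helper (only `mark_executing` and `_run_session_queue` are names from this page, and their real signatures may differ):

```python
import asyncio
from collections import deque


class Session:
    """Hypothetical stand-in for a gateway session."""

    def __init__(self):
        self.inbox = deque()
        self.executing = False
        self.processed = []

    def mark_executing(self, value):
        self.executing = value


async def _run_session_queue(session):
    # Drain the inbox in FIFO order, then clear the executing flag.
    try:
        while session.inbox:
            message = session.inbox.popleft()
            await asyncio.sleep(0)  # stand-in for one agent turn
            session.processed.append(message)
    finally:
        session.mark_executing(False)


def resume_if_needed(session):
    """Return a new queue task, or None if nothing should be spawned."""
    # Assumption: an already-executing session is never given a second worker.
    if session.executing or not session.inbox:
        return None
    # Set the flag BEFORE scheduling, so a second caller sees it immediately.
    session.mark_executing(True)
    return asyncio.create_task(_run_session_queue(session))


async def _demo():
    session = Session()
    session.inbox.extend(["a", "b", "c"])
    first = resume_if_needed(session)
    second = resume_if_needed(session)  # simulated duplicate reconnect
    await first
    return second is None, session.processed, session.executing


def run_demo():
    return asyncio.run(_demo())
```

If the flag were set inside the task instead, the second call would run before the first task started and would spawn a duplicate worker.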
Persisted Shape
Gateway session snapshots now include pending work.
Configuration
| Option | Type | Default | Description |
|---|---|---|---|
| drain_timeout | float | 10.0 | Seconds to wait for in-flight sessions during WebSocketGateway.stop() before force-persisting |
Best Practices
Tune drain_timeout for redeploys
Set drain_timeout to at least your longest expected agent turn so clean shutdowns finish naturally before force-persist.
Configure a session store
Without persistence, drained sessions cannot be resumed after a restart; they are only logged.
Monitor SESSION_END events
Listen for had_pending_work: true or was_executing: true in audit or observability hooks to detect force-closed sessions.
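A hook for this check might look like the sketch below. It assumes the SESSION_END payload arrives as a dict; `was_force_closed`, `on_session_end`, the `alerts` list, and the `session_id` key are hypothetical names for illustration, and only `had_pending_work` and `was_executing` come from this page.

```python
def was_force_closed(payload):
    """True if a SESSION_END payload indicates the session was cut short."""
    return bool(payload.get("had_pending_work")) or bool(payload.get("was_executing"))


def on_session_end(payload, alerts):
    """Hypothetical observability hook: record force-closed sessions."""
    if was_force_closed(payload):
        # "session_id" is an assumed key; adjust to the real payload shape.
        alerts.append(payload.get("session_id", "<unknown>"))


alerts = []
on_session_end({"session_id": "s1", "had_pending_work": True, "was_executing": False}, alerts)
on_session_end({"session_id": "s2", "had_pending_work": False, "was_executing": False}, alerts)
```

After these two calls, only the first session is recorded in `alerts`.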
Test reconnect under load
Send messages faster than the agent processes them, then drop the WebSocket. On reconnect, the pending messages should resume in their original order.
Related
Gateway
WebSocket control plane overview
Session Persistence
Persistent sessions and event replay
Error Handling
Reconnect and error recovery
Session Protocol
Session message format

