Quick Start
How It Works
| Event | Call Site | When It Fires |
|---|---|---|
| on_pre_compress | chat_mixin.py:_apply_context_management | Context utilization ≥ compact_threshold |
| on_memory_write | agent.py:store_memory | After successful memory storage |
| on_delegation | handoff.py (3 paths) | After subagent task completion |
| on_session_switch | Not yet wired | Reserved for future session rotation |
The Four Hooks
on_pre_compress
Called before context compression discards messages. It fires when context utilization reaches compact_threshold and the agent needs to discard older messages.
Example:
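A minimal sketch of an on_pre_compress implementation, using only the signature listed for this hook (messages: list[dict], returning str). The class name NotesMemory, the assumption that each message carries "role" and "content" keys, and the choice to return a short digest are all illustrative, not part of the library.

```python
class NotesMemory:
    """Hypothetical memory class; only the hook name and signature come from the docs."""

    def __init__(self) -> None:
        self.saved: list[str] = []

    def on_pre_compress(self, messages: list[dict]) -> str:
        # Assumption: messages look like {"role": ..., "content": ...}.
        user_lines = [
            m["content"]
            for m in messages
            if m.get("role") == "user" and isinstance(m.get("content"), str)
        ]
        # Keep a copy of what is about to be discarded.
        self.saved.extend(user_lines)
        # Return a short digest of the most recent user messages.
        return "; ".join(user_lines[-3:])
```

How the agent uses the returned string is not specified here, so treat the digest as one plausible choice.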
on_session_switch
Called when the active session ID changes.
on_memory_write
Called after successful memory storage operations. It fires when Agent.store_memory() successfully writes to built-in memory.
Example:
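A minimal sketch of an on_memory_write implementation that keeps an audit trail of writes. The class name AuditedMemory and the record layout are hypothetical; the four arguments follow the signature listed for this hook, and the meaning of target is treated as opaque.

```python
class AuditedMemory:
    """Hypothetical memory class that records every successful write."""

    def __init__(self) -> None:
        self.audit: list[dict] = []

    def on_memory_write(self, action: str, target: str, content: str, metadata: dict) -> None:
        # Store a copy of the metadata so later mutation by the caller
        # does not change the audit record.
        self.audit.append(
            {
                "action": action,
                "target": target,
                "content": content,
                "metadata": dict(metadata or {}),
            }
        )
```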
on_delegation
Called after a subagent completes a delegated task.
Sync vs Async Selection
The agent automatically detects the execution context. If you’re in an async context and provide both sync and async versions, the async version runs as a fire-and-forget task for better performance.
The New Action Parameter
Agent.store_memory() now accepts an action parameter:
| Action | Requirements | Description |
|---|---|---|
| "add" | Default behavior | Stores new content |
| "replace" | Provider must have replace_<type> or update_<type> method | Updates existing content |
| "remove" | Provider must have remove_<type> or delete_<type> method | Deletes content |
If the provider does not support the requested action, a ValueError is raised.
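The lookup described in the table can be sketched as follows. This is an illustration, not the library's code: resolve_method, NoteProvider, and the "note" memory type are hypothetical, and the sketch assumes that <type> is a memory type name appended to the method prefix.

```python
# Method-name prefixes taken from the table above.
PREFIXES = {
    "replace": ("replace_", "update_"),
    "remove": ("remove_", "delete_"),
}


def resolve_method(provider, action: str, memory_type: str):
    """Hypothetical helper: find the provider method for an action, or raise ValueError."""
    if action not in PREFIXES:
        raise ValueError(f"unsupported action: {action!r}")
    for prefix in PREFIXES[action]:
        method = getattr(provider, prefix + memory_type, None)
        if callable(method):
            return method
    raise ValueError(f"provider has no method for {action!r} on {memory_type!r}")


class NoteProvider:
    """Hypothetical provider that supports updates but not deletion."""

    def update_note(self, content: str) -> str:
        return f"updated: {content}"
```

With this provider, resolve_method(NoteProvider(), "replace", "note") returns update_note, while asking for "remove" raises ValueError.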
Configuration Options
All four hooks are optional with no-op defaults. The protocol uses runtime_checkable so your memory class only needs to implement the hooks you need.
| Hook | Arguments | Return Type | When It Fires |
|---|---|---|---|
| on_pre_compress | messages: list[dict] | str | Context utilization ≥ threshold |
| on_session_switch | new_session_id: str, parent_session_id: str, reset: bool | None | Session ID changes (not yet called) |
| on_memory_write | action: str, target: str, content: str, metadata: dict | None | After store_memory() success |
| on_delegation | task: str, result: str, agent_name: str, metadata: dict | None | After subagent completion |
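One way such a protocol could look, as a sketch under stated assumptions: the name MemoryHooks, the empty-string default for on_pre_compress, and the choice to subclass the protocol explicitly are illustrative. The signatures follow the table above.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class MemoryHooks(Protocol):  # hypothetical name, not the library's class
    # Default bodies are no-ops; explicit subclasses inherit them.
    def on_pre_compress(self, messages: list[dict]) -> str:
        return ""

    def on_session_switch(self, new_session_id: str, parent_session_id: str, reset: bool) -> None:
        return None

    def on_memory_write(self, action: str, target: str, content: str, metadata: dict) -> None:
        return None

    def on_delegation(self, task: str, result: str, agent_name: str, metadata: dict) -> None:
        return None


class DelegationLog(MemoryHooks):
    """Overrides one hook and inherits the other three defaults."""

    def __init__(self) -> None:
        self.entries: list[tuple[str, str]] = []

    def on_delegation(self, task: str, result: str, agent_name: str, metadata: dict) -> None:
        self.entries.append((agent_name, task))
```

Because DelegationLog inherits the remaining methods, isinstance(DelegationLog(), MemoryHooks) holds. A class that does not subclass the protocol passes the same check only if it defines all four methods itself.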
Common Patterns
Mirror writes to external vector store:
Best Practices
Keep hooks fast
Hooks run on the agent thread and can block execution. Keep operations lightweight or delegate to background workers for heavy processing like vector embeddings or API calls.
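A minimal sketch of delegating heavy work to a background worker, so the hook itself only enqueues a job. BackgroundIndexer, its _index placeholder, and flush are hypothetical names; the slow step (for example, computing embeddings) is represented by a simple list append.

```python
import logging
import queue
import threading

logger = logging.getLogger(__name__)


class BackgroundIndexer:
    """Hypothetical memory class that defers slow work to a worker thread."""

    def __init__(self) -> None:
        self._jobs: queue.Queue = queue.Queue()
        self.indexed: list[tuple[str, str, str]] = []
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def on_memory_write(self, action: str, target: str, content: str, metadata: dict) -> None:
        # Only enqueue here, so the hook returns immediately.
        self._jobs.put((action, target, content))

    def _run(self) -> None:
        while True:
            job = self._jobs.get()
            try:
                self._index(job)
            except Exception:
                # Keep the worker alive if one job fails.
                logger.warning("indexing failed", exc_info=True)
            finally:
                self._jobs.task_done()

    def _index(self, job: tuple[str, str, str]) -> None:
        # Placeholder for slow work such as embedding or an API call.
        self.indexed.append(job)

    def flush(self) -> None:
        # Block until every queued job has been processed.
        self._jobs.join()
```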
Never raise from hooks
Hook failures are caught and logged at warning level, and a hook should never break the agent loop. Always wrap your hook logic in try/except and log errors appropriately.
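One way to apply this consistently is a small decorator that wraps each hook. This is a sketch, not part of the library: safe_hook, the logger name, and FlakyMemory are hypothetical.

```python
import functools
import logging

logger = logging.getLogger("memory_hooks")


def safe_hook(func):
    """Hypothetical decorator: log any exception from a hook and return None."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            logger.warning("hook %s failed", func.__name__, exc_info=True)
            return None

    return wrapper


class FlakyMemory:
    @safe_hook
    def on_delegation(self, task: str, result: str, agent_name: str, metadata: dict) -> None:
        # Simulates a backend outage; the decorator swallows the error.
        raise RuntimeError("backend unavailable")
```

Returning None suits the hooks whose return type is None. For on_pre_compress, which returns a str, a wrapper like this would need to return an empty string instead.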
Use async variants for I/O
If your memory provider does network I/O (database calls, API requests), implement the aon_* async variants. The agent will automatically schedule them as fire-and-forget tasks in async contexts.
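A minimal sketch of a memory class that provides both variants, plus a stand-in dispatcher showing how the selection could work. The name aon_memory_write follows the aon_* pattern; AsyncMemory and fire_memory_write are hypothetical and do not reproduce the agent's actual scheduling code.

```python
import asyncio


class AsyncMemory:
    """Hypothetical memory class with sync and async versions of one hook."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, str]] = []

    def on_memory_write(self, action: str, target: str, content: str, metadata: dict) -> None:
        self.calls.append(("sync", action))

    async def aon_memory_write(self, action: str, target: str, content: str, metadata: dict) -> None:
        await asyncio.sleep(0)  # stand-in for network I/O
        self.calls.append(("async", action))


def fire_memory_write(memory, *args):
    """Hypothetical dispatcher: prefer the async hook when an event loop is running."""
    async_hook = getattr(memory, "aon_memory_write", None)
    try:
        loop = asyncio.get_running_loop()
    except RuntimeError:
        loop = None  # no running loop, so we are in a sync context
    if loop is not None and async_hook is not None:
        # The caller should keep a reference to the task so it is not
        # garbage-collected before it finishes.
        return loop.create_task(async_hook(*args))
    memory.on_memory_write(*args)
    return None
```

Called from plain synchronous code, fire_memory_write runs the sync hook and returns None. Called inside a running event loop, it returns a task wrapping the async hook.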
Implement only what you need
All hooks are optional. Only implement the lifecycle events that matter for your use case. Empty implementations have zero performance impact.
Related
Memory
Core memory concepts and storage types
Advanced Memory
Custom backends and advanced patterns
Context Compression
How context compression triggers hooks
Handoffs
Agent delegation and subagent patterns

