Skip to main content

Context Optimizer

The Context Optimizer provides multiple strategies for reducing context size when approaching model limits, preventing overflow errors and reducing costs.

Agent-Centric Quick Start

from praisonaiagents import Agent
from praisonaiagents import ManagerConfig

# Configure optimization strategy via context param
agent = Agent(
    instructions="You are helpful.",
    context=ManagerConfig(
        auto_compact=True,
        compact_threshold=0.8,  # Trigger at 80% usage
        strategy="smart",  # truncate, sliding_window, prune_tools, summarize, smart
    ),
)

# Optimization happens automatically when threshold is reached
response = agent.chat("Hello!")

Optimization Strategies

Smart Strategy Flow (Default)

Low-Level API

from praisonaiagents import get_optimizer, OptimizerStrategy

# Get an optimizer
optimizer = get_optimizer(OptimizerStrategy.SMART)

# Optimize messages to target token count
messages = [...]  # Your conversation history
optimized, stats = optimizer.optimize(messages, target_tokens=50000)

print(f"Reduced from {len(messages)} to {len(optimized)} messages")
print(f"Saved {stats.tokens_saved} tokens")

Strategy Reference

Truncate

Removes oldest messages first, preserving system prompt and recent context.
from praisonaiagents import TruncateOptimizer

optimizer = TruncateOptimizer()
result, stats = optimizer.optimize(messages, target_tokens=10000)
Best for: Simple cases where old context is not important.

Sliding Window

Keeps the N most recent messages within a token window.
from praisonaiagents import SlidingWindowOptimizer

optimizer = SlidingWindowOptimizer()
result, stats = optimizer.optimize(messages, target_tokens=10000)
Best for: Conversations where recent context matters most.

Prune Tools

Truncates old tool outputs while preserving recent ones.
from praisonaiagents import PruneToolsOptimizer

optimizer = PruneToolsOptimizer(
    protect_recent=5,  # Keep last 5 tool outputs intact
    max_output_tokens=500,  # Truncate older outputs to 500 tokens
)
result, stats = optimizer.optimize(messages, target_tokens=10000)
Best for: Tool-heavy conversations with large outputs.

Summarize

Uses LLM to create a summary of older conversation.
from praisonaiagents import SummarizeOptimizer

optimizer = SummarizeOptimizer(
    keep_recent=4,  # Keep last 4 turns intact
    model="gpt-4o-mini",
)
result, stats = optimizer.optimize(messages, target_tokens=10000)
Best for: Long conversations where context continuity matters.

Conversation

Conversation-aware compaction that preserves topic, goals, decisions, and action items across long sessions.
from praisonaiagents import Agent, ManagerConfig

agent = Agent(
    name="Planner",
    instructions="Help plan products over multi-hour sessions.",
    context=ManagerConfig(
        auto_compact=True,
        strategy="conversation",
        conversation_compaction=True,
        conversation_analyzer_strategy="hybrid",
        conversation_min_compaction_ratio=0.3,
    ),
)
Low-level class usage:
from praisonaiagents import ConversationOptimizer

optimizer = ConversationOptimizer(
    analyzer_strategy="hybrid",
    preserve_recent=5,
    min_compaction_ratio=0.3,
)
result, stats = optimizer.optimize(messages, target_tokens=4000)
Configuration Options:
OptionTypeDefaultDescription
llm_analyze_fnOptional[callable]NoneLLM function for conversation analysis. If None, falls back to rule-based analysis.
min_compaction_ratiofloat0.3Minimum compression ratio to attempt compaction. Below this, falls back to SmartOptimizer.
analyzer_strategystr"hybrid"One of "hybrid", "rule_based", "keyword".
preserve_recentint5Number of recent messages to keep intact.
llm_summarize_fnOptional[callable]NoneLLM function for summarization.
ConversationOptimizer automatically falls back to SmartOptimizer when the compaction ratio is not meaningful (target_tokens / original_tokens > (1 - min_compaction_ratio)) or when internal errors occur during compaction. This ensures safe operation while preserving advanced conversation analysis when beneficial.
Best for: Multi-hour planning, iterative development, long support sessions where topic/goal continuity matters more than literal message history. See Intelligent Conversation Compaction for the full conceptual deep-dive.

LLM Context Compressor

Advanced LLM-driven compression with session lineage and head/tail protection.
from praisonaiagents.context.optimizer import LLMContextCompressorOptimizer

# Must be instantiated directly (not in OPTIMIZER_REGISTRY)
optimizer = LLMContextCompressorOptimizer(
    llm_client=agent.llm,
    auxiliary_model="gpt-4o-mini",
    protect_last_n_tokens=20_000,
    summary_target_tokens=750,
    enable_session_tracking=True,
)
result, stats = optimizer.optimize(messages, target_tokens=10000)
Best for: Long conversations requiring intelligent summarization with audit trails.
The LLMContextCompressorOptimizer is exposed as LLM_CONTEXT_COMPRESSOR_OPTIMIZER but is not in OPTIMIZER_REGISTRY — users must instantiate it directly with an llm_client.
See LLM Context Compression for detailed usage.

Non-Destructive

Tags messages for exclusion without deleting them (enables undo).
from praisonaiagents import NonDestructiveOptimizer

optimizer = NonDestructiveOptimizer()
result, stats = optimizer.optimize(messages, target_tokens=10000)

# Messages are tagged with 'excluded': True
# Use get_effective_history() to filter
Best for: When you need to preserve full history for audit/undo. Combines all strategies intelligently based on content analysis.
from praisonaiagents import SmartOptimizer

optimizer = SmartOptimizer()
result, stats = optimizer.optimize(messages, target_tokens=10000)
Order of operations (Smart Strategy):
  1. Summarize tool outputs - Uses LLM to intelligently summarize large tool outputs (preserves key info)
  2. Truncate tool outputs - Fallback truncation for remaining large outputs
  3. Sliding window - Remove oldest messages
  4. Summarize conversation - LLM summary of older conversation if still over limit
Tool output summarization uses LLM to preserve key information instead of blindly truncating. This is enabled by default when llm_summarize=True.

Factory Function

from praisonaiagents import get_optimizer, OptimizerStrategy

# Available strategies
strategies = [
    OptimizerStrategy.TRUNCATE,
    OptimizerStrategy.SLIDING_WINDOW,
    OptimizerStrategy.PRUNE_TOOLS,
    OptimizerStrategy.SUMMARIZE,
    OptimizerStrategy.NON_DESTRUCTIVE,
    OptimizerStrategy.SMART,
    OptimizerStrategy.CONVERSATION,
]

for strategy in strategies:
    optimizer = get_optimizer(strategy)
    result, stats = optimizer.optimize(messages, target_tokens=10000)
    print(f"{strategy.value}: {len(result)} messages")

Optimization Result

from praisonaiagents import OptimizationResult

# stats returned from optimize()
stats: OptimizationResult
print(f"Original tokens: {stats.original_tokens}")
print(f"Optimized tokens: {stats.optimized_tokens}")
print(f"Tokens saved: {stats.tokens_saved}")
print(f"Strategy used: {stats.strategy_used}")
print(f"Messages removed: {stats.messages_removed}")
print(f"Tool outputs pruned: {stats.tool_outputs_pruned}")
print(f"Tool outputs summarized: {stats.tool_outputs_summarized}")
print(f"Tokens saved by summarization: {stats.tokens_saved_by_summarization}")
print(f"Tokens saved by truncation: {stats.tokens_saved_by_truncation}")

Tool Call Preservation

The optimizer preserves tool_call/tool_result pairs to maintain API validity:
# These pairs are kept together or removed together
{"role": "assistant", "tool_calls": [{"id": "call_123", ...}]}
{"role": "tool", "tool_call_id": "call_123", "content": "..."}

CLI Usage

# Set optimization strategy
praisonai chat --context-strategy smart

# Set trigger threshold
praisonai chat --context-threshold 0.8

# Manual optimization in session
/context compact

Configuration

# config.yaml
context:
  auto_compact: true
  compact_threshold: 0.8
  strategy: smart

Next Steps