Reduce context size automatically before models hit token limits — preventing overflow errors and cutting costs.
from praisonaiagents import Agent, ManagerConfig

agent = Agent(
    instructions="You are helpful.",
    context=ManagerConfig(
        auto_compact=True,
        compact_threshold=0.8,
        strategy="smart",
    ),
)
response = agent.chat("Hello!")

Which Strategy Should I Pick?

Quick Start

1. Enable auto-compaction on an agent

from praisonaiagents import Agent, ManagerConfig

agent = Agent(
    instructions="You are helpful.",
    context=ManagerConfig(
        auto_compact=True,
        compact_threshold=0.8,
        strategy="smart",
    ),
)
agent.chat("Hello!")

2. Pick a strategy explicitly

from praisonaiagents import Agent, ManagerConfig

agent = Agent(
    context=ManagerConfig(strategy="truncate", compact_threshold=0.7),
)

3. Optimise manually with the low-level API

from praisonaiagents import get_optimizer, OptimizerStrategy

messages = [...]  # Your conversation history
optimizer = get_optimizer(OptimizerStrategy.SMART)
optimized, stats = optimizer.optimize(messages, target_tokens=50000)
print(f"Saved {stats.tokens_saved} tokens")

Optimisation Strategies

Smart Strategy Flow (Default)

Low-Level API

from praisonaiagents import get_optimizer, OptimizerStrategy

# Get an optimizer
optimizer = get_optimizer(OptimizerStrategy.SMART)

# Optimize messages to target token count
messages = [...]  # Your conversation history
optimized, stats = optimizer.optimize(messages, target_tokens=50000)

print(f"Reduced from {len(messages)} to {len(optimized)} messages")
print(f"Saved {stats.tokens_saved} tokens")

Strategy Reference

Truncate

Removes oldest messages first, preserving system prompt and recent context.
from praisonaiagents import TruncateOptimizer

optimizer = TruncateOptimizer()
result, stats = optimizer.optimize(messages, target_tokens=10000)
Best for: Simple cases where old context is not important.
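
To make the idea concrete, here is a minimal standalone sketch of oldest-first truncation. It is not the TruncateOptimizer implementation: the function names and the 4-characters-per-token estimate are assumptions made for illustration only.

```python
from typing import Dict, List

Message = Dict[str, str]


def estimate_tokens(message: Message) -> int:
    """Rough heuristic (assumption): about 4 characters per token."""
    return max(1, len(message.get("content", "")) // 4)


def truncate_oldest(messages: List[Message], target_tokens: int) -> List[Message]:
    """Drop the oldest non-system messages until the estimate fits the budget."""
    system = [m for m in messages if m.get("role") == "system"]
    rest = [m for m in messages if m.get("role") != "system"]
    total = sum(estimate_tokens(m) for m in system + rest)
    while rest and total > target_tokens:
        total -= estimate_tokens(rest.pop(0))  # remove the oldest message first
    return system + rest


history = [
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "a" * 400},
    {"role": "assistant", "content": "b" * 400},
    {"role": "user", "content": "c" * 40},
]
trimmed = truncate_oldest(history, target_tokens=120)
print([m["role"] for m in trimmed])
```

The system prompt is never a removal candidate, so only the first user message is dropped in this example.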

Sliding Window

Keeps the N most recent messages within a token window.
from praisonaiagents import SlidingWindowOptimizer

optimizer = SlidingWindowOptimizer()
result, stats = optimizer.optimize(messages, target_tokens=10000)
Best for: Conversations where recent context matters most.
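
The windowing idea can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the SlidingWindowOptimizer source: max_messages, the helper names, and the character-based token estimate are all hypothetical.

```python
from typing import Dict, List

Message = Dict[str, str]


def estimate_tokens(message: Message) -> int:
    """Rough heuristic (assumption): about 4 characters per token."""
    return max(1, len(message.get("content", "")) // 4)


def sliding_window(
    messages: List[Message], max_messages: int, target_tokens: int
) -> List[Message]:
    """Keep system messages plus the newest messages that fit both limits."""
    system = [m for m in messages if m.get("role") == "system"]
    budget = target_tokens - sum(estimate_tokens(m) for m in system)
    window: List[Message] = []
    # Walk backwards from the newest message and stop at the first limit hit.
    for message in reversed([m for m in messages if m.get("role") != "system"]):
        cost = estimate_tokens(message)
        if len(window) >= max_messages or cost > budget:
            break
        window.append(message)
        budget -= cost
    return system + window[::-1]  # restore chronological order


history = [{"role": "system", "content": "You are helpful."}] + [
    {"role": "user", "content": f"message {i} " + "x" * 70} for i in range(5)
]
recent = sliding_window(history, max_messages=3, target_tokens=50)
print([m["content"][:9] for m in recent])
```

Here the token budget is the binding limit: only two of the three allowed messages fit next to the system prompt.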

Prune Tools

Truncates old tool outputs while preserving recent ones.
from praisonaiagents import PruneToolsOptimizer

optimizer = PruneToolsOptimizer(
    protect_recent=5,  # Keep last 5 tool outputs intact
    max_output_tokens=500,  # Truncate older outputs to 500 tokens
)
result, stats = optimizer.optimize(messages, target_tokens=10000)
Best for: Tool-heavy conversations with large outputs.
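
As a rough illustration of what pruning means here, the sketch below shortens every tool output except the most recent ones. It is not the PruneToolsOptimizer implementation: it measures length in characters rather than tokens, and the function name, max_output_chars, and the truncation marker are assumptions.

```python
from typing import Any, Dict, List

Message = Dict[str, Any]


def prune_tool_outputs(
    messages: List[Message], protect_recent: int = 5, max_output_chars: int = 2000
) -> List[Message]:
    """Shorten older tool outputs; leave the newest protect_recent untouched."""
    tool_indices = [i for i, m in enumerate(messages) if m.get("role") == "tool"]
    # Guard against protect_recent == 0, where [-0:] would select everything.
    protected = set(tool_indices[-protect_recent:]) if protect_recent > 0 else set()
    pruned: List[Message] = []
    for i, message in enumerate(messages):
        content = message.get("content", "")
        is_old_tool = message.get("role") == "tool" and i not in protected
        if is_old_tool and len(content) > max_output_chars:
            # Copy the message so the caller's history is not mutated.
            message = {**message, "content": content[:max_output_chars] + " ...[truncated]"}
        pruned.append(message)
    return pruned


history = [
    {"role": "tool", "tool_call_id": "call_1", "content": "a" * 100},
    {"role": "tool", "tool_call_id": "call_2", "content": "b" * 100},
    {"role": "tool", "tool_call_id": "call_3", "content": "c" * 100},
]
pruned = prune_tool_outputs(history, protect_recent=1, max_output_chars=10)
print([len(m["content"]) for m in pruned])
```

Messages are shortened rather than removed, so every tool_call_id still has a matching result after pruning.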

Summarize

Uses an LLM to summarize the older part of the conversation.
from praisonaiagents import SummarizeOptimizer

optimizer = SummarizeOptimizer(
    keep_recent=4,  # Keep last 4 turns intact
    model="gpt-4o-mini",
)
result, stats = optimizer.optimize(messages, target_tokens=10000)
Best for: Long conversations where context continuity matters.

Conversation

Conversation-aware compaction that preserves topic, goals, decisions, and action items across long sessions.
from praisonaiagents import Agent, ManagerConfig

agent = Agent(
    name="Planner",
    instructions="Help plan products over multi-hour sessions.",
    context=ManagerConfig(
        auto_compact=True,
        strategy="conversation",
        conversation_compaction=True,
        conversation_analyzer_strategy="hybrid",
        conversation_min_compaction_ratio=0.3,
    ),
)
Low-level class usage:
from praisonaiagents import ConversationOptimizer

optimizer = ConversationOptimizer(
    analyzer_strategy="hybrid",
    preserve_recent=5,
    min_compaction_ratio=0.3,
)
result, stats = optimizer.optimize(messages, target_tokens=4000)
Configuration Options:
Option | Type | Default | Description
llm_analyze_fn | Optional[callable] | None | LLM function for conversation analysis. If None, falls back to rule-based analysis.
min_compaction_ratio | float | 0.3 | Minimum compression ratio to attempt compaction. Below this, falls back to SmartOptimizer.
analyzer_strategy | str | "hybrid" | One of "hybrid", "rule_based", "keyword".
preserve_recent | int | 5 | Number of recent messages to keep intact.
llm_summarize_fn | Optional[callable] | None | LLM function for summarization.
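ConversationOptimizer automatically falls back to SmartOptimizer in two cases: when the requested reduction is too small to be worthwhile, that is, when target_tokens / original_tokens > (1 - min_compaction_ratio), or when an internal error occurs during compaction. With the default min_compaction_ratio of 0.3, the fallback applies whenever the target is more than 70% of the original token count. This keeps compaction safe while still applying conversation analysis when it pays off.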
Best for: Multi-hour planning, iterative development, long support sessions where topic/goal continuity matters more than literal message history. See Intelligent Conversation Compaction for the full conceptual deep-dive.
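
The fallback condition above is plain arithmetic, and a small worked example may help. The function below only restates the documented inequality; its name and the guard for a non-positive original_tokens are assumptions, not part of the library.

```python
def falls_back_to_smart(
    original_tokens: int, target_tokens: int, min_compaction_ratio: float = 0.3
) -> bool:
    """Return True when the requested reduction is below the minimum ratio."""
    if original_tokens <= 0:
        # Assumption: nothing to compact, so treat it as a fallback case.
        return True
    return target_tokens / original_tokens > (1 - min_compaction_ratio)


# 8,000 / 10,000 = 0.8, which exceeds 0.7: only a 20% cut is requested.
print(falls_back_to_smart(10_000, 8_000))

# 4,000 / 10,000 = 0.4, which does not exceed 0.7: a 60% cut is requested.
print(falls_back_to_smart(10_000, 4_000))
```

The first call reports a fallback and the second does not, because only the second asks for a reduction of at least 30%.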

LLM Context Compressor

Advanced LLM-driven compression with session lineage and head/tail protection.
from praisonaiagents import LLMContextCompressorOptimizer

optimizer = LLMContextCompressorOptimizer(
    llm_client=agent.llm,
    auxiliary_model="gpt-4o-mini",
    protect_last_n_tokens=20_000,
    summary_target_tokens=750,
    enable_session_tracking=True,
)
result, stats = optimizer.optimize(messages, target_tokens=10000)
Best for: Long conversations requiring intelligent summarization with audit trails.
LLMContextCompressorOptimizer is exposed as LLM_CONTEXT_COMPRESSOR_OPTIMIZER but is not in OPTIMIZER_REGISTRY, so it must be instantiated directly with an llm_client.
See LLM Context Compression for detailed usage.

Non-Destructive

Tags messages for exclusion without deleting them (enables undo).
from praisonaiagents import NonDestructiveOptimizer

optimizer = NonDestructiveOptimizer()
result, stats = optimizer.optimize(messages, target_tokens=10000)

# Messages are tagged with 'excluded': True
# Use get_effective_history() to filter
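
The tagging approach can be illustrated without the library. The sketch below marks messages with 'excluded': True instead of deleting them and shows how the tag is filtered or undone. All function names here are hypothetical stand-ins (including effective_history, which only mimics the role of get_effective_history()), and the token estimate is an assumed heuristic.

```python
from typing import Any, Dict, List

Message = Dict[str, Any]


def estimate_tokens(message: Message) -> int:
    """Rough heuristic (assumption): about 4 characters per token."""
    return max(1, len(message.get("content", "")) // 4)


def tag_for_exclusion(messages: List[Message], target_tokens: int) -> List[Message]:
    """Mark the oldest non-system messages as excluded until the rest fits."""
    tagged = [dict(m) for m in messages]  # copy so the input stays untouched
    total = sum(estimate_tokens(m) for m in tagged)
    for message in tagged:
        if total <= target_tokens:
            break
        if message.get("role") == "system":
            continue
        message["excluded"] = True
        total -= estimate_tokens(message)
    return tagged


def effective_history(messages: List[Message]) -> List[Message]:
    """Return only the messages that would be sent to the model."""
    return [m for m in messages if not m.get("excluded")]


def undo(messages: List[Message]) -> List[Message]:
    """Clear every exclusion tag, restoring the full history."""
    return [{k: v for k, v in m.items() if k != "excluded"} for m in messages]


history = [
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "a" * 400},
    {"role": "assistant", "content": "b" * 400},
    {"role": "user", "content": "c" * 40},
]
tagged = tag_for_exclusion(history, target_tokens=120)
print(len(tagged), len(effective_history(tagged)), len(effective_history(undo(tagged))))
```

Nothing is deleted: the tagged list keeps all four messages, and clearing the tags brings the hidden one back.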
Best for: When you need to preserve full history for audit/undo.

Smart

Combines all strategies intelligently based on content analysis.
from praisonaiagents import SmartOptimizer

optimizer = SmartOptimizer()
result, stats = optimizer.optimize(messages, target_tokens=10000)
Order of operations (Smart Strategy):
  1. Summarize tool outputs - Uses LLM to intelligently summarize large tool outputs (preserves key info)
  2. Truncate tool outputs - Fallback truncation for remaining large outputs
  3. Sliding window - Remove oldest messages
  4. Summarize conversation - LLM summary of older conversation if still over limit
Tool output summarization uses LLM to preserve key information instead of blindly truncating. This is enabled by default when llm_summarize=True.
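
The ordering above amounts to a staged pipeline that stops as soon as the history fits. The sketch below shows that control flow with the two deterministic stages only (steps 2 and 3). It is an illustration, not the SmartOptimizer source: the stage functions, run_pipeline, and the token estimate are assumptions, and the LLM-backed steps 1 and 4 would slot in as further stages.

```python
from typing import Callable, Dict, List, Tuple

Message = Dict[str, str]
Stage = Callable[[List[Message], int], List[Message]]


def estimate_tokens(message: Message) -> int:
    """Rough heuristic (assumption): about 4 characters per token."""
    return max(1, len(message.get("content", "")) // 4)


def total_tokens(messages: List[Message]) -> int:
    return sum(estimate_tokens(m) for m in messages)


def truncate_tool_outputs(messages: List[Message], target_tokens: int) -> List[Message]:
    """Cap every tool output at 200 characters (arbitrary illustrative limit)."""
    return [
        {**m, "content": m["content"][:200]} if m.get("role") == "tool" else m
        for m in messages
    ]


def sliding_window(messages: List[Message], target_tokens: int) -> List[Message]:
    """Remove the oldest non-system messages until the estimate fits."""
    kept = list(messages)
    while total_tokens(kept) > target_tokens:
        oldest = next((i for i, m in enumerate(kept) if m.get("role") != "system"), None)
        if oldest is None:
            break
        del kept[oldest]
    return kept


def run_pipeline(
    messages: List[Message], target_tokens: int, stages: List[Tuple[str, Stage]]
) -> Tuple[List[Message], List[str]]:
    """Apply stages in order and stop once the history is within budget."""
    applied: List[str] = []
    for name, stage in stages:
        if total_tokens(messages) <= target_tokens:
            break
        messages = stage(messages, target_tokens)
        applied.append(name)
    return messages, applied


stages = [("truncate_tool_outputs", truncate_tool_outputs), ("sliding_window", sliding_window)]
history = [
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "q" * 40},
    {"role": "tool", "content": "t" * 4000},
    {"role": "assistant", "content": "a" * 40},
]
result, applied = run_pipeline(history, target_tokens=100, stages=stages)
print(applied, total_tokens(result))
```

With a budget of 100 the cheaper stage is enough, so the sliding window never runs; a tighter budget would trigger both stages.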

Factory Function

from praisonaiagents import get_optimizer, OptimizerStrategy

# Available strategies
strategies = [
    OptimizerStrategy.TRUNCATE,
    OptimizerStrategy.SLIDING_WINDOW,
    OptimizerStrategy.PRUNE_TOOLS,
    OptimizerStrategy.SUMMARIZE,
    OptimizerStrategy.NON_DESTRUCTIVE,
    OptimizerStrategy.SMART,
    OptimizerStrategy.CONVERSATION,
]

for strategy in strategies:
    optimizer = get_optimizer(strategy)
    result, stats = optimizer.optimize(messages, target_tokens=10000)
    print(f"{strategy.value}: {len(result)} messages")

Optimization Result

from praisonaiagents import OptimizationResult

# stats returned from optimize()
stats: OptimizationResult
print(f"Original tokens: {stats.original_tokens}")
print(f"Optimized tokens: {stats.optimized_tokens}")
print(f"Tokens saved: {stats.tokens_saved}")
print(f"Strategy used: {stats.strategy_used}")
print(f"Messages removed: {stats.messages_removed}")
print(f"Tool outputs pruned: {stats.tool_outputs_pruned}")
print(f"Tool outputs summarized: {stats.tool_outputs_summarized}")
print(f"Tokens saved by summarization: {stats.tokens_saved_by_summarization}")
print(f"Tokens saved by truncation: {stats.tokens_saved_by_truncation}")

Tool Call Preservation

The optimizer preserves tool_call/tool_result pairs to maintain API validity:
# These pairs are kept together or removed together
{"role": "assistant", "tool_calls": [{"id": "call_123", ...}]}
{"role": "tool", "tool_call_id": "call_123", "content": "..."}
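
One way to keep such pairs together is to group each assistant message with the tool results that answer it, then remove whole groups. The sketch below illustrates that idea; it is an assumption about how pairing could work, not the library's internal logic, and both function names are hypothetical.

```python
from typing import Any, Dict, List

Message = Dict[str, Any]


def group_tool_exchanges(messages: List[Message]) -> List[List[Message]]:
    """Group each assistant tool call with the tool results that answer it."""
    groups: List[List[Message]] = []
    group_of_call: Dict[str, int] = {}  # tool_call_id -> index into groups
    for message in messages:
        call_id = message.get("tool_call_id")
        if message.get("role") == "tool" and call_id in group_of_call:
            groups[group_of_call[call_id]].append(message)
            continue
        groups.append([message])
        for call in message.get("tool_calls") or []:
            group_of_call[call["id"]] = len(groups) - 1
    return groups


def keep_latest_groups(messages: List[Message], keep_groups: int) -> List[Message]:
    """Drop the oldest groups whole, so no call or result is left orphaned."""
    groups = group_tool_exchanges(messages)
    kept = groups[-keep_groups:] if keep_groups > 0 else []
    return [message for group in kept for message in group]


history = [
    {"role": "user", "content": "What is the weather?"},
    {"role": "assistant", "content": "", "tool_calls": [{"id": "call_123"}]},
    {"role": "tool", "tool_call_id": "call_123", "content": "Sunny"},
    {"role": "assistant", "content": "It is sunny."},
]
print([m["role"] for m in keep_latest_groups(history, keep_groups=2)])
print([m["role"] for m in keep_latest_groups(history, keep_groups=1)])
```

Because removal happens per group, the tool result for call_123 is either kept with its assistant call or dropped with it.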

CLI Usage

# Set optimization strategy
praisonai chat --context-strategy smart

# Set trigger threshold
praisonai chat --context-threshold 0.8

# Manual optimization in session
/context compact

Configuration

# config.yaml
context:
  auto_compact: true
  compact_threshold: 0.8
  strategy: smart

Best Practices

The smart strategy combines tool-output summarisation, tool pruning, a sliding window, and conversation summarisation. Use it unless you have a specific reason not to.
Set compact_threshold to 0.7–0.8 so optimisation runs before the model hard-fails on context overflow.
Multi-hour planning or support threads benefit from the conversation strategy, which preserves topics, goals, and decisions.
All strategies keep tool_calls and matching tool results together so API message history stays valid.
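
To see what a 0.7–0.8 threshold means in practice, the sketch below checks context usage against a model limit. The function name, the >= comparison, and the 128,000-token limit are assumptions for illustration; the library's actual trigger logic may differ.

```python
def should_compact(
    used_tokens: int, context_limit: int, compact_threshold: float = 0.8
) -> bool:
    """Return True once usage reaches the threshold share of the context limit."""
    return used_tokens / context_limit >= compact_threshold


CONTEXT_LIMIT = 128_000  # hypothetical model context window

# 100,000 / 128,000 = 0.78125, still below the 0.8 threshold.
print(should_compact(100_000, CONTEXT_LIMIT))

# 105,000 / 128,000 is about 0.82, so compaction would be triggered.
print(should_compact(105_000, CONTEXT_LIMIT))
```

A lower threshold such as 0.7 triggers earlier, leaving more headroom for a long reply or a large tool output.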

Context Monitor: Watch optimisation in action
Context Budgeter: Set token budgets per session
Optimizer CLI: CLI flags and interactive commands
LLM Context Compression: Advanced LLM-driven compression