AI Agents with Context
PraisonAI provides industry-leading context management with smart defaults, lazy loading, and 6 optimization strategies.

| Feature | PraisonAI | LangChain | CrewAI | Agno |
|---|---|---|---|---|
| Smart Defaults | ✅ | ❌ | ❌ | ❌ |
| Lazy Loading (0ms) | ✅ | ❌ | ❌ | ❌ |
| 6 Strategies | ✅ | ❌ | ❌ | ❌ |
| Per-Tool Budgets | ✅ | ❌ | ❌ | ❌ |
| Session Deduplication | ✅ | ❌ | ❌ | ⚠️ |
| LLM Summarization | ✅ | ⚠️ | ❌ | ❌ |
Quick Start
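A minimal example to get started, assuming the `praisonaiagents` package; `context=True` is the flag described throughout this page:

```python
from praisonaiagents import Agent

# context=True enables context management with smart defaults.
# It costs a single boolean assignment at creation time (0ms);
# the ContextManager is only instantiated on first access.
agent = Agent(
    instructions="You are a helpful research assistant.",
    context=True,
)

agent.start("Summarize the key points of this repository.")
```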
What is Context?
Context is everything sent to the LLM in a single API call. It includes:

- System Prompt: agent instructions, role, and goals (~2K tokens)
- Chat History: user/assistant messages (variable)
- Tool Schemas: function definitions (~2K tokens)
- Tool Outputs: results from tool calls (~20K tokens)
- Memory/RAG: retrieved context (~4K tokens)
- Output Reserve: space for the LLM response (~8-16K tokens)
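As a back-of-the-envelope check, here is how those rough figures add up against a 128K window (the numbers are the approximations above, not measurements):

```python
# Approximate per-call budget for a 128K-context model,
# using the rough segment sizes listed above.
system_prompt  = 2_000
tool_schemas   = 2_000
tool_outputs   = 20_000
memory_rag     = 4_000
output_reserve = 16_000

context_window = 128_000
fixed = system_prompt + tool_schemas + tool_outputs + memory_rag + output_reserve
print(context_window - fixed)  # 84000 tokens left for chat history
```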
How Context Flows
Single Agent Flow
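Roughly, a single agent assembles one API call from the segments above; a sketch (the message layout is illustrative, not PraisonAI's internal format):

```python
# Illustrative assembly of a single LLM call from the context segments.
def build_call(system_prompt, memory, history, tool_schemas):
    messages = [{"role": "system", "content": system_prompt + "\n" + memory}]
    messages += history  # prior user/assistant turns, possibly compacted
    return {"messages": messages, "tools": tool_schemas}
```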
Multi-Agent Flow
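In a multi-agent flow, each agent's output feeds the next agent's context, filtered by the sharing policy described under Multi-Agent Policies below. Continuing the sketch above, a hypothetical hand-off:

```python
researcher = Agent(instructions="Research the topic.", context=True)
writer = Agent(instructions="Write a report.", context=True)

# Agent B receives agent A's output; the sharing policy (NONE/SUMMARY/FULL)
# controls how much additional context travels with it.
notes = researcher.start("Collect facts about context windows.")
report = writer.start(f"Write a short report from these notes:\n{notes}")
```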
Optimization Strategies
When context usage exceeds the threshold (80% by default), the optimizer kicks in:

| Strategy | How It Works | Best For |
|---|---|---|
| `truncate` | Remove oldest messages | Simple chatbots |
| `sliding_window` | Keep the N most recent messages | Long conversations |
| `prune_tools` | Truncate old tool outputs | Tool-heavy agents |
| `summarize` | LLM summarizes old context | Critical context |
| `smart` | Combines all strategies | Production use |
| `non_destructive` | Tag for exclusion (undoable) | Audit trails |
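A sketch of selecting a strategy via `ManagerConfig` (documented in the Configuration Reference below); the import path and the way the config attaches to an agent are assumptions:

```python
from praisonaiagents.context import ManagerConfig  # assumed import path

config = ManagerConfig(
    auto_compact=True,       # optimize automatically at the threshold
    compact_threshold=0.8,   # kick in at 80% usage (the default)
    strategy="smart",        # combine all strategies; recommended in production
)

agent = Agent(instructions="...", context=config)  # assumed attachment point
```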
Smart Strategy Flow
Overflow Handling
| Level | Usage | Action |
|---|---|---|
| Normal | < 70% | No action |
| Warning | 70-80% | Monitor |
| Critical | 80-90% | Auto-compact triggers |
| Emergency | 90-95% | Aggressive optimization |
| Overflow | > 95% | Emergency truncation |
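The escalation ladder above, expressed as a lookup for reference:

```python
def overflow_level(usage: float) -> str:
    """Map context usage (0.0-1.0) to the levels in the table above."""
    if usage < 0.70:
        return "normal"     # no action
    if usage < 0.80:
        return "warning"    # monitor
    if usage < 0.90:
        return "critical"   # auto-compact triggers
    if usage < 0.95:
        return "emergency"  # aggressive optimization
    return "overflow"       # emergency truncation
```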
Token Budgeting
The Context Budgeter allocates tokens across segments:

Model Limits
| Model | Context | Output Reserve |
|---|---|---|
| gpt-4o | 128K | 16K |
| gpt-4o-mini | 128K | 16K |
| claude-3-opus | 200K | 8K |
| gemini-1.5-pro | 2M | 8K |
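The budgeting rule is simple: reserve output space first, then let the input segments share the rest. A sketch using the limits above (not the Budgeter's actual API):

```python
MODEL_LIMITS = {  # (context window, output reserve), from the table above
    "gpt-4o":         (128_000, 16_000),
    "gpt-4o-mini":    (128_000, 16_000),
    "claude-3-opus":  (200_000,  8_000),
    "gemini-1.5-pro": (2_000_000, 8_000),
}

def input_budget(model: str) -> int:
    window, reserve = MODEL_LIMITS[model]
    return window - reserve

print(input_budget("gpt-4o"))  # 112000 tokens for input segments
```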
Per-Tool Budgets
Set different limits for different tools:
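A sketch using the `tool_limits` and `protected_tools` options from the Configuration Reference below (same import assumption as above); the tool names are illustrative:

```python
config = ManagerConfig(
    tool_limits={
        "web_search": 2_000,   # verbose output: cap aggressively
        "scrape_page": 3_000,  # web scraping: likewise
        "run_code": 8_000,     # code execution: allow more room
    },
    protected_tools=["final_answer"],  # outputs that are never pruned
)
```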
Session Deduplication

Prevents duplicate content across agents in multi-agent workflows:
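An illustrative sketch of the idea (not PraisonAI's implementation): content already seen in the session is replaced with a short reference instead of being re-sent:

```python
import hashlib

_seen: set[str] = set()

def dedupe(content: str) -> str:
    digest = hashlib.sha256(content.encode()).hexdigest()
    if digest in _seen:
        return f"[duplicate of earlier content {digest[:8]}]"
    _seen.add(digest)
    return content
```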
Multi-Agent Policies

Control how context is shared between agents:

| Mode | Description | Use Case |
|---|---|---|
| `NONE` | No context shared | Independent agents |
| `SUMMARY` | Summarized context | Reduce tokens |
| `FULL` | Full context (bounded) | Continuity needed |
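How the mode is spelled in code is an assumption; conceptually, it is a three-way switch:

```python
from enum import Enum

class SharePolicy(Enum):  # hypothetical name for the three modes above
    NONE = "none"        # independent agents, nothing shared
    SUMMARY = "summary"  # pass a summary to keep tokens down
    FULL = "full"        # pass full (bounded) context for continuity
```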
Context Monitoring
Real-time snapshots for debugging (enable with `monitor_enabled=True`):

Snapshot Output
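The exact snapshot format is not reproduced here; illustratively, a snapshot records a per-segment token ledger and the resulting usage level, roughly like this:

```python
snapshot = {  # illustrative fields, not the real schema
    "model": "gpt-4o",
    "usage": 0.66,  # (segments + output reserve) / 128K window -> "normal"
    "segments": {
        "system_prompt": 2_104,
        "chat_history": 41_337,
        "tool_schemas": 1_988,
        "tool_outputs": 19_455,
        "memory_rag": 3_902,
    },
    "output_reserve": 16_000,
}
```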
Token Estimation
Fast offline token counting (no API calls):

| Content Type | Accuracy |
|---|---|
| English text | ~90-95% |
| Code | ~85-90% |
| Non-ASCII | ~80-85% |
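The estimator runs offline, so it must approximate. A sketch in the spirit of the accuracy table above, using the common ~4-characters-per-token heuristic for English and a smaller divisor for non-ASCII text (this is not PraisonAI's exact estimator):

```python
def estimate_tokens(text: str) -> int:
    ascii_chars = sum(c.isascii() for c in text)
    non_ascii = len(text) - ascii_chars
    # English averages ~4 chars/token; non-ASCII tokenizes less densely.
    return ascii_chars // 4 + non_ascii // 2 + 1
```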
Fast Context (Code Search)
Rapid parallel code search for AI agents:

| Feature | Value |
|---|---|
| Search Latency | 100-200ms |
| Cache Hit | < 1ms |
| Parallel Speedup | 2-5x |
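An illustrative sketch of the pattern behind those numbers (not PraisonAI's implementation): cache repeated queries and fan searches out across files in parallel:

```python
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache
from pathlib import Path

@lru_cache(maxsize=1024)  # repeat queries become sub-millisecond cache hits
def search_file(path: str, query: str) -> tuple[str, ...]:
    text = Path(path).read_text(errors="ignore")
    return tuple(line for line in text.splitlines() if query in line)

def fast_search(paths: list[str], query: str) -> dict[str, tuple[str, ...]]:
    with ThreadPoolExecutor() as pool:  # parallel fan-out across files
        results = pool.map(lambda p: (p, search_file(p, query)), paths)
    return {path: hits for path, hits in results if hits}
```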
CLI Commands
In-Session Commands
| Command | Description |
|---|---|
| `/context on` | Enable monitoring |
| `/context off` | Disable monitoring |
| `/context stats` | Show the token ledger |
| `/context dump` | Write a snapshot now |
| `/context compact` | Force optimization |
Configuration Reference
ManagerConfig Options
| Option | Type | Default | Description |
|---|---|---|---|
| `auto_compact` | bool | `True` | Auto-optimize at the threshold |
| `compact_threshold` | float | `0.8` | Trigger at this usage fraction (0.8 = 80%) |
| `strategy` | str | `"smart"` | Optimization strategy |
| `output_reserve` | int | Model-specific | Tokens reserved for output |
| `llm_summarize` | bool | `False` | Use the LLM for summarization |
| `tool_limits` | dict | `{}` | Per-tool token limits |
| `protected_tools` | list | `[]` | Tools never pruned |
| `monitor_enabled` | bool | `False` | Enable snapshots |
| `redact_sensitive` | bool | `True` | Redact secrets |
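Putting the table together, a full configuration with every default spelled out (import path assumed, as above):

```python
config = ManagerConfig(
    auto_compact=True,        # optimize automatically at the threshold
    compact_threshold=0.8,    # 0.8 = 80% of the window
    strategy="smart",
    llm_summarize=False,      # set True to summarize with the LLM
    tool_limits={},           # e.g. {"web_search": 2_000}
    protected_tools=[],       # tools whose outputs are never pruned
    monitor_enabled=False,    # set True during development
    redact_sensitive=True,
    # output_reserve defaults to a model-specific value, so it is omitted.
)
```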
Environment Variables
Best Practices
Enable for tool-heavy agents
Always enable `context=True` for agents with tools to prevent token overflow from large search results.

Use smart strategy in production
The `smart` strategy combines all optimization techniques intelligently.

Set per-tool limits
Configure lower limits for verbose tools (search, web scraping) and higher limits for code execution.

Monitor during development
Enable `monitor_enabled=True` to debug context issues and understand token usage.

Use session deduplication
In multi-agent workflows, deduplication prevents the same content from being processed multiple times.
Related Pages
- Memory: persistent storage across sessions
- Knowledge: pre-loaded reference documents
- Context vs Memory: when to use each system
- Context vs Knowledge: runtime vs pre-loaded data
Context management uses lazy loading throughout. Setting `context=True` adds only 1 boolean assignment at creation time (0ms); the ContextManager is only instantiated when first accessed.
