Thinking Budgets

Configure and manage token budgets for an LLM's extended thinking, so you can control how much “thinking” an agent does based on task complexity.

Quick Start

from praisonaiagents.thinking import ThinkingBudget, ThinkingTracker

# Use predefined budget levels
budget = ThinkingBudget.medium()  # 8,000 tokens
print(f"Max tokens: {budget.max_tokens}")

# Track usage across sessions
tracker = ThinkingTracker()
session = tracker.start_session(budget_tokens=8000)
tracker.end_session(session, tokens_used=5000, time_seconds=30.0)

summary = tracker.get_summary()
print(f"Average utilization: {summary['average_utilization']:.1%}")

Budget Levels

Pre-configured budget levels for different use cases:
from praisonaiagents.thinking import ThinkingBudget

# Available levels
minimal = ThinkingBudget.minimal()   # 2,000 tokens
low = ThinkingBudget.low()           # 4,000 tokens
medium = ThinkingBudget.medium()     # 8,000 tokens
high = ThinkingBudget.high()         # 16,000 tokens
maximum = ThinkingBudget.maximum()   # 32,000 tokens
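
To choose among these levels programmatically, one option is to pick the smallest level whose budget covers an estimated token need. The sketch below does not call the library: `LEVEL_TOKENS` and `pick_level` are hypothetical names, and the token values are copied from the comments above.

```python
# Hypothetical stand-in table: token values copied from the level comments above.
LEVEL_TOKENS = {
    "minimal": 2_000,
    "low": 4_000,
    "medium": 8_000,
    "high": 16_000,
    "maximum": 32_000,
}


def pick_level(estimated_tokens: int) -> str:
    """Return the smallest level whose budget covers the estimate."""
    for name, tokens in sorted(LEVEL_TOKENS.items(), key=lambda item: item[1]):
        if estimated_tokens <= tokens:
            return name
    # Nothing is large enough, so fall back to the largest level.
    return "maximum"


print(pick_level(3_000))   # low
print(pick_level(10_000))  # high
print(pick_level(50_000))  # maximum
```

The returned name can then be mapped to the matching constructor, for example `ThinkingBudget.high()`.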

Custom Budgets

from praisonaiagents.thinking import ThinkingBudget

budget = ThinkingBudget(
    max_tokens=12000,        # Maximum thinking tokens
    min_tokens=2000,         # Minimum allocation
    adaptive=True,           # Scale based on complexity
    max_time_seconds=120.0   # Time limit in seconds
)

Adaptive Scaling

Budgets can scale based on task complexity:
from praisonaiagents.thinking import ThinkingBudget

budget = ThinkingBudget(
    max_tokens=16000,
    min_tokens=2000,
    adaptive=True
)

# Get tokens for different complexity levels (0.0 to 1.0).
# Names avoid shadowing Python's built-in `complex` type.
simple_tokens = budget.get_tokens_for_complexity(0.2)   # ~4,800 tokens
medium_tokens = budget.get_tokens_for_complexity(0.5)   # ~9,000 tokens
complex_tokens = budget.get_tokens_for_complexity(0.9)  # ~14,600 tokens
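
The approximate values in the comments above are consistent with linear interpolation between `min_tokens` and `max_tokens`. The standalone sketch below reproduces them. It is an assumption about the scaling rule, not the library's confirmed implementation, and `tokens_for_complexity` is a hypothetical helper.

```python
def tokens_for_complexity(min_tokens: int, max_tokens: int, complexity: float) -> int:
    """Linearly interpolate between min_tokens and max_tokens (assumed rule)."""
    # Clamp so out-of-range scores cannot leave the [min, max] window.
    clamped = max(0.0, min(1.0, complexity))
    return round(min_tokens + (max_tokens - min_tokens) * clamped)


for score in (0.2, 0.5, 0.9):
    print(score, tokens_for_complexity(2000, 16000, score))
# 0.2 4800
# 0.5 9000
# 0.9 14600
```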

Usage Tracking

Track thinking usage across sessions:
from praisonaiagents.thinking import ThinkingUsage, ThinkingTracker

# Single usage record
usage = ThinkingUsage(
    budget_tokens=8000,
    tokens_used=5000,
    time_seconds=30.0,
    budget_time=60.0
)

print(f"Utilization: {usage.token_utilization:.1%}")
print(f"Over budget: {usage.is_over_budget}")
print(f"Tokens remaining: {usage.tokens_remaining}")
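
For intuition about what these three properties report, here is a minimal stand-in written from scratch. `UsageSketch` is a hypothetical class, not the library's `ThinkingUsage`. Whether the real class clamps the remainder at zero or also counts elapsed time toward `is_over_budget` is not stated above, so the sketch only considers tokens.

```python
from dataclasses import dataclass


@dataclass
class UsageSketch:
    """Hypothetical stand-in for a usage record; not the library class."""
    budget_tokens: int
    tokens_used: int

    @property
    def token_utilization(self) -> float:
        # Fraction of the budget consumed; guard against a zero budget.
        return self.tokens_used / self.budget_tokens if self.budget_tokens else 0.0

    @property
    def tokens_remaining(self) -> int:
        # Assumption: never report a negative remainder.
        return max(0, self.budget_tokens - self.tokens_used)

    @property
    def is_over_budget(self) -> bool:
        # Assumption: only tokens are compared, not time.
        return self.tokens_used > self.budget_tokens


usage_sketch = UsageSketch(budget_tokens=8000, tokens_used=5000)
print(f"{usage_sketch.token_utilization:.1%}")  # 62.5%
print(usage_sketch.tokens_remaining)            # 3000
print(usage_sketch.is_over_budget)              # False
```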

Session Tracking

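from praisonaiagents.thinking import ThinkingTracker

tracker = ThinkingTracker()

# Placeholder workload: (task, thinking tokens actually used) pairs
tasks = [("summarize report", 5000), ("plan migration", 9500)]

# Track one session per task
for task, actual_tokens in tasks:
    session = tracker.start_session(budget_tokens=8000)
    # ... agent thinking on `task` ...
    tracker.end_session(session, tokens_used=actual_tokens)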

# Get aggregate statistics
summary = tracker.get_summary()
print(f"Total sessions: {summary['session_count']}")
print(f"Total tokens: {summary['total_tokens_used']}")
print(f"Avg utilization: {summary['average_utilization']:.1%}")
print(f"Over-budget: {summary['over_budget_count']}")
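
The summary keys above can be read as simple aggregates over the recorded sessions. The sketch below computes them from plain `(budget_tokens, tokens_used)` pairs. `summarize` is a hypothetical helper, and treating `average_utilization` as the mean of per-session utilization is an assumption about how the tracker averages.

```python
def summarize(sessions: list[tuple[int, int]]) -> dict:
    """Aggregate (budget_tokens, tokens_used) pairs into summary statistics."""
    # Per-session utilization; sessions with a zero budget are skipped.
    utilizations = [used / budget for budget, used in sessions if budget > 0]
    return {
        "session_count": len(sessions),
        "total_tokens_used": sum(used for _, used in sessions),
        "average_utilization": (
            sum(utilizations) / len(utilizations) if utilizations else 0.0
        ),
        "over_budget_count": sum(1 for budget, used in sessions if used > budget),
    }


summary_sketch = summarize([(8000, 5000), (8000, 9000), (8000, 4000)])
print(summary_sketch["session_count"])                  # 3
print(summary_sketch["total_tokens_used"])              # 18000
print(f"{summary_sketch['average_utilization']:.1%}")   # 75.0%
print(summary_sketch["over_budget_count"])              # 1
```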

Serialization

from praisonaiagents.thinking import ThinkingBudget
from praisonaiagents.thinking.budget import BudgetLevel

# Create from level
budget = ThinkingBudget.from_level(BudgetLevel.HIGH)

# Serialize
data = budget.to_dict()

# Restore
restored = ThinkingBudget.from_dict(data)

Agent Integration

from praisonaiagents import Agent
from praisonaiagents.thinking import ThinkingBudget

agent = Agent(
    instructions="Complex reasoning assistant",
    thinking_budget=ThinkingBudget.high()
)

Zero Performance Impact

Thinking budgets use lazy loading, so the module is loaded only when you first access it:
# The module is loaded only when it is first accessed
from praisonaiagents.thinking import ThinkingBudget
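
One common way to implement lazy loading in Python is a module-level `__getattr__` (PEP 562), which defers an import until an attribute is first requested. The sketch below illustrates that general technique on a throwaway module object. Whether the library uses exactly this mechanism is an assumption, and `lazy_pkg`, `json_tools`, and `load_log` are hypothetical names.

```python
import importlib
import types

# Hypothetical package object; a real package would define __getattr__
# at the top level of its __init__.py instead.
lazy_pkg = types.ModuleType("lazy_pkg")
load_log = []


def _lazy_getattr(name: str):
    """Import the backing module only on first attribute access (PEP 562)."""
    if name == "json_tools":
        load_log.append(name)
        module = importlib.import_module("json")
        setattr(lazy_pkg, name, module)  # cache so __getattr__ is not hit again
        return module
    raise AttributeError(f"module 'lazy_pkg' has no attribute {name!r}")


lazy_pkg.__getattr__ = _lazy_getattr

print(load_log)                                    # []
print(lazy_pkg.json_tools.dumps({"a": 1}))         # {"a": 1}
print(lazy_pkg.json_tools is lazy_pkg.json_tools)  # True
print(load_log)                                    # ['json_tools']
```

The final line shows that the deferred import ran only once, because the first access caches the module on the package object.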