Token Estimation

PraisonAI provides fast, offline token estimation that works without API calls. This enables real-time context budget tracking and optimization decisions.

Quick Start

from praisonaiagents.context import (
    estimate_tokens_heuristic,
    estimate_messages_tokens,
    estimate_tool_schema_tokens,
)

# Estimate tokens for text
text = "Hello, how are you today?"
tokens = estimate_tokens_heuristic(text)
print(f"Estimated: {tokens} tokens")

# Estimate tokens for messages
messages = [
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "What is Python?"},
]
tokens = estimate_messages_tokens(messages)
print(f"Messages: {tokens} tokens")

# Estimate tool schema tokens
tools = [
    {"name": "read_file", "description": "Read a file"},
    {"name": "write_file", "description": "Write to a file"},
]
tokens = estimate_tool_schema_tokens(tools)
print(f"Tools: {tokens} tokens")

Estimation Algorithm

The heuristic estimator uses character-based rules optimized for typical LLM tokenization:
| Character Type | Tokens per Character |
| --- | --- |
| ASCII text | ~0.25 (4 chars = 1 token) |
| Non-ASCII (Unicode) | ~1.3 |
| Whitespace | Counted normally |
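
The rules above can be sketched as a single character-class pass. This is a minimal illustration of the heuristic, not the library's actual implementation:

```python
def estimate_tokens_sketch(text: str) -> int:
    """Character-based token estimate: ~4 ASCII chars per token,
    ~1.3 tokens per non-ASCII character."""
    if not text:
        return 0
    ascii_chars = sum(1 for c in text if ord(c) < 128)
    non_ascii_chars = len(text) - ascii_chars
    estimate = ascii_chars * 0.25 + non_ascii_chars * 1.3
    return max(1, round(estimate))

print(estimate_tokens_sketch("Hello world!"))  # 12 ASCII chars -> 3
```

Because the pass only counts characters, it runs in a single O(n) scan with no tokenizer model loaded.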

Message Overhead

Each message includes overhead for role markers and formatting:
  • Base overhead: 4 tokens per message
  • Role tokens: ~2 tokens
  • Content: Estimated via heuristic
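
Putting the pieces together, a per-message estimate looks roughly like this. The sketch below is illustrative: it inlines the character heuristic described above rather than calling the library's internals:

```python
def estimate_messages_tokens_sketch(messages: list[dict]) -> int:
    """Sum per-message overhead (4 base + ~2 role tokens) plus a
    character-based estimate of each message's content."""
    def content_tokens(text: str) -> int:
        ascii_chars = sum(1 for c in text if ord(c) < 128)
        return round(ascii_chars * 0.25 + (len(text) - ascii_chars) * 1.3)

    total = 0
    for msg in messages:
        total += 4 + 2  # base overhead + role tokens
        total += content_tokens(msg.get("content", ""))
    return total

messages = [
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "What is Python?"},
]
print(estimate_messages_tokens_sketch(messages))  # 20
```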

API Reference

estimate_tokens_heuristic(text: str) -> int

Estimate tokens for a string using character-based heuristics.
tokens = estimate_tokens_heuristic("Hello world!")
# Returns: ~3 tokens

estimate_messages_tokens(messages: List[Dict]) -> int

Estimate total tokens for a list of chat messages.
messages = [
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hi there!"},
]
tokens = estimate_messages_tokens(messages)

estimate_tool_schema_tokens(tools: List[Dict]) -> int

Estimate tokens for tool/function schemas.
tools = [{"name": "search", "description": "Search the web"}]
tokens = estimate_tool_schema_tokens(tools)

TokenEstimatorImpl

Class-based estimator with caching:
from praisonaiagents.context import TokenEstimatorImpl

estimator = TokenEstimatorImpl()
tokens = estimator.estimate("Some text")

get_estimator() -> TokenEstimatorImpl

Get a singleton estimator instance:
from praisonaiagents.context import get_estimator

estimator = get_estimator()
tokens = estimator.estimate("Text to estimate")

Accuracy Considerations

The heuristic estimator is designed for speed over perfect accuracy:
| Scenario | Accuracy |
| --- | --- |
| English text | ~90-95% |
| Code | ~85-90% |
| Mixed content | ~85-90% |
| Non-ASCII heavy | ~80-85% |
For budget decisions, the estimator adds a small safety margin to prevent underestimation.
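
For example, a budget check might inflate the raw estimate before comparing it to the limit. The 10% margin below is an illustrative value, not the library's actual figure:

```python
SAFETY_MARGIN = 1.10  # illustrative 10% cushion against underestimation

def fits_in_budget(estimated_tokens: int, budget_tokens: int) -> bool:
    """Treat the estimate pessimistically so near-limit prompts are flagged."""
    return estimated_tokens * SAFETY_MARGIN <= budget_tokens

print(fits_in_budget(900, 1000))  # True:  990 <= 1000
print(fits_in_budget(950, 1000))  # False: 1045 > 1000
```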

Performance

  • Speed: < 1ms for 100K characters
  • Memory: O(1) - no caching required
  • No API calls: Works completely offline

Integration with Budgeter

from praisonaiagents.context import (
    ContextBudgeter,
    estimate_messages_tokens,
)

budgeter = ContextBudgeter(model="gpt-4o-mini")
budget = budgeter.allocate()

# Check if messages fit in budget
messages = [...]  # Your conversation
tokens = estimate_messages_tokens(messages)

if tokens > budget.usable * 0.8:
    print("Warning: Approaching context limit!")

Next Steps