Token estimation validation compares fast heuristic estimates against accurate tokenizer counts and logs mismatches for debugging.

Quick Start

from praisonaiagents.context import ContextManager, ManagerConfig, EstimationMode

config = ManagerConfig(
    estimation_mode=EstimationMode.VALIDATED,
    log_estimation_mismatch=True,
    mismatch_threshold_pct=15.0,
)

manager = ContextManager(model="gpt-4o-mini", config=config)

# Estimate with validation
text = "Example text whose token count we want to estimate."
tokens, metrics = manager.estimate_tokens(text, validate=True)

if metrics:
    print(f"Heuristic: {metrics.heuristic_estimate}")
    print(f"Accurate: {metrics.accurate_estimate}")
    print(f"Error: {metrics.error_pct:.1f}%")

Estimation Modes

Mode        Description                      Performance
HEURISTIC   Fast character-based estimate    Fastest
ACCURATE    Use tiktoken if available        Slower
VALIDATED   Compare both, log mismatches     Slowest
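
The trade-off is easy to measure directly. A minimal sketch, assuming the ContextManager and estimate_tokens API from the Quick Start (the sample text and timings are illustrative):

import time

from praisonaiagents.context import ContextManager, ManagerConfig, EstimationMode

text = "Lorem ipsum dolor sit amet. " * 500  # illustrative input

# Fresh manager per mode so cached estimates don't skew the comparison
for mode in (EstimationMode.HEURISTIC, EstimationMode.ACCURATE, EstimationMode.VALIDATED):
    manager = ContextManager(model="gpt-4o-mini", config=ManagerConfig(estimation_mode=mode))
    start = time.perf_counter()
    tokens, _ = manager.estimate_tokens(text)
    print(f"{mode.name}: {tokens} tokens in {time.perf_counter() - start:.5f}s")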

Configuration

config = ManagerConfig(
    estimation_mode=EstimationMode.VALIDATED,
    log_estimation_mismatch=True,      # Log when mismatch > threshold
    mismatch_threshold_pct=15.0,       # 15% threshold
)

Environment Variables

export PRAISONAI_CONTEXT_ESTIMATION_MODE=validated
export PRAISONAI_CONTEXT_LOG_MISMATCH=true
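
To mirror these settings in code (for example in tests), one option is a small helper. The variable names are the ones documented above, but config_from_env and its defaults are a hypothetical sketch, not part of the library:

import os

from praisonaiagents.context import ManagerConfig, EstimationMode

def config_from_env() -> ManagerConfig:
    # Hypothetical helper: map the documented env vars onto ManagerConfig
    mode = os.getenv("PRAISONAI_CONTEXT_ESTIMATION_MODE", "heuristic")
    log_mismatch = os.getenv("PRAISONAI_CONTEXT_LOG_MISMATCH", "false").lower() == "true"
    return ManagerConfig(
        estimation_mode=EstimationMode[mode.upper()],
        log_estimation_mismatch=log_mismatch,
    )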

EstimationMetrics

from dataclasses import dataclass

@dataclass
class EstimationMetrics:
    heuristic_estimate: int         # Fast character-based estimate
    accurate_estimate: int          # Tiktoken count
    error_pct: float                # Percentage error between the two
    estimator_used: EstimationMode  # Mode that produced the result
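
The error percentage is consistent with relative error against the accurate count. A quick check against the log line in the next section (the formula is inferred from that example, not quoted from the source):

heuristic, accurate = 1250, 1100

# Relative error vs. the accurate count (inferred from the log example below)
error_pct = abs(heuristic - accurate) / accurate * 100
print(f"{error_pct:.1f}%")  # 13.6%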

Mismatch Logging

When log_estimation_mismatch=True and the error exceeds mismatch_threshold_pct, a warning is logged:
WARNING: Token estimation mismatch: heuristic=1250, accurate=1100, error=13.6%
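
To make these warnings visible in your application, one approach is the standard logging module; this assumes the library emits the warning through Python's logging (an assumption worth verifying against your handler setup):

import logging

# Assumption: mismatch warnings go through the stdlib logging module.
logging.basicConfig(level=logging.WARNING, format="%(levelname)s: %(message)s")

# With VALIDATED mode and log_estimation_mismatch=True, any estimate whose
# error exceeds mismatch_threshold_pct now prints a line like the one above.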

Estimation Caching

Estimates are cached by content hash:
# First call - computes estimate
tokens1, _ = manager.estimate_tokens(text)

# Second call - uses cache
tokens2, _ = manager.estimate_tokens(text)

# Cache key is MD5 hash of text
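
Conceptually the cache is a mapping keyed by the MD5 of the text. A minimal sketch of that idea (illustrative, not the library's actual internals):

import hashlib
from typing import Callable, Dict

_cache: Dict[str, int] = {}

def cached_estimate(text: str, estimate: Callable[[str], int]) -> int:
    # Key by the MD5 hash of the text, as described above
    key = hashlib.md5(text.encode("utf-8")).hexdigest()
    if key not in _cache:
        _cache[key] = estimate(text)  # first call computes
    return _cache[key]                # later calls hit the cache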

Heuristic Algorithm

The heuristic uses character-based estimation:
# ASCII characters: ~0.25 tokens per char
# Non-ASCII: ~1.3 tokens per char
# Plus overhead for message structure
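
As a sketch of that rule (the 0.25 and 1.3 ratios come from the comments above; the overhead constant here is a placeholder, not the library's value):

def heuristic_token_estimate(text: str, message_overhead: int = 4) -> int:
    # ~0.25 tokens per ASCII char, ~1.3 per non-ASCII char
    ascii_chars = sum(1 for ch in text if ord(ch) < 128)
    non_ascii_chars = len(text) - ascii_chars
    # message_overhead stands in for the message-structure overhead
    return int(ascii_chars * 0.25 + non_ascii_chars * 1.3) + message_overhead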

Accurate Estimation

When tiktoken is available:
# Uses model-specific tokenizer
# Falls back to heuristic if unavailable
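
A minimal sketch of that pattern, reusing the heuristic sketch above for the fallback (the wiring is illustrative; only the tiktoken calls are real API):

def accurate_token_count(text: str, model: str = "gpt-4o-mini") -> int:
    try:
        import tiktoken
    except ImportError:
        # tiktoken not installed: fall back to the heuristic
        return heuristic_token_estimate(text)
    try:
        enc = tiktoken.encoding_for_model(model)  # model-specific tokenizer
    except KeyError:
        enc = tiktoken.get_encoding("cl100k_base")  # unknown model name
    return len(enc.encode(text))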

CLI Usage

# View estimation mode in config
praisonai chat
> /context config

# Shows:
# Estimation:
#   estimation_mode:        validated
#   log_mismatch:           True

Best Practices

  1. Use heuristic for production - Fast and accurate enough for most budgeting
  2. Use validated for debugging - Find estimation issues
  3. Set a reasonable threshold - 15-20% is typical
  4. Monitor mismatch logs - Identify problematic content (see the sketch below)
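
For the last point, one way to monitor mismatches is a logging filter that tallies the warnings. This again assumes the warnings flow through Python's standard logging module; MismatchCounter is a hypothetical helper, not library API:

import logging

class MismatchCounter(logging.Filter):
    """Counts token-estimation mismatch warnings passing through a handler."""

    def __init__(self) -> None:
        super().__init__()
        self.count = 0

    def filter(self, record: logging.LogRecord) -> bool:
        if "Token estimation mismatch" in record.getMessage():
            self.count += 1
        return True  # never suppress the record

counter = MismatchCounter()
handler = logging.StreamHandler()
handler.addFilter(counter)  # handler-level filters see propagated records
logging.getLogger().addHandler(handler)
# ...run your workload, then inspect counter.count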