The `--prompt-caching` flag enables prompt caching, which reduces token costs when the same long system prompt is sent across repeated requests.
## Quick Start

```bash
praisonai "Analyze this document..." --prompt-caching --llm anthropic/claude-sonnet-4-20250514
```
## Basic Prompt Caching

```bash
praisonai "Analyze this document..." --prompt-caching --llm anthropic/claude-sonnet-4-20250514
```
Expected Output:
```
💾 Prompt Caching enabled
╭─ Agent Info ──────────────────────────────────────────────────────────────╮
│ 🤖 Agent: DirectAgent                                                      │
│ Role: Assistant                                                            │
│ Model: anthropic/claude-sonnet-4-20250514                                  │
│ Prompt Caching: Enabled                                                    │
╰────────────────────────────────────────────────────────────────────────────╯
╭─ Cache Status ─────────────────────────────────────────────────────────────╮
│ 📊 Cache hit: System prompt (1,024 tokens saved)                           │
╰────────────────────────────────────────────────────────────────────────────╯
```
## Combine with Metrics

```bash
# See cost savings with metrics
praisonai "Process data..." --prompt-caching --metrics --llm anthropic/claude-sonnet-4-20250514
```
## Supported Providers

| Provider | Support | Notes |
|---|---|---|
| OpenAI | Auto | Automatic caching for repeated prompts |
| Anthropic | Manual | Explicit caching with `--prompt-caching` |
| Bedrock | Manual | Explicit caching support |
| Deepseek | Manual | Explicit caching support |
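For the providers marked Manual, the request itself flags which content is cacheable. With Anthropic this corresponds to `cache_control` blocks in the Messages API; the sketch below calls the Anthropic SDK directly (outside PraisonAI) to show what the flag does under the hood. Treat it as an illustration of the mechanism, not PraisonAI's exact internals.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Mark the long system prompt as cacheable with a cache_control block.
# Subsequent requests with an identical prefix read from the server-side cache.
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": "You are an AI assistant..." * 50,  # long, stable prompt
            "cache_control": {"type": "ephemeral"},
        }
    ],
    messages=[{"role": "user", "content": "Analyze this document..."}],
)
```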
## How It Works

- Enable: The `--prompt-caching` flag activates caching
- Hash: The system prompt is hashed for cache lookup
- Check: The provider checks whether the prompt is already cached
- Reuse: Cached prompts skip re-processing
- Save: Token costs drop for the cached portions
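The sketch below illustrates this hash-and-lookup flow. It is a simplified, client-side illustration, not PraisonAI's actual implementation; the supported providers manage the cache server-side.

```python
import hashlib

# Illustrative only: a minimal in-memory cache keyed by the system
# prompt's hash, mirroring the lookup flow described above.
_prompt_cache: dict[str, str] = {}

def cache_key(system_prompt: str) -> str:
    # Hash the system prompt so identical prompts map to the same entry
    return hashlib.sha256(system_prompt.encode("utf-8")).hexdigest()

def lookup(system_prompt: str) -> bool:
    # Check: is this exact prompt already cached?
    key = cache_key(system_prompt)
    if key in _prompt_cache:
        return True  # Reuse: skip re-processing, save tokens
    _prompt_cache[key] = system_prompt  # First call: write the cache entry
    return False
```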
## Cost Savings
Prompt caching can significantly reduce costs for:
| Scenario | Savings |
|---|---|
| Long system prompts | Up to 90% |
| Repeated instructions | Up to 80% |
| Document analysis | Up to 70% |
| Multi-turn conversations | Up to 50% |
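To make these figures concrete, here is a rough back-of-the-envelope estimate. The dollar figures and multipliers are assumptions for illustration (they follow Anthropic's published pattern of pricier cache writes and heavily discounted cache reads), not current rate-card prices.

```python
# Back-of-the-envelope savings estimate; treat all figures as assumptions.
BASE_INPUT = 3.00                  # $ per million input tokens (assumed)
CACHE_WRITE = BASE_INPUT * 1.25    # first request writes the cache
CACHE_READ = BASE_INPUT * 0.10     # later requests read from it

prompt_tokens = 50_000             # long system prompt
calls = 20                         # queries in one session

uncached = calls * prompt_tokens * BASE_INPUT / 1_000_000
cached = (prompt_tokens * CACHE_WRITE
          + (calls - 1) * prompt_tokens * CACHE_READ) / 1_000_000

print(f"without caching: ${uncached:.2f}")   # $3.00
print(f"with caching:    ${cached:.2f}")     # $0.47
print(f"savings:         {100 * (1 - cached / uncached):.0f}%")  # 84%
```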
## Examples

### Long System Prompt

```bash
# Agent with extensive instructions benefits from caching
praisonai "Answer questions about the codebase" \
  --prompt-caching --llm anthropic/claude-sonnet-4-20250514
```
### Document Analysis

```bash
# Repeated analysis of the same document
praisonai "Find security issues in this code..." \
  --prompt-caching --llm anthropic/claude-sonnet-4-20250514
```
### Multi-Query Session

```bash
# Multiple queries with the same context
praisonai "Query 1..." --prompt-caching --llm anthropic/claude-sonnet-4-20250514
praisonai "Query 2..." --prompt-caching --llm anthropic/claude-sonnet-4-20250514
praisonai "Query 3..." --prompt-caching --llm anthropic/claude-sonnet-4-20250514
```
## Programmatic Usage

```python
from praisonaiagents import Agent

agent = Agent(
    instructions="You are an AI assistant..." * 50,  # Long system prompt
    llm="anthropic/claude-sonnet-4-20250514",
    prompt_caching=True,
)

# First call caches the prompt
result1 = agent.start("Question 1")

# Subsequent calls reuse the cached prompt
result2 = agent.start("Question 2")  # Reduced cost
result3 = agent.start("Question 3")  # Reduced cost
```
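Note that prefix-based prompt caching generally requires the cached content to be byte-for-byte identical between calls, so keep the `instructions` string stable; changing even one character produces a new cache entry.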
## Best Practices
Use prompt caching when you have long system prompts or make repeated calls with the same context.
Caching is most effective for stable prompts. Frequently changing prompts won't benefit from caching.
| Do | Don't |
|---|---|
| Use for long system prompts | Use for short prompts |
| Use for repeated queries | Use for one-off queries |
| Combine with `--metrics` to track savings | Ignore cost monitoring |
| Use stable instructions | Change prompts frequently |
## Cache Behavior
| Provider | Cache Duration | Cache Scope |
|---|---|---|
| OpenAI | Automatic | Per-request |
| Anthropic | 5 minutes | Per-session |
| Bedrock | Configurable | Per-session |
| Deepseek | 5 minutes | Per-session |
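For Anthropic, the five-minute window is measured from the cache entry's last use, and each cache hit refreshes it, so issuing related queries back-to-back keeps the cache warm for an entire session.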