# Prompt Caching

> Reduce costs for repeated prompts with prompt caching

The `--prompt-caching` flag enables prompt caching, which lowers cost when requests reuse the same long system prompt or other repeated context.

## Quick Start

```bash
praisonai "Analyze this document..." --prompt-caching --llm anthropic/claude-sonnet-4-20250514
```

<img src="https://mintcdn.com/praisonai/SX0Y8_-DRBjzOTnt/docs/cli/prompt-caching-prompt-caching-reduces-repeate.gif?s=8061d13a117f762b0a41f7c9f20cd335" alt="Prompt Caching Reduces Repeated Tokens" width="1497" height="1104" data-path="docs/cli/prompt-caching-prompt-caching-reduces-repeate.gif" />

## Usage

### Basic Prompt Caching

```bash
praisonai "Analyze this document..." --prompt-caching --llm anthropic/claude-sonnet-4-20250514
```

**Expected Output:**

```
💾 Prompt Caching enabled

╭─ Agent Info ─────────────────────────────────────────────────────────────────╮
│  👤 Agent: DirectAgent                                                       │
│  Role: Assistant                                                             │
│  Model: anthropic/claude-sonnet-4-20250514                                   │
│  Prompt Caching: Enabled                                                     │
╰──────────────────────────────────────────────────────────────────────────────╯

╭─ Cache Status ───────────────────────────────────────────────────────────────╮
│  📊 Cache hit: System prompt (1,024 tokens saved)                           │
╰──────────────────────────────────────────────────────────────────────────────╯
```

### Combine with Metrics

```bash
# See cost savings with metrics
praisonai "Process data..." --prompt-caching --metrics --llm anthropic/claude-sonnet-4-20250514
```

## Supported Providers

| Provider  | Support | Notes                                    |
| --------- | ------- | ---------------------------------------- |
| OpenAI    | Auto    | Automatic caching for repeated prompts   |
| Anthropic | Manual  | Explicit caching with `--prompt-caching` |
| Bedrock   | Manual  | Explicit caching support                 |
| DeepSeek  | Manual  | Explicit caching support                 |

## How It Works

1. **Enable**: The `--prompt-caching` flag activates caching.
2. **Hash**: The system prompt is hashed for cache lookup.
3. **Check**: The provider checks whether the prompt is already cached.
4. **Reuse**: Cached prompts skip re-processing.
5. **Save**: Cached portions are billed at a reduced token cost.

```mermaid
flowchart LR
    A[Request] --> B{Cached?}
    B -->|Yes| C[Use Cache]
    B -->|No| D[Process & Cache]
    C --> E[Reduced Cost]
    D --> F[Full Cost]
    E --> G[Response]
    F --> G
```
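
The flow above can be illustrated with a small local simulation. This is a sketch only: `PromptCacheSim` and its `request` method are hypothetical names, and the real lookup happens on the provider's side, not in your own code.

```python
import hashlib


class PromptCacheSim:
    """Toy model of the Hash -> Check -> Reuse steps (hypothetical, not a real API)."""

    def __init__(self):
        self._seen = set()  # hashes of system prompts processed so far

    def request(self, system_prompt: str, user_prompt: str) -> str:
        # Hash: only the stable system prompt is used as the cache key.
        key = hashlib.sha256(system_prompt.encode("utf-8")).hexdigest()
        # Check: has this exact system prompt been processed before?
        if key in self._seen:
            return "hit"  # Reuse: the cached portion skips re-processing
        self._seen.add(key)
        return "miss"  # Process & Cache: full cost on first sight


sim = PromptCacheSim()
system = "You are a code reviewer. " * 50
results = [
    sim.request(system, "Query 1"),
    sim.request(system, "Query 2"),
    sim.request(system + "!", "Query 3"),  # any change to the prompt is a miss
]
print(results)  # ['miss', 'hit', 'miss']
```

The third call misses because the key covers the whole system prompt, which is why stable instructions matter.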

## Cost Savings

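Prompt caching can significantly reduce costs in the scenarios below. The figures are upper bounds: actual savings depend on how much of each request is served from the cache and how often it is reused.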

| Scenario                 | Savings   |
| ------------------------ | --------- |
| Long system prompts      | Up to 90% |
| Repeated instructions    | Up to 80% |
| Document analysis        | Up to 70% |
| Multi-turn conversations | Up to 50% |
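
To see how figures like these arise, the arithmetic can be worked through for one scenario. The sketch below assumes a pricing model in which writing to the cache costs 1.25x the base input price and reading from it costs 0.1x. These multipliers and the `estimate_savings` helper are assumptions for illustration, so check your provider's current pricing before relying on them.

```python
def estimate_savings(cached_tokens: int, calls: int,
                     write_mult: float = 1.25, read_mult: float = 0.10) -> float:
    """Return the fractional input-cost saving on the cached portion.

    Costs are in units of "base price per token", so the result does not
    depend on the actual price. The multipliers are assumed, not fetched.
    """
    if calls < 1:
        raise ValueError("calls must be at least 1")
    without_cache = cached_tokens * calls
    # The first call writes the cache; every later call reads from it.
    with_cache = cached_tokens * (write_mult + read_mult * (calls - 1))
    return 1 - with_cache / without_cache


for n in (1, 2, 10, 100):
    print(n, f"{estimate_savings(2000, n):.1%}")
```

Under these assumptions a single call costs 25% more with caching than without (the write premium), ten calls save 78.5% on the cached portion, and the saving approaches 90% only as the number of calls grows.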

## Examples

### Long System Prompt

```bash
# Agent with extensive instructions benefits from caching
praisonai "Answer questions about the codebase" \
  --prompt-caching --llm anthropic/claude-sonnet-4-20250514
```

### Document Analysis

```bash
# Repeated analysis of same document
praisonai "Find security issues in this code..." \
  --prompt-caching --llm anthropic/claude-sonnet-4-20250514
```

### Multi-Query Session

```bash
# Multiple queries with same context
praisonai "Query 1..." --prompt-caching --llm anthropic/claude-sonnet-4-20250514
praisonai "Query 2..." --prompt-caching --llm anthropic/claude-sonnet-4-20250514
praisonai "Query 3..." --prompt-caching --llm anthropic/claude-sonnet-4-20250514
```
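
A session like this can also be scripted. The sketch below only builds and prints the commands, using the flags shown on this page; `build_command` is a hypothetical helper, and the `subprocess.run` call is left as a comment so nothing is executed by accident.

```python
import shlex

MODEL = "anthropic/claude-sonnet-4-20250514"


def build_command(query: str, model: str = MODEL) -> list[str]:
    """Build the argv list for one praisonai call with caching enabled."""
    return ["praisonai", query, "--prompt-caching", "--llm", model]


commands = [build_command(q) for q in ("Query 1...", "Query 2...", "Query 3...")]
for cmd in commands:
    # Print a shell-safe version of the command.
    print(shlex.join(cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```

Running the queries back to back keeps them inside the cache lifetime, so later calls are more likely to reuse the cached context.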

## Programmatic Usage

```python
from praisonaiagents import Agent

agent = Agent(
    instructions="You are an AI assistant..." * 50,  # Long system prompt
    llm="anthropic/claude-sonnet-4-20250514",
    caching=True
)

# First call caches the prompt
result1 = agent.start("Question 1")

# Subsequent calls use cached prompt
result2 = agent.start("Question 2")  # Reduced cost
result3 = agent.start("Question 3")  # Reduced cost
```

## Best Practices

<Tip>
  Use prompt caching when you have long system prompts or make repeated calls with the same context.
</Tip>

<Warning>
  Caching is most effective for stable prompts. Frequently changing prompts won't benefit from caching.
</Warning>

| Do                                        | Don't                     |
| ----------------------------------------- | ------------------------- |
| Use for long system prompts               | Use for short prompts     |
| Use for repeated queries                  | Use for one-off queries   |
| Combine with `--metrics` to track savings | Ignore cost monitoring    |
| Use stable instructions                   | Change prompts frequently |

## Cache Behavior

| Provider  | Cache Duration                                    | Cache Scope |
| --------- | ------------------------------------------------- | ----------- |
| OpenAI    | Automatic (managed by the provider)               | Per-request |
| Anthropic | 5 minutes by default, refreshed on each cache hit | Per-session |
| Bedrock   | Configurable                                      | Per-session |
| DeepSeek  | 5 minutes                                         | Per-session |
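
Cache duration matters when calls are spaced out. As a rough sketch, assuming a 5-minute lifetime that restarts on every hit (the `TTLCacheSim` class and its explicit timestamps are hypothetical and exist only for this illustration):

```python
class TTLCacheSim:
    """Toy cache entry with a time-to-live, driven by explicit timestamps."""

    def __init__(self, ttl_seconds: float = 300.0):
        self.ttl = ttl_seconds
        self._expires_at = None  # nothing cached yet

    def request(self, now: float) -> str:
        hit = self._expires_at is not None and now < self._expires_at
        # Both a fresh write and a hit (re)start the lifetime window.
        self._expires_at = now + self.ttl
        return "hit" if hit else "miss"


cache = TTLCacheSim(ttl_seconds=300)
# Calls at 0 s, 4 min, and 8 min, followed by a 10-minute gap.
timeline = [cache.request(t) for t in (0, 240, 480, 1080)]
print(timeline)  # ['miss', 'hit', 'hit', 'miss']
```

Calls spaced under five minutes apart keep the entry alive, while the long gap before the last call lets it expire and forces a new cache write.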

## Related

* [Metrics CLI](/cli/metrics)
* [Model Capabilities](/features/model-capabilities)
* [Telemetry CLI](/cli/telemetry)
