# Benchmark

> Comprehensive performance benchmarking for PraisonAI

# Benchmark CLI

The `praisonai benchmark` command times each PraisonAI execution path (import, initialization, network, and total) and reports the difference from a raw OpenAI SDK baseline.

## Quick Start

```bash theme={"theme":{"light":"vitesse-light","dark":"vitesse-dark"}}
# Quick comparison of key paths
praisonai benchmark compare "Hi"

# Full benchmark suite (all 8 paths)
praisonai benchmark profile "What is 2+2?"

# Benchmark specific paths
praisonai benchmark agent "Hi"
praisonai benchmark cli "Hi"
praisonai benchmark workflow "Hi"
praisonai benchmark litellm "Hi"
```

## Commands

### `benchmark profile`

Run the full benchmark suite across all execution paths.

```bash theme={"theme":{"light":"vitesse-light","dark":"vitesse-dark"}}
praisonai benchmark profile "What is 2+2?" --iterations 3
```

**Options:**

* `--iterations, -n`: Number of iterations per path (default: 3)
* `--format, -f`: Output format: `text` or `json` (default: text)
* `--output, -o`: Save results to file

**Paths benchmarked:**

1. OpenAI SDK (baseline)
2. PraisonAI Agent
3. PraisonAI CLI
4. PraisonAI CLI with profiling
5. PraisonAI Workflow (single agent)
6. PraisonAI Workflow (multi-agent)
7. PraisonAI via LiteLLM
8. LiteLLM standalone

### `benchmark compare`

Quick comparison of key execution paths.

```bash theme={"theme":{"light":"vitesse-light","dark":"vitesse-dark"}}
praisonai benchmark compare "Hi" --iterations 2
```

Compares: OpenAI SDK, PraisonAI Agent, PraisonAI CLI, LiteLLM standalone.

### `benchmark sdk`

Benchmark OpenAI SDK only (baseline).

```bash theme={"theme":{"light":"vitesse-light","dark":"vitesse-dark"}}
praisonai benchmark sdk "Hi" --iterations 3 --format json
```

### `benchmark agent`

Benchmark PraisonAI Agent vs SDK baseline.

```bash theme={"theme":{"light":"vitesse-light","dark":"vitesse-dark"}}
praisonai benchmark agent "Hi" --iterations 3
```

### `benchmark cli`

Benchmark PraisonAI CLI vs SDK baseline.

```bash theme={"theme":{"light":"vitesse-light","dark":"vitesse-dark"}}
praisonai benchmark cli "Hi" --iterations 3
```

### `benchmark workflow`

Benchmark PraisonAI Workflow (single and multi-agent) vs SDK baseline.

```bash theme={"theme":{"light":"vitesse-light","dark":"vitesse-dark"}}
praisonai benchmark workflow "Hi" --iterations 3
```

### `benchmark litellm`

Benchmark LiteLLM paths vs SDK baseline.

```bash theme={"theme":{"light":"vitesse-light","dark":"vitesse-dark"}}
praisonai benchmark litellm "Hi" --iterations 3
```

## Output Formats

### Text Output (Default)

```
======================================================================
## Master Comparison Table
+------------------------------+----------+----------+----------+----------+------------+
| Path                         |   Import |     Init |  Network |    Total |      Δ SDK |
+------------------------------+----------+----------+----------+----------+------------+
| praisonai_agent              |    373ms |      0ms |    808ms |   1182ms |      -88ms |
| openai_sdk                   |    290ms |     40ms |    939ms |   1269ms |   baseline |
+------------------------------+----------+----------+----------+----------+------------+
```

### JSON Output

```bash theme={"theme":{"light":"vitesse-light","dark":"vitesse-dark"}}
praisonai benchmark agent "Hi" --format json > results.json
```

```json theme={"theme":{"light":"vitesse-light","dark":"vitesse-dark"}}
{
  "timestamp": "2026-01-02T06:14:46.182126Z",
  "prompt": "Hi",
  "iterations": 3,
  "sdk_baseline_ms": 1269.0,
  "results": {
    "openai_sdk": {
      "path_name": "openai_sdk",
      "mean_total_ms": 1269.0,
      "min_total_ms": 1185.0,
      "max_total_ms": 1354.0,
      "std_total_ms": 119.0,
      "mean_import_ms": 290.0,
      "mean_init_ms": 40.0,
      "mean_network_ms": 939.0,
      "cold_total_ms": 1354.0,
      "warm_total_ms": 1185.0,
      "delta_vs_sdk_ms": 0.0
    }
  }
}
```
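
Once saved, the JSON report can be post-processed with the standard library. The following is a minimal sketch, assuming the report has the shape shown above; `summarize` is a hypothetical helper (not part of PraisonAI), and the sample is inlined so the snippet runs without a `results.json` file.

```python
import json

# Inline sample mirroring the JSON shown above. In practice, read the saved
# file instead, e.g. with open("results.json") and json.load().
SAMPLE = """
{
  "prompt": "Hi",
  "iterations": 3,
  "sdk_baseline_ms": 1269.0,
  "results": {
    "openai_sdk": {
      "path_name": "openai_sdk",
      "mean_total_ms": 1269.0,
      "std_total_ms": 119.0,
      "delta_vs_sdk_ms": 0.0
    }
  }
}
"""


def summarize(report: dict) -> list[str]:
    """Return one line per path, fastest mean first (hypothetical helper)."""
    rows = sorted(report["results"].values(), key=lambda r: r["mean_total_ms"])
    return [
        f'{r["path_name"]}: {r["mean_total_ms"]:.0f}ms '
        f'(±{r["std_total_ms"]:.0f}ms, Δ SDK {r["delta_vs_sdk_ms"]:+.0f}ms)'
        for r in rows
    ]


report = json.loads(SAMPLE)
for line in summarize(report):
    print(line)
```

Sorting by `mean_total_ms` puts the fastest path first, which is convenient when a report contains all eight paths.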

## Timeline Diagrams

Each benchmark path includes an ASCII timeline diagram showing execution phases:

```
ENTER ───────────────────────────────────────────────────► RESPONSE
      │    import    │init│             network             │
      │    373ms     │0ms│              808ms              │
      └──────────────┴┴─────────────────────────────────┘
                                        TOTAL: 1182ms
```

## Variance Analysis

The report also includes per-path statistics across iterations: mean, minimum, maximum, standard deviation, and cold versus warm totals:

```
+------------------------------+----------+----------+----------+----------+------------+
| Path                         |     Mean |      Min |      Max |   StdDev |  Cold/Warm |
+------------------------------+----------+----------+----------+----------+------------+
| praisonai_agent              |   1182ms |   1138ms |   1225ms |     62ms |  1138/1225 |
| openai_sdk                   |   1269ms |   1185ms |   1354ms |    119ms |  1354/1185 |
+------------------------------+----------+----------+----------+----------+------------+
```
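
The columns above can be approximated from per-iteration totals with Python's `statistics` module. This is a sketch under stated assumptions, not the tool's implementation: it assumes the first iteration is the cold run, the last is the warm run, and that StdDev is the sample standard deviation. `variance_row` and its input list are hypothetical.

```python
import statistics


def variance_row(totals_ms: list[float]) -> dict[str, float]:
    """Summarize per-iteration totals (hypothetical helper, not the tool's API).

    Assumes the first iteration is the cold run and the last is the warm run,
    and uses the sample standard deviation.
    """
    if len(totals_ms) < 2:
        raise ValueError("need at least two iterations for a standard deviation")
    return {
        "mean": statistics.mean(totals_ms),
        "min": min(totals_ms),
        "max": max(totals_ms),
        "stdev": statistics.stdev(totals_ms),
        "cold": totals_ms[0],
        "warm": totals_ms[-1],
    }


# Illustrative input: the cold and warm totals from the openai_sdk row above.
row = variance_row([1354.0, 1185.0])
print(f'mean={row["mean"]:.1f}ms stdev={row["stdev"]:.1f}ms')
```

With only a handful of iterations the standard deviation is a rough indicator, so it helps to read it alongside the Min and Max columns.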

## Overhead Classification

The benchmark classifies overhead into categories:

* **Unavoidable**: Network latency, TLS handshake, provider response time
* **Framework**: praisonaiagents import, LiteLLM import/config
* **CLI**: Subprocess spawn, Python startup, argument parsing
* **Profiling**: cProfile overhead when `--profile` is enabled
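
To illustrate how these categories relate to the phase timings, the sketch below groups the per-phase differences between the `praisonai_agent` and `openai_sdk` rows of the Master Comparison Table. The phase-to-category mapping is an assumption made for illustration (in particular, treating `init` as framework overhead); the tool's own classification logic is not shown on this page.

```python
# Phase timings (ms) taken from the Master Comparison Table above.
BASELINE = {"import": 290, "init": 40, "network": 939}
AGENT = {"import": 373, "init": 0, "network": 808}

# Assumed mapping from timing phase to overhead category (illustrative only).
PHASE_CATEGORY = {
    "import": "Framework",
    "init": "Framework",
    "network": "Unavoidable",
}


def overhead_by_category(path: dict[str, int], baseline: dict[str, int]) -> dict[str, int]:
    """Sum per-phase deltas against the baseline, grouped by category."""
    totals: dict[str, int] = {}
    for phase, category in PHASE_CATEGORY.items():
        delta = path[phase] - baseline[phase]
        totals[category] = totals.get(category, 0) + delta
    return totals


print(overhead_by_category(AGENT, BASELINE))
```

In this sample the network phase accounts for most of the difference, which suggests provider response time, not the framework, dominated that particular run.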

## Python API Usage

```python theme={"theme":{"light":"vitesse-light","dark":"vitesse-dark"}}
import json

from praisonai.cli.features.benchmark import BenchmarkHandler

handler = BenchmarkHandler()

# Run full benchmark
report = handler.run_full_benchmark(
    prompt="What is 2+2?",
    iterations=3,
)

# Print report
handler.print_report(report)

# Get comparison table
print(handler.create_comparison_table(report))

# Get variance analysis
print(handler.create_variance_table(report))

# Export to JSON
print(json.dumps(report.to_dict(), indent=2))
```

## Example Script

```python theme={"theme":{"light":"vitesse-light","dark":"vitesse-dark"}}
#!/usr/bin/env python3
"""Benchmark PraisonAI Agent vs OpenAI SDK."""

from praisonai.cli.features.benchmark import BenchmarkHandler

handler = BenchmarkHandler()

# Benchmark agent vs SDK
report = handler.run_full_benchmark(
    prompt="Explain Python in one sentence",
    iterations=3,
    paths=["openai_sdk", "praisonai_agent"],
)

# Show results
for name, result in report.results.items():
    print(f"\n{name}: {result.mean_total_ms:.0f}ms (±{result.std_total_ms:.0f}ms)")
```

## Deep Profiling (--deep)

The `--deep` flag enables comprehensive cProfile-based profiling, providing:

* **Per-function timing** with self time and cumulative time
* **Call counts** for each function
* **Module breakdown** by category (PraisonAI, Agent, Network, Third-party)
* **Call graph data** with caller/callee relationships

### Deep Profile Output

```
## Deep Profile: Top Functions by Cumulative Time
--------------------------------------------------------------------------------
Function                                         Calls    Self (ms)   Cumul (ms)
--------------------------------------------------------------------------------
start                                                1         0.03       875.69
chat                                                 1         0.03       875.66
_chat_completion                                     1         0.02       875.57
create_completion                                    1         0.01       875.54
--------------------------------------------------------------------------------

## Module Breakdown (by cumulative time)
------------------------------------------------------------
PraisonAI Agent Modules:
  .../praisonaiagents/agent/agent.py                        2626.92ms

Network Modules:
  .../openai/_base_client.py                                1712.23ms
  .../httpx/_client.py                                       855.20ms

## Call Graph: 1293 edges
```
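
For readers unfamiliar with cProfile, the sketch below shows how call counts, self time, and cumulative time can be collected with the standard library alone. It is not PraisonAI's implementation; `fake_request` and `chat` are hypothetical stand-ins for a real agent call.

```python
import cProfile
import pstats
import time


def fake_request() -> str:
    """Stand-in for a network call (hypothetical workload)."""
    time.sleep(0.01)
    return "ok"


def chat() -> str:
    """Stand-in for an agent method that wraps the request."""
    return fake_request()


profiler = cProfile.Profile()
profiler.enable()
chat()
profiler.disable()

stats = pstats.Stats(profiler)
# stats.stats maps (file, line, function) -> (primitive calls, total calls,
# self time, cumulative time, callers); times are in seconds.
rows = sorted(stats.stats.items(), key=lambda item: item[1][3], reverse=True)
for (filename, line, name), (_, calls, self_s, cumul_s, _callers) in rows[:5]:
    print(f"{name:<40} {calls:>5} {self_s * 1000:>10.2f} {cumul_s * 1000:>10.2f}")
```

Cumulative time includes time spent in callees, which is why wrapper functions such as `start` and `chat` in the output above show almost no self time but a large cumulative time.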

### Deep Profile JSON Schema

```json theme={"theme":{"light":"vitesse-light","dark":"vitesse-dark"}}
{
  "results": {
    "praisonai_agent": {
      "functions": [
        {
          "name": "start",
          "file": "/path/to/agent.py",
          "line": 123,
          "calls": 1,
          "total_time_ms": 0.03,
          "cumulative_time_ms": 875.69
        }
      ],
      "call_graph": {
        "callers": {"func:file:line": ["caller1", "caller2"]},
        "callees": {"func:file:line": ["callee1", "callee2"]},
        "edge_count": 1293
      },
      "module_breakdown": {
        "praisonai": [{"file": "...", "cumulative_ms": 100.0}],
        "agent": [{"file": "...", "cumulative_ms": 200.0}],
        "network": [{"file": "...", "cumulative_ms": 300.0}],
        "third_party": [{"file": "...", "cumulative_ms": 50.0}]
      }
    }
  }
}
```
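
The deep-profile JSON can be queried in the same way as the regular report. The following sketch assumes the structure shown above; `top_functions` and `category_totals` are hypothetical helpers, and the inline sample reuses the placeholder values from this page.

```python
import json

# Inline sample mirroring the structure shown above (placeholder values).
SAMPLE = """
{
  "results": {
    "praisonai_agent": {
      "functions": [
        {"name": "chat", "calls": 1, "total_time_ms": 0.03, "cumulative_time_ms": 875.66},
        {"name": "start", "calls": 1, "total_time_ms": 0.03, "cumulative_time_ms": 875.69}
      ],
      "module_breakdown": {
        "praisonai": [{"file": "...", "cumulative_ms": 100.0}],
        "agent": [{"file": "...", "cumulative_ms": 200.0}],
        "network": [{"file": "...", "cumulative_ms": 300.0}],
        "third_party": [{"file": "...", "cumulative_ms": 50.0}]
      }
    }
  }
}
"""


def top_functions(path_result: dict, n: int = 3) -> list[tuple[str, float]]:
    """Return the n functions with the highest cumulative time."""
    ranked = sorted(
        path_result["functions"],
        key=lambda f: f["cumulative_time_ms"],
        reverse=True,
    )
    return [(f["name"], f["cumulative_time_ms"]) for f in ranked[:n]]


def category_totals(path_result: dict) -> dict[str, float]:
    """Sum cumulative_ms per module category."""
    return {
        category: sum(entry["cumulative_ms"] for entry in entries)
        for category, entries in path_result["module_breakdown"].items()
    }


deep = json.loads(SAMPLE)["results"]["praisonai_agent"]
print(top_functions(deep))
print(category_totals(deep))
```

Cumulative times of nested calls overlap, so per-category sums are best read as relative indicators and should not be expected to add up to the wall-clock total.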

## Best Practices

1. **Run multiple iterations**: Use at least 3 iterations for reliable statistics
2. **Account for cold starts**: First run is typically slower due to imports
3. **Use consistent prompts**: Same prompt across paths for fair comparison
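4. **Check network variance**: Network latency can vary significantly between runs, so compare the Network and StdDev columns before attributing a difference to the framework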
5. **Save JSON results**: Use `--format json` for programmatic analysis
6. **Use `--deep` for debugging**: Deep profiling adds overhead but provides function-level insights

## See Also

* [Profile Command](/cli/profiling) - Detailed function-level profiling
* [Doctor Command](/cli/doctor) - Health checks and diagnostics
