# Profiling

> Comprehensive performance profiling for PraisonAI agents

PraisonAI includes a profiling module for measuring and analyzing agent performance. It covers function execution time, API call latency, streaming latency, memory usage, module import times, and cProfile or line-level detail.

## Quick Start

```python theme={"theme":{"light":"vitesse-light","dark":"vitesse-dark"}}
from praisonai.profiler import Profiler, profile
from praisonaiagents import Agent

# Enable profiling
Profiler.enable()

# Profile a function
@profile
def my_agent_task():
    # Your agent code here
    pass

# Profile a block of code
with Profiler.block("agent_initialization"):
    agent = Agent(instructions="You are helpful")

# Get report
Profiler.report()
```

## Features

<CardGroup cols={2}>
  <Card title="Function Profiling" icon="function">
    Measure execution time of any function with decorators
  </Card>

  <Card title="API Call Profiling" icon="globe">
    Track wall-clock time for HTTP/API calls
  </Card>

  <Card title="Streaming Profiling" icon="stream">
    Measure Time To First Token (TTFT) and total streaming time
  </Card>

  <Card title="Memory Profiling" icon="memory">
    Track memory usage with tracemalloc integration
  </Card>

  <Card title="Statistics" icon="chart-line">
    Get p50, p95, p99 percentiles and statistical analysis
  </Card>

  <Card title="Export" icon="file-export">
    Export reports as JSON, HTML, or SVG flamegraphs
  </Card>
</CardGroup>

## Production-Ready Bounded Buffers

```mermaid theme={"theme":{"light":"vitesse-light","dark":"vitesse-dark"}}
graph LR
    subgraph "Memory-Safe Profiling"
        A[🤖 Agent runs] --> B[📊 Record timing]
        B --> C{🧱 Buffer full?}
        C -->|No| D[💾 Store in deque]
        C -->|Yes| E[🗑️ Evict oldest]
        E --> D
        D --> F[📤 Export recent data]
    end
    
    classDef agent fill:#8B0000,stroke:#7C90A0,color:#fff
    classDef process fill:#189AB4,stroke:#7C90A0,color:#fff
    classDef storage fill:#F59E0B,stroke:#7C90A0,color:#fff
    classDef output fill:#10B981,stroke:#7C90A0,color:#fff
    
    class A agent
    class B,C,E process
    class D storage
    class F output
```

PraisonAI's profiler uses **bounded ring buffers** to prevent memory leaks in long-running production workloads. Each profiling buffer (`_timings`, `_imports`, `_flow`, `_api_calls`, `_streaming`, `_memory`, `_cprofile_stats`) is implemented as `deque(maxlen=PRAISONAI_PROFILE_MAX)`.

**Key benefits:**

* **Memory-safe**: Fixed maximum memory usage regardless of runtime
* **Production-ready**: No unbounded growth in long-running agents
* **Configurable**: Adjust `PRAISONAI_PROFILE_MAX` based on your RAM budget
* **Ring buffer**: Only keeps the most recent N records per buffer
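
The eviction behavior described above is what Python's standard `collections.deque` does when given a `maxlen`. A minimal standalone sketch, using an illustrative buffer name and a tiny size rather than the profiler's own internals:

```python
from collections import deque

# Illustrative buffer; the profiler's buffers are sized by PRAISONAI_PROFILE_MAX.
MAX_RECORDS = 3
timings = deque(maxlen=MAX_RECORDS)

for duration_ms in [12.5, 8.1, 30.2, 4.7, 19.9]:
    # Once the deque is full, each append silently drops the oldest record.
    timings.append(duration_ms)

print(list(timings))  # [30.2, 4.7, 19.9]
print(len(timings))   # 3
```

Because eviction happens inside `append`, the length never exceeds `maxlen`, which is what keeps memory bounded.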

## Environment Variables

Enable profiling globally and configure buffer size:

```bash theme={"theme":{"light":"vitesse-light","dark":"vitesse-dark"}}
export PRAISONAI_PROFILE=1
export PRAISONAI_PROFILE_MAX=10000  # Set max records per buffer
```

| Variable                | Type        | Default | Description                                                                                                                                                                                          |
| ----------------------- | ----------- | ------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `PRAISONAI_PROFILE`     | `int`       | `0`     | Enable (1) or disable (0) profiling                                                                                                                                                                  |
| `PRAISONAI_PROFILE_MAX` | `int` (≥ 1) | `10000` | Max records kept per buffer (`_timings`, `_imports`, `_flow`, `_api_calls`, `_streaming`, `_memory`, `_cprofile_stats`). When full, oldest records are dropped. Invalid values fall back to `10000`. |
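
The fallback rule in the table can be pictured with a short sketch. `read_profile_max` is a hypothetical helper written for illustration, not a function exported by `praisonai.profiler`, and the real parsing logic may differ in detail:

```python
import os

DEFAULT_MAX = 10000

def read_profile_max(env=None):
    """Hypothetical helper: parse PRAISONAI_PROFILE_MAX with a safe fallback."""
    env = os.environ if env is None else env
    raw = env.get("PRAISONAI_PROFILE_MAX", "")
    try:
        value = int(raw)
    except ValueError:
        return DEFAULT_MAX  # empty or non-numeric value
    # Values below 1 are treated as invalid as well.
    return value if value >= 1 else DEFAULT_MAX

print(read_profile_max({"PRAISONAI_PROFILE_MAX": "500"}))  # 500
print(read_profile_max({"PRAISONAI_PROFILE_MAX": "abc"}))  # 10000
print(read_profile_max({"PRAISONAI_PROFILE_MAX": "0"}))    # 10000
print(read_profile_max({}))                                # 10000
```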

## Function Profiling

### Basic Decorator

```python theme={"theme":{"light":"vitesse-light","dark":"vitesse-dark"}}
from praisonai.profiler import profile, profile_async

@profile
def sync_function():
    """Automatically profiled when enabled."""
    return "result"

@profile_async
async def async_function():
    """Profile async functions."""
    return await some_async_call()

# With custom category
@profile(category="llm_call")
def call_llm():
    pass
```

### Block Profiling

```python theme={"theme":{"light":"vitesse-light","dark":"vitesse-dark"}}
from praisonai.profiler import Profiler

with Profiler.block("data_processing"):
    # Code to profile
    process_data()

with Profiler.block("model_inference", category="inference"):
    result = model.predict(data)
```

## API Call Profiling

Track HTTP/API call latency with wall-clock time:

```python theme={"theme":{"light":"vitesse-light","dark":"vitesse-dark"}}
import requests
from praisonai.profiler import Profiler, profile_api

# Using decorator (`client` is assumed to be an already-configured OpenAI client)
@profile_api(endpoint="openai/chat/completions")
def call_openai():
    response = client.chat.completions.create(...)
    return response

# Using context manager (`data` is your JSON request payload)
url = "https://api.openai.com/v1/chat/completions"
with Profiler.api_call(url, method="POST") as call:
    response = requests.post(url, json=data)
    call['status_code'] = response.status_code
    call['response_size'] = len(response.content)

# Get all API calls
api_calls = Profiler.get_api_calls()
for call in api_calls:
    print(f"{call.endpoint}: {call.duration_ms:.2f}ms")
```

## Streaming Profiling

Measure Time To First Token (TTFT) and streaming performance:

```python theme={"theme":{"light":"vitesse-light","dark":"vitesse-dark"}}
from praisonai.profiler import Profiler, StreamingTracker

# Using context manager
with Profiler.streaming("chat_completion") as tracker:
    for chunk in stream:
        if tracker._first_token_time is None:
            tracker.first_token()  # Mark TTFT
        tracker.chunk()
        process(chunk)

# Manual tracking
tracker = StreamingTracker("my_stream")
tracker.start()

for i, chunk in enumerate(response_stream):
    if i == 0:
        tracker.first_token()
    tracker.chunk()
    
tracker.end(total_tokens=150)

# Get streaming records
streams = Profiler.get_streaming_records()
for s in streams:
    print(f"TTFT: {s.ttft_ms:.2f}ms, Total: {s.total_ms:.2f}ms, Chunks: {s.chunk_count}")
```
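
To make the TTFT and total-time measurements concrete, here is a standalone sketch that times a simulated stream with `time.perf_counter`. `fake_stream` is a hypothetical stand-in for an LLM response stream, and the sketch does not reflect how `StreamingTracker` is implemented internally:

```python
import time

def fake_stream(n_chunks=5, delay_s=0.01):
    """Hypothetical stand-in for a token stream from an LLM."""
    for i in range(n_chunks):
        time.sleep(delay_s)
        yield f"token-{i}"

start = time.perf_counter()
first_token_at = None
chunk_count = 0

for chunk in fake_stream():
    if first_token_at is None:
        first_token_at = time.perf_counter()  # first chunk arrived
    chunk_count += 1

end = time.perf_counter()
ttft_ms = (first_token_at - start) * 1000  # time to first token
total_ms = (end - start) * 1000            # full streaming duration
print(f"TTFT: {ttft_ms:.2f}ms, Total: {total_ms:.2f}ms, Chunks: {chunk_count}")
```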

## Memory Profiling

Track memory usage with tracemalloc:

```python theme={"theme":{"light":"vitesse-light","dark":"vitesse-dark"}}
from praisonai.profiler import Profiler

# Profile memory for a block
with Profiler.memory("agent_creation"):
    agent = Agent(instructions="...", tools=[...])

# Get memory records
memories = Profiler.get_memory_records()
for m in memories:
    print(f"{m.name}: current={m.current_kb:.1f}KB, peak={m.peak_kb:.1f}KB")

# Take a snapshot
snapshot = Profiler.memory_snapshot()
print(f"Current: {snapshot['current_kb']:.1f}KB")
print(f"Peak: {snapshot['peak_kb']:.1f}KB")
```
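
For reference, the underlying standard-library calls look roughly like this. This is a sketch of plain `tracemalloc` usage with an illustrative workload, not the profiler's own implementation:

```python
import tracemalloc

tracemalloc.start()

# Illustrative workload: about 1 MB of small byte strings, kept alive in `data`.
data = [bytes(1024) for _ in range(1000)]

# get_traced_memory() returns (current, peak) in bytes since start().
current_bytes, peak_bytes = tracemalloc.get_traced_memory()
tracemalloc.stop()

current_kb = current_bytes / 1024
peak_kb = peak_bytes / 1024
print(f"current={current_kb:.1f}KB, peak={peak_kb:.1f}KB")
```

Note that `tracemalloc` only tracks allocations made through Python's memory allocator, so memory held by native extensions may not appear in these numbers.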

## Statistics

Get statistical analysis of profiling data:

```python theme={"theme":{"light":"vitesse-light","dark":"vitesse-dark"}}
from praisonai.profiler import Profiler

# Get overall statistics
stats = Profiler.get_statistics()
print(f"P50 (Median): {stats['p50']:.2f}ms")
print(f"P95: {stats['p95']:.2f}ms")
print(f"P99: {stats['p99']:.2f}ms")
print(f"Mean: {stats['mean']:.2f}ms")
print(f"Std Dev: {stats['std_dev']:.2f}ms")

# Get statistics for specific category
api_stats = Profiler.get_statistics(category="api")
llm_stats = Profiler.get_statistics(category="llm_call")
```
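
How `get_statistics()` computes its percentiles is not specified on this page. As a point of comparison, the same kind of summary can be produced with the standard `statistics` module. The sample durations below are made up for illustration:

```python
import statistics

# Made-up timing samples in milliseconds, including one slow outlier.
durations_ms = [12.0, 15.0, 11.0, 14.0, 250.0, 13.0, 16.0, 12.5, 13.5, 14.5]

# quantiles(n=100) returns 99 cut points; index k-1 is the k-th percentile.
cuts = statistics.quantiles(durations_ms, n=100, method="inclusive")

summary = {
    "p50": cuts[49],
    "p95": cuts[94],
    "p99": cuts[98],
    "mean": statistics.mean(durations_ms),
    "std_dev": statistics.stdev(durations_ms),
}

for key, value in summary.items():
    print(f"{key}: {value:.2f}ms")
```

The single outlier pulls the mean well above the median, which is why tail percentiles such as p95 and p99 are usually reported alongside it.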

## cProfile Integration

For detailed function-level profiling:

```python theme={"theme":{"light":"vitesse-light","dark":"vitesse-dark"}}
from praisonai.profiler import Profiler, profile_detailed

# Using decorator
@profile_detailed
def heavy_computation():
    return sum(i * i for i in range(100000))

# Using context manager
with Profiler.cprofile("agent_run") as stats:
    result = agent.run()

# Get cProfile stats
cprofile_data = Profiler.get_cprofile_stats()
for entry in cprofile_data:
    print(f"Operation: {entry['name']}")
    print(entry['stats'])
```
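
For comparison, the equivalent workflow with the standard library's `cProfile` and `pstats` modules is sketched below. It shows the kind of data a cProfile run produces and does not depend on PraisonAI:

```python
import cProfile
import io
import pstats

def heavy_computation():
    return sum(i * i for i in range(100000))

profiler = cProfile.Profile()
profiler.enable()
result = heavy_computation()
profiler.disable()

# Render the five most expensive entries (by cumulative time) into a string.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```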

## Line-Level Profiling

Profile individual lines (requires the `line_profiler` package):

```python theme={"theme":{"light":"vitesse-light","dark":"vitesse-dark"}}
from praisonai.profiler import Profiler, profile_lines

@profile_lines
def detailed_function():
    a = expensive_operation_1()  # Line timing
    b = expensive_operation_2()  # Line timing
    return a + b

# Get line profile data
line_data = Profiler.get_line_profile_data()
```

<Note>
  Install `line_profiler` for full functionality: `pip install line_profiler`
</Note>

## Reports and Export

### Console Report

```python theme={"theme":{"light":"vitesse-light","dark":"vitesse-dark"}}
from praisonai.profiler import Profiler

# Print to console
Profiler.report()

# Get as string
report_text = Profiler.report(output="string")
```

### JSON Export

```python theme={"theme":{"light":"vitesse-light","dark":"vitesse-dark"}}
# Export as JSON string
json_report = Profiler.export_json()

# Save to file
Profiler.export_to_file("profile_report.json", format="json")
```

### HTML Export

```python theme={"theme":{"light":"vitesse-light","dark":"vitesse-dark"}}
# Export as HTML string
html_report = Profiler.export_html()

# Save to file
Profiler.export_to_file("profile_report.html", format="html")
```

### Flamegraph

```python theme={"theme":{"light":"vitesse-light","dark":"vitesse-dark"}}
# Export flamegraph as SVG
Profiler.export_flamegraph("profile.svg")
```

<Tip>
  For production flamegraphs, use py-spy:

  ```bash theme={"theme":{"light":"vitesse-light","dark":"vitesse-dark"}}
  py-spy record -o profile.svg -- python your_script.py
  ```
</Tip>

## Import Profiling

Profile module import times:

```python theme={"theme":{"light":"vitesse-light","dark":"vitesse-dark"}}
from praisonai.profiler import profile_imports, time_import

# Profile imports in a block
with profile_imports() as profiler:
    import pandas
    import numpy
    from praisonaiagents import Agent

# Get slowest imports
slowest = profiler.get_slowest(n=5)
for imp in slowest:
    print(f"{imp.module}: {imp.duration_ms:.2f}ms")

# Quick single import timing
duration = time_import("torch")
print(f"torch import: {duration:.2f}ms")
```
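
A rough standard-library equivalent of timing a single import is sketched below. `time_module_import` is a hypothetical helper, not the library's `time_import`:

```python
import importlib
import sys
import time

def time_module_import(module_name):
    """Hypothetical helper: return the wall-clock import time in milliseconds."""
    start = time.perf_counter()
    importlib.import_module(module_name)
    return (time.perf_counter() - start) * 1000

duration_ms = time_module_import("json")
print(f"json import: {duration_ms:.2f}ms")
print("cached:", "json" in sys.modules)
```

A module that is already in `sys.modules` returns almost instantly, so measure imports in a fresh interpreter when you need cold-start numbers. Running `python -X importtime your_script.py` prints a per-module breakdown without any code changes.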

## Zero Performance Impact

When profiling is disabled, there is **zero performance overhead**:

```python theme={"theme":{"light":"vitesse-light","dark":"vitesse-dark"}}
from praisonai.profiler import Profiler, profile

Profiler.disable()  # Profiling off

@profile
def fast_function():
    return 1 + 1

# No overhead - decorator is a no-op when disabled
for _ in range(1000000):
    fast_function()  # Full speed
```
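
How the decorator behaves internally when disabled is not shown on this page. One common pattern, sketched here with hypothetical names (`state`, `records`, `sketch_profile`), is to check a flag and call straight through, so the disabled path costs only a flag check and one wrapper call:

```python
import functools
import time

state = {"enabled": False}  # hypothetical global switch
records = []                # hypothetical timing store

def sketch_profile(func):
    """Illustrative decorator: records timings only while enabled."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if not state["enabled"]:
            return func(*args, **kwargs)  # fast path, nothing recorded
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000
            records.append((func.__name__, elapsed_ms))
    return wrapper

@sketch_profile
def fast_function():
    return 1 + 1

fast_function()
disabled_count = len(records)  # nothing recorded while disabled

state["enabled"] = True
fast_function()
enabled_count = len(records)   # one record after enabling
print(disabled_count, enabled_count)
```

To confirm the cost in your own environment, compare a decorated and an undecorated function with `timeit`.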

## Complete Example

```python theme={"theme":{"light":"vitesse-light","dark":"vitesse-dark"}}
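from openai import OpenAI  # assumes the `openai` package is installed
from praisonai.profiler import Profiler, profile, profile_api
from praisonaiagents import Agent

client = OpenAI()  # reads OPENAI_API_KEY from the environment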

# Enable profiling
Profiler.enable()

@profile_api(endpoint="openai/chat")
def create_completion(prompt):
    return client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}]
    )

@profile(category="agent")
def run_agent(task):
    with Profiler.memory("agent_init"):
        agent = Agent(instructions="You are helpful")
    
    with Profiler.block("agent_execution"):
        result = agent.chat(task)
    
    return result

# Run with profiling
result = run_agent("Explain quantum computing")

# Get comprehensive report
print("\n=== Profiling Report ===")
Profiler.report()

# Get statistics
stats = Profiler.get_statistics()
print(f"\nP95 Latency: {stats['p95']:.2f}ms")

# Export detailed report
Profiler.export_to_file("agent_profile.html", format="html")
Profiler.export_flamegraph("agent_flamegraph.svg")

# Cleanup
Profiler.disable()
Profiler.clear()
```

## Production Best Practices

<AccordionGroup>
  <Accordion title="Set PRAISONAI_PROFILE_MAX based on RAM budget">
    Each record uses roughly 100-200 bytes. At the default of 10,000 records, that is about 1-2 MB per buffer, or roughly 7-14 MB across all 7 buffers. For memory-constrained environments, use smaller values.

    ```bash theme={"theme":{"light":"vitesse-light","dark":"vitesse-dark"}}
    # For high-memory production servers
    export PRAISONAI_PROFILE_MAX=100000

    # For memory-constrained containers  
    export PRAISONAI_PROFILE_MAX=1000

    # For development (default)
    export PRAISONAI_PROFILE_MAX=10000
    ```
  </Accordion>

  <Accordion title="Disable profiling by default in production">
    Only enable profiling when debugging performance issues. Production agents should avoid the overhead.

    ```python theme={"theme":{"light":"vitesse-light","dark":"vitesse-dark"}}
    import os
    from praisonai.profiler import Profiler

    # Only enable in debug/staging environments
    if os.environ.get("DEBUG") == "true" or os.environ.get("ENVIRONMENT") == "staging":
        Profiler.enable()
    ```
  </Accordion>

  <Accordion title="Export periodically for long-running processes">
    Ring buffers only retain recent records. Export data periodically to avoid losing historical performance data.

    ```python theme={"theme":{"light":"vitesse-light","dark":"vitesse-dark"}}
    import time
    import threading
    from praisonai.profiler import Profiler

    def periodic_export():
        """Export profiling data every hour."""
        while True:
            time.sleep(3600)  # 1 hour
            if Profiler.is_enabled():
                filename = f"profile_{int(time.time())}.json"
                Profiler.export_to_file(filename, format="json")
                print(f"Exported profiling data to {filename}")

    # Start background export thread
    export_thread = threading.Thread(target=periodic_export, daemon=True)
    export_thread.start()
    ```
  </Accordion>

  <Accordion title="Monitor buffer overflow in production">
    Check if you're hitting buffer limits and adjust `PRAISONAI_PROFILE_MAX` accordingly.

    ```python theme={"theme":{"light":"vitesse-light","dark":"vitesse-dark"}}
    import os
    from praisonai.profiler import Profiler

    # Check buffer usage
    timing_records = len(Profiler.get_timings())
    api_records = len(Profiler.get_api_calls())
    max_buffer = int(os.environ.get("PRAISONAI_PROFILE_MAX", "10000"))

    print(f"Buffer usage: {timing_records}/{max_buffer} timing records")
    print(f"API calls: {api_records}/{max_buffer} records")

    # If close to max, consider increasing buffer size or exporting more frequently
    if timing_records > max_buffer * 0.8:
        print("Warning: Timing buffer nearly full, consider increasing PRAISONAI_PROFILE_MAX")
    ```
  </Accordion>
</AccordionGroup>
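
The RAM-budget estimate in the first practice above is simple arithmetic. The sketch below reproduces it for several buffer sizes, using the rough 100-200 bytes-per-record figure and the 7 buffers from this page. `estimate_mb` is an illustrative helper, and real usage depends on what each record stores:

```python
BYTES_PER_RECORD = (100, 200)  # rough per-record range quoted above
NUM_BUFFERS = 7

def estimate_mb(max_records):
    """Return the (low, high) estimate in MB across all buffers."""
    low, high = (max_records * b * NUM_BUFFERS / 1_000_000 for b in BYTES_PER_RECORD)
    return low, high

for max_records in (1000, 10000, 100000):
    low, high = estimate_mb(max_records)
    print(f"PRAISONAI_PROFILE_MAX={max_records}: about {low:g}-{high:g} MB")
```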

## API Reference

### Profiler Class Methods

| Method                    | Description                             |
| ------------------------- | --------------------------------------- |
| `enable()`                | Enable profiling                        |
| `disable()`               | Disable profiling                       |
| `clear()`                 | Clear all profiling data                |
| `is_enabled()`            | Check if profiling is enabled           |
| `record_timing()`         | Record a timing measurement             |
| `record_api_call()`       | Record an API call                      |
| `record_streaming()`      | Record streaming metrics                |
| `record_memory()`         | Record memory usage                     |
| `block()`                 | Context manager for block profiling     |
| `api_call()`              | Context manager for API call profiling  |
| `streaming()`             | Context manager for streaming profiling |
| `memory()`                | Context manager for memory profiling    |
| `cprofile()`              | Context manager for cProfile            |
| `get_timings()`           | Get recorded timing records             |
| `get_api_calls()`         | Get recorded API calls                  |
| `get_streaming_records()` | Get recorded streaming records          |
| `get_memory_records()`    | Get recorded memory records             |
| `memory_snapshot()`       | Take a current/peak memory snapshot     |
| `get_cprofile_stats()`    | Get collected cProfile stats            |
| `get_line_profile_data()` | Get line-level profiling data           |
| `get_statistics()`        | Get statistical analysis                |
| `get_summary()`           | Get profiling summary                   |
| `report()`                | Generate console report                 |
| `export_json()`           | Export as JSON                          |
| `export_html()`           | Export as HTML                          |
| `export_to_file()`        | Save a report to a file (JSON or HTML)  |
| `export_flamegraph()`     | Export as SVG flamegraph                |

### Decorators

| Decorator            | Description            |
| -------------------- | ---------------------- |
| `@profile`           | Profile sync function  |
| `@profile_async`     | Profile async function |
| `@profile_api`       | Profile as API call    |
| `@profile_api_async` | Profile async API call |
| `@profile_detailed`  | Profile with cProfile  |
| `@profile_lines`     | Line-level profiling   |

### Data Classes

| Class             | Fields                                                 |
| ----------------- | ------------------------------------------------------ |
| `TimingRecord`    | name, duration\_ms, category, file, line               |
| `APICallRecord`   | endpoint, method, duration\_ms, status\_code           |
| `StreamingRecord` | name, ttft\_ms, total\_ms, chunk\_count, total\_tokens |
| `MemoryRecord`    | name, current\_kb, peak\_kb                            |
