Image Processing - PraisonAI Documentation

The --image flag enables image processing with vision-capable AI models.

Quick Start

praisonai "Describe this image" --image path/to/image.png

Usage

Basic Image Analysis

praisonai "What's in this photo?" --image photo.jpg

Expected Output:

🖼️ Processing image: photo.jpg

╭─ Agent Info ─────────────────────────────────────────────────────────────────╮
│  👤 Agent: ImageAgent                                                        │
│  Role: Vision Assistant                                                      │
╰──────────────────────────────────────────────────────────────────────────────╯

╭────────────────────────────────── Response ──────────────────────────────────╮
│ The image shows a golden retriever dog sitting on a grassy lawn. The dog     │
│ appears to be smiling with its tongue out. In the background, there's a      │
│ wooden fence and some trees. The lighting suggests it was taken during       │
│ late afternoon, creating a warm, golden atmosphere.                          │
╰──────────────────────────────────────────────────────────────────────────────╯

Specify Vision Model

# Use GPT-4o for vision
praisonai "Analyze this chart" --image chart.png --llm openai/gpt-4o

# Use Claude for vision
praisonai "Describe the scene" --image scene.jpg --llm anthropic/claude-3-sonnet-20240229

Combine with Other Features

# Image analysis with metrics
praisonai "Count objects" --image warehouse.jpg --metrics

# Image with guardrail
praisonai "Extract text from image" --image document.png --guardrail "Output as JSON"

# Image with save
praisonai "Describe artwork" --image painting.jpg --save

Supported Image Formats

Format	Extension	Support
JPEG	`.jpg`, `.jpeg`	✅ Full
PNG	`.png`	✅ Full
GIF	`.gif`	✅ Static frame
WebP	`.webp`	✅ Full
BMP	`.bmp`	✅ Full
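
The table above can be turned into a quick pre-flight check before calling the CLI. What follows is a minimal shell sketch, not part of PraisonAI: `is_supported_image` is a hypothetical helper, and it only inspects the file extension, not the actual file contents.

```shell
# Hypothetical helper (not part of PraisonAI): succeeds when the file
# extension matches one of the formats listed in the table above.
is_supported_image() {
  # Lowercase the name so that .JPG and .jpg are treated alike.
  name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$name" in
    *.jpg|*.jpeg|*.png|*.gif|*.webp|*.bmp) return 0 ;;
    *) return 1 ;;
  esac
}

# Report the result for a couple of example file names.
for f in photo.JPG notes.txt; do
  if is_supported_image "$f"; then
    echo "$f: supported"
  else
    echo "$f: unsupported"
  fi
done
```

A check like this can guard a `praisonai ... --image` call in a script so that files with other extensions are skipped early.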

Use Cases

Document Analysis

praisonai "Extract all text from this document" --image invoice.png

Expected Output:

╭────────────────────────────────── Response ──────────────────────────────────╮
│ Invoice #: INV-2024-001                                                      │
│ Date: December 16, 2024                                                      │
│ Customer: Acme Corp                                                          │
│                                                                              │
│ Items:                                                                       │
│ - Widget A x 10 @ $25.00 = $250.00                                           │
│ - Widget B x 5 @ $40.00 = $200.00                                            │
│                                                                              │
│ Subtotal: $450.00                                                            │
│ Tax (10%): $45.00                                                            │
│ Total: $495.00                                                               │
╰──────────────────────────────────────────────────────────────────────────────╯

Chart/Graph Analysis

praisonai "Analyze trends in this chart and provide insights" --image sales_chart.png

Code Screenshot Analysis

praisonai "Review this code and identify bugs" --image code_screenshot.png

UI/UX Review

praisonai "Provide UX feedback for this interface" --image app_screenshot.png

Object Detection

praisonai "List all objects visible in this image with their positions" --image room.jpg

Expected Output:

╭────────────────────────────────── Response ──────────────────────────────────╮
│ Objects detected:                                                            │
│                                                                              │
│ 1. Sofa (center-left) - Gray fabric, 3-seater                                │
│ 2. Coffee table (center) - Wooden, rectangular                               │
│ 3. TV (right wall) - Mounted, approximately 55"                              │
│ 4. Plant (left corner) - Potted fern                                         │
│ 5. Lamp (right of sofa) - Floor lamp, brass finish                           │
│ 6. Rug (floor, center) - Patterned, blue and white                           │
│ 7. Books (on coffee table) - Stack of 3-4 books                              │
│ 8. Window (background) - Large, with curtains                                │
╰──────────────────────────────────────────────────────────────────────────────╯

Image Path Options

# Path relative to the current directory
praisonai "Describe" --image ./images/photo.jpg

# Absolute path
praisonai "Describe" --image /Users/name/photos/image.png

# Path relative to the parent directory
praisonai "Describe" --image ../screenshots/screen.png
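
Relative paths are normally resolved against the shell's current working directory, so the same command can point at different files depending on where it is run. As a hedged sketch, the hypothetical helper `resolve_image_path` (not part of PraisonAI) checks that a file exists and is readable, then prints its absolute path, which could be passed to `--image`.

```shell
# Hypothetical helper (not part of PraisonAI): print the absolute path of a
# readable regular file, or fail with a message on stderr.
resolve_image_path() {
  if [ ! -f "$1" ] || [ ! -r "$1" ]; then
    echo "not a readable file: $1" >&2
    return 1
  fi
  # Resolve the containing directory inside a command substitution so the
  # caller's working directory is left unchanged. CDPATH is cleared so that
  # cd does not search other directories or print anything extra.
  dir=$(CDPATH= cd -- "$(dirname -- "$1")" && pwd -P) || return 1
  printf '%s/%s\n' "$dir" "$(basename -- "$1")"
}
```

For example, `resolve_image_path ../screenshots/screen.png` prints the full path when the file exists and returns a non-zero status otherwise.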

Best Practices

For best results, use high-resolution images with clear content. Blurry or low-quality images may produce less accurate descriptions.

Image processing uses more tokens than text-only prompts. Use --metrics to monitor costs.

Image Quality

Use clear, well-lit images for best results

Specific Prompts

Be specific about what you want to analyze in the image

File Size

Large images are automatically resized; keeping originals under 20MB is recommended
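
The size recommendation above can be checked before an image is sent. This is a minimal sketch under stated assumptions: `image_under_limit` and `MAX_IMAGE_BYTES` are hypothetical names, not part of PraisonAI, and reading "20MB" as 20 * 1024 * 1024 bytes is an assumption.

```shell
# Assumption: the 20MB recommendation is read as 20 * 1024 * 1024 bytes.
MAX_IMAGE_BYTES=$((20 * 1024 * 1024))

# Hypothetical helper (not part of PraisonAI).
# Returns 0 if the file is smaller than MAX_IMAGE_BYTES,
# 1 if it is too large, and 2 if it is not a regular file.
image_under_limit() {
  [ -f "$1" ] || return 2
  # wc -c counts bytes; the arithmetic expansion strips any padding
  # that some wc implementations add around the number.
  size=$(( $(wc -c < "$1") ))
  [ "$size" -lt "$MAX_IMAGE_BYTES" ]
}
```

In a script, `image_under_limit painting.jpg || echo "consider resizing first"` would flag oversized originals before the CLI is invoked.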

Model Selection

Use GPT-4o or Claude 3 for complex image analysis
