The --image flag attaches an image to a prompt so that a vision-capable AI model can analyze it.
Quick Start
praisonai "Describe this image" --image path/to/image.png
Usage
Basic Image Analysis
praisonai "What's in this photo?" --image photo.jpg
Expected Output:
🖼️ Processing image: photo.jpg
╭─ Agent Info ─────────────────────────────────────────────────────────────────╮
│ 👤 Agent: ImageAgent │
│ Role: Vision Assistant │
╰──────────────────────────────────────────────────────────────────────────────╯
╭────────────────────────────────── Response ──────────────────────────────────╮
│ The image shows a golden retriever dog sitting on a grassy lawn. The dog │
│ appears to be smiling with its tongue out. In the background, there's a │
│ wooden fence and some trees. The lighting suggests it was taken during │
│ late afternoon, creating a warm, golden atmosphere. │
╰──────────────────────────────────────────────────────────────────────────────╯
Specify Vision Model
# Use GPT-4o for vision
praisonai "Analyze this chart" --image chart.png --llm openai/gpt-4o
# Use Claude for vision
praisonai "Describe the scene" --image scene.jpg --llm anthropic/claude-3-sonnet-20240229
Combine with Other Features
# Image analysis with metrics
praisonai "Count objects" --image warehouse.jpg --metrics
# Image with guardrail
praisonai "Extract text from image" --image document.png --guardrail "Output as JSON"
# Image with save
praisonai "Describe artwork" --image painting.jpg --save
Supported Formats
| Format | Extension | Support |
|---|---|---|
| JPEG | .jpg, .jpeg | ✅ Full |
| PNG | .png | ✅ Full |
| GIF | .gif | ✅ Static frame |
| WebP | .webp | ✅ Full |
| BMP | .bmp | ✅ Full |
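The table above can be turned into a quick pre-flight check before a file is sent to a model. The following is a minimal sketch, not part of praisonai itself: `is_supported_image` is a hypothetical helper name, and matching extensions case-insensitively is an assumption, since the table lists lowercase extensions only.

```shell
# Hypothetical helper (not part of praisonai): succeeds when the file
# extension is one of those listed in the table above.
is_supported_image() {
  # A name without a dot has no extension to check.
  case "$1" in
    *.*) ;;
    *) return 1 ;;
  esac
  # Lowercase the extension so "SCAN.PNG" is treated like "scan.png"
  # (case-insensitive matching is an assumption).
  ext=$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')
  case "$ext" in
    jpg|jpeg|png|gif|webp|bmp) return 0 ;;
    *) return 1 ;;
  esac
}
```

One way to use it is to guard the call, for example `is_supported_image photo.jpg && praisonai "What's in this photo?" --image photo.jpg`. The check looks only at the file name, not at the file contents.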
Use Cases
Document Analysis
praisonai "Extract all text from this document" --image invoice.png
Expected Output:
╭────────────────────────────────── Response ──────────────────────────────────╮
│ Invoice #: INV-2024-001 │
│ Date: December 16, 2024 │
│ Customer: Acme Corp │
│ │
│ Items: │
│ - Widget A x 10 @ $25.00 = $250.00 │
│ - Widget B x 5 @ $40.00 = $200.00 │
│ │
│ Subtotal: $450.00 │
│ Tax (10%): $45.00 │
│ Total: $495.00 │
╰──────────────────────────────────────────────────────────────────────────────╯
Chart/Graph Analysis
praisonai "Analyze trends in this chart and provide insights" --image sales_chart.png
Code Screenshot Analysis
praisonai "Review this code and identify bugs" --image code_screenshot.png
UI/UX Review
praisonai "Provide UX feedback for this interface" --image app_screenshot.png
Object Detection
praisonai "List all objects visible in this image with their positions" --image room.jpg
Expected Output:
╭────────────────────────────────── Response ──────────────────────────────────╮
│ Objects detected: │
│ │
│ 1. Sofa (center-left) - Gray fabric, 3-seater │
│ 2. Coffee table (center) - Wooden, rectangular │
│ 3. TV (right wall) - Mounted, approximately 55" │
│ 4. Plant (left corner) - Potted fern │
│ 5. Lamp (right of sofa) - Floor lamp, brass finish │
│ 6. Rug (floor, center) - Patterned, blue and white │
│ 7. Books (on coffee table) - Stack of 3-4 books │
│ 8. Window (background) - Large, with curtains │
╰──────────────────────────────────────────────────────────────────────────────╯
Image Path Options
# Path relative to the current directory
praisonai "Describe" --image ./images/photo.jpg
# Absolute path
praisonai "Describe" --image /Users/name/photos/image.png
# Path relative to the parent directory
praisonai "Describe" --image ../screenshots/screen.png
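Every example here passes a single path to --image, so running one prompt over a whole folder is a job for the shell. The following is a minimal sketch under that assumption: `describe_all` is a hypothetical helper name, and it covers only .jpg and .png files for brevity.

```shell
# Hypothetical helper (not part of praisonai): run one prompt against every
# .jpg and .png file in a directory.
# Assumes one --image path per invocation, as in the examples above.
describe_all() {
  dir=$1
  prompt=$2
  for img in "$dir"/*.jpg "$dir"/*.png; do
    # An unmatched glob stays literal, so skip anything that is not a file.
    [ -f "$img" ] || continue
    # Quote the path so names containing spaces arrive as one argument.
    praisonai "$prompt" --image "$img"
  done
}
```

Calling `describe_all ./images "Describe"` would then invoke praisonai once per matching file. Quoting `"$img"` is what keeps a path such as `./images/team photo.jpg` intact.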
Best Practices
For best results, use high-resolution images with clear content. Blurry or low-quality images may produce less accurate descriptions.
Image processing uses more tokens than text-only prompts. Use --metrics to monitor costs.
Image Quality
Use clear, well-lit images for best results
Specific Prompts
Be specific about what you want to analyze in the image
File Size
Large images are automatically resized; keeping originals under 20MB is recommended.
Model Selection
Use GPT-4o or Claude 3 for complex image analysis
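The file size recommendation above can be checked before a run. The following is a minimal sketch, not part of praisonai: `image_size_ok` and `MAX_BYTES` are hypothetical names, and reading "20MB" as 20 × 1024 × 1024 bytes is an assumption.

```shell
# Assumption: "20MB" is read as 20 * 1024 * 1024 bytes.
MAX_BYTES=$((20 * 1024 * 1024))

# Hypothetical helper (not part of praisonai).
# Returns 0 if the file is within the limit, 1 if it is larger,
# and 2 if the path is not a regular file.
image_size_ok() {
  [ -f "$1" ] || return 2
  # wc -c prints the byte count; the arithmetic expansion strips any padding.
  size=$(($(wc -c < "$1")))
  [ "$size" -le "$MAX_BYTES" ]
}
```

A guard such as `image_size_ok painting.jpg || echo "painting.jpg is missing or larger than recommended"` keeps oversized originals from being sent by accident.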