Multi-Modal Agent CLI

Work with images, PDFs, and files from the command line.

Commands

Analyze Image

# Analyze image from URL
praisonai-ts image analyze https://example.com/image.jpg \
  --prompt "What do you see?"

# Analyze local image
praisonai-ts image analyze ./photo.png \
  --prompt "Describe this image in detail"

# With specific model
praisonai-ts image analyze ./image.jpg \
  --model gpt-4o \
  --prompt "What objects are in this image?"

Generate Image

# Generate image with DALL-E
praisonai-ts image generate "A sunset over mountains" \
  --model dall-e-3 \
  --size 1024x1024 \
  --output ./sunset.png

# With quality and style settings
praisonai-ts image generate "Futuristic city" \
  --quality hd \
  --style vivid

Process PDF

# Summarize PDF
praisonai-ts pdf summarize ./document.pdf

# Extract text
praisonai-ts pdf extract ./document.pdf --output text.txt

# Ask questions about PDF
praisonai-ts pdf query ./document.pdf \
  --prompt "What are the main findings?"

Options

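| Option | Type | Default | Description |
| --- | --- | --- | --- |
| --model | string | gpt-4o | Model to use |
| --prompt | string | - | Analysis prompt |
| --output | string | - | Output file path |
| --size | string | 1024x1024 | Image size |
| --quality | string | standard | Image quality |
| --json | boolean | false | JSON output |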

Examples

Batch Image Analysis

# Analyze multiple images
praisonai-ts image analyze ./images/*.jpg \
  --prompt "Categorize this image" \
  --output results.json \
  --json
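
If you would rather have one result file per image than a single results.json, the batch command above can be wrapped in a loop. The following is a minimal sketch, not part of the CLI: the function name analyze_each and the PRAISONAI_BIN override are hypothetical, and it assumes that --output combined with --json writes the JSON result to the given path, as the example above suggests.

```shell
# Sketch only: analyze_each and PRAISONAI_BIN are hypothetical names.
# The `image analyze` subcommand and its flags are taken from the docs above.
analyze_each() {
  out_dir="$1"
  shift
  # Allow the binary to be overridden (useful for dry runs and testing).
  bin="${PRAISONAI_BIN:-praisonai-ts}"
  mkdir -p "$out_dir" || return 1
  failed=0
  for img in "$@"; do
    # Skip unexpanded globs and paths that are not regular files.
    [ -f "$img" ] || continue
    name="$(basename "$img")"
    # One invocation per image; the output file is named after the image.
    if ! "$bin" image analyze "$img" \
        --prompt "Categorize this image" \
        --output "$out_dir/${name%.*}.json" \
        --json; then
      echo "analysis failed for $img" >&2
      failed=1
    fi
  done
  # Non-zero exit status if any single image failed.
  return "$failed"
}
```

Called as `analyze_each ./results ./images/*.jpg`, this keeps going after a failed image and reports the failure through the exit status, which may be easier to retry than a single all-or-nothing batch.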

Compare Images

# Compare two images
praisonai-ts image compare ./image1.jpg ./image2.jpg \
  --prompt "What are the differences?"

Interactive Vision Chat

# Start vision chat session
praisonai-ts chat --vision \
  --model gpt-4o \
  --instructions "You are a helpful image analyst"

Environment Variables

| Variable | Required | Description |
| --- | --- | --- |
| OPENAI_API_KEY | Yes | For GPT-4o and DALL-E |
| ANTHROPIC_API_KEY | For Claude | Claude vision |
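
A script that calls the CLI can check these variables up front so that a missing key fails early with a clear message. The sketch below is an illustration only: check_api_keys is a hypothetical helper, not something praisonai-ts provides, and it treats ANTHROPIC_API_KEY as optional because the table lists it as needed only for Claude.

```shell
# Sketch only: check_api_keys is a hypothetical helper, not part of the CLI.
check_api_keys() {
  # OPENAI_API_KEY is listed as required (GPT-4o and DALL-E).
  if [ -z "${OPENAI_API_KEY:-}" ]; then
    echo "OPENAI_API_KEY is not set" >&2
    return 1
  fi
  # ANTHROPIC_API_KEY is only needed for Claude vision, so warn instead of failing.
  if [ -z "${ANTHROPIC_API_KEY:-}" ]; then
    echo "ANTHROPIC_API_KEY is not set; needed only for Claude vision" >&2
  fi
  return 0
}
```

A typical use is `check_api_keys && praisonai-ts image analyze ./photo.png --prompt "Describe this image in detail"`.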

  • praisonai-ts image list-models - List vision models
  • praisonai-ts image history - View generation history