ImageAgent
Overview
Basic Usage
Configuration Options
Core Parameters
Inherited Parameters
Supported Models
Advanced Features
Custom Styling
Async Image Generation
Integration with Other Agents
Error Handling
Best Practices
Limitations
See Also

ImageAgent

The ImageAgent class is a specialized agent designed for AI image generation tasks. It extends the base Agent class and provides seamless integration with various image generation models and providers.

Overview

ImageAgent simplifies the process of generating images using AI models by providing a unified interface for multiple image generation providers including OpenAI DALL-E, Stability AI, and other compatible models through the litellm library.

Basic Usage

from praisonaiagents import ImageAgent

# Create an image generation agent
artist = ImageAgent(
    role="Digital Artist",
    goal="Create stunning AI-generated artwork based on prompts",
    backstory="You are a creative AI artist capable of generating beautiful images.",
    llm="dall-e-3"  # or other supported models
)

# Generate an image
result = artist.chat("Create a serene landscape with mountains at sunset")
print(result)  # Returns the generated image URL or base64 data

Configuration Options

Core Parameters

llm (str): The image generation model to use (e.g., “dall-e-3”, “dall-e-2”, “stable-diffusion-v1-6-0”)
api_key (str, optional): API key for the image generation service
style (str, optional): Style of the generated image (default: “natural”). Note: This is passed to the underlying generation config
response_format (str, optional): Format for the response (“url” or “b64_json”, default: “url”)
timeout (int, optional): Timeout for image generation requests in seconds (default: 600)
api_version (str, optional): Optional API version (required for Azure dall-e-3)

Inherited Parameters

All parameters from the base Agent class are also available:

role, goal, backstory, instructions
tools, llm, max_iter, max_retry
verbose, cache, markdown

Supported Models

ImageAgent supports various image generation models through litellm:

OpenAI: dall-e-3, dall-e-2
Stability AI: stable-diffusion-v1-6-0, stable-diffusion-xl-1024-v1-0
Other providers: Any model supported by litellm’s image generation capabilities

Advanced Features

Custom Styling

# Create an agent with specific style preferences
artist = ImageAgent(
    role="Style-specific Artist",
    goal="Generate images in specific artistic styles",
    llm="dall-e-3",
    style="vivid",  # or "natural" for DALL-E 3
    response_format="url"
)

Async Image Generation

import asyncio

artist = ImageAgent(
    role="Async Artist",
    goal="Generate images asynchronously",
    llm="dall-e-3"
)

async def generate_images():
    # Generate multiple images concurrently
    tasks = [
        artist.achat("A futuristic city"),
        artist.achat("An underwater palace"),
        artist.achat("A magical forest")
    ]
    results = await asyncio.gather(*tasks)
    return results

# Run async generation
images = asyncio.run(generate_images())

Integration with Other Agents

from praisonaiagents import Agent, ImageAgent

# Create a writer agent
writer = Agent(
    role="Creative Writer",
    goal="Write creative descriptions for image generation",
    backstory="You excel at crafting detailed, imaginative prompts."
)

# Create an image agent
artist = ImageAgent(
    role="AI Artist",
    goal="Transform written descriptions into stunning visuals",
    llm="dall-e-3"
)

# Writer creates a prompt
prompt = writer.chat("Write a detailed description of a dreamlike fantasy landscape")

# Artist generates the image
image = artist.chat(prompt)

Error Handling

from praisonaiagents import ImageAgent

artist = ImageAgent(
    llm="dall-e-3",
    max_retry=3,  # Retry failed generations
    timeout=60    # 60 second timeout
)

try:
    result = artist.chat("Generate an image")
except Exception as e:
    print(f"Image generation failed: {e}")

Best Practices

Model Selection: Choose the appropriate model based on your needs:
- dall-e-3: Best quality and prompt adherence
- dall-e-2: Faster and more cost-effective
- Stable Diffusion variants: Open-source alternatives

Prompt Engineering: Provide detailed, descriptive prompts for better results:

# Good prompt
"A serene Japanese garden with cherry blossoms, koi pond, and traditional bridge at sunset"

# Less effective prompt
"A garden"

Response Format: Use URL format for web applications, base64 for direct processing:

# For web display
artist = ImageAgent(llm="dall-e-3", response_format="url")

# For image processing
artist = ImageAgent(llm="dall-e-3", response_format="b64_json")

API Key Management: Store API keys securely using environment variables:

import os

artist = ImageAgent(
    llm="dall-e-3",
    api_key=os.getenv("OPENAI_API_KEY")
)

Limitations

Image generation can be slow (10-30 seconds depending on the model)
Some models have content policy restrictions
Generation costs vary by model and provider
Output resolution depends on the model capabilities

Getting Started

Core Concepts

Workflows

Features

Models

Tools

Other Features

Monitoring

Developers

Configuration

Best Practices

Getting Started (No Code)

API Reference

ImageAgent

ImageAgent

Overview

Basic Usage

Configuration Options

Core Parameters

Inherited Parameters

Supported Models

Advanced Features

Custom Styling

Async Image Generation

Integration with Other Agents

Error Handling

Best Practices

Limitations

See Also

Getting Started

Core Concepts

Workflows

Features

Models

Tools

Other Features

Monitoring

Developers

Configuration

Best Practices

Getting Started (No Code)

API Reference

​ImageAgent

​Overview

​Basic Usage

​Configuration Options

​Core Parameters

​Inherited Parameters

​Supported Models

​Advanced Features

​Custom Styling

​Async Image Generation

​Integration with Other Agents

​Error Handling

​Best Practices

​Limitations

​See Also

ImageAgent

Overview

Basic Usage

Configuration Options

Core Parameters

Inherited Parameters

Supported Models

Advanced Features

Custom Styling

Async Image Generation

Integration with Other Agents

Error Handling

Best Practices

Limitations

See Also