Set your OpenAI API key as an environment variable in your terminal:
export OPENAI_API_KEY=xxxxxxxxxxxxxxxxxxxxxx
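Before running anything, it can help to confirm that the key is actually visible to Python. The following is a minimal sketch using only the standard library; the helper name `require_api_key` is hypothetical and not part of PraisonAI.

```python
import os


def require_api_key(name: str = "OPENAI_API_KEY") -> str:
    """Return the value of the environment variable or raise a clear error.

    Hypothetical helper for illustration; not a PraisonAI function.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; run `export {name}=...` in this terminal first"
        )
    return value
```

Calling `require_api_key()` at the top of a script fails fast with a readable message instead of a later authentication error from the API client.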
3. Create a file
Create a new file app.py with the basic setup:
from praisonaiagents import Agent, Task, Agents

# Create Vision Analysis Agent
vision_agent = Agent(
    name="VisionAnalyst",
    role="Computer Vision Specialist",
    goal="Analyze images and videos to extract meaningful information",
    backstory="""You are an expert in computer vision and image analysis.
    You excel at describing images, detecting objects, and understanding visual content.""",
    llm="gpt-4o-mini",
    reflection=False
)

# Create tasks with different media types
task = Task(
    name="analyze_landmark",
    description="Describe this famous landmark and its architectural features.",
    expected_output="Detailed description of the landmark's architecture and significance",
    agent=vision_agent,
    images=["https://upload.wikimedia.org/wikipedia/commons/b/bf/Krakow_-_Kosciol_Mariacki.jpg"]
)

# Run the agents
agents = Agents(
    agents=[vision_agent],
    tasks=[task],
    process="sequential",
)

agents.start()
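The `images` list in the example above holds a URL, and a later example on this page passes a local file name. As a sketch of a pre-flight check, the hypothetical helper below (not part of PraisonAI) labels each entry so that a misspelled local path is caught before the agents start. It does not test whether a URL is reachable.

```python
from pathlib import Path
from urllib.parse import urlparse


def check_image_sources(sources: list[str]) -> dict[str, str]:
    """Label each entry as 'url', 'file', or 'missing' (hypothetical helper)."""
    report = {}
    for src in sources:
        scheme = urlparse(src).scheme.lower()
        if scheme in ("http", "https"):
            report[src] = "url"      # remote image; reachability is not checked
        elif Path(src).is_file():
            report[src] = "file"     # local file that exists
        else:
            report[src] = "missing"  # neither a web URL nor an existing file
    return report
```

A caller could refuse to build the `Task` when any value in the returned dictionary is `"missing"`.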
4. Start Agents
Type this in your terminal to run your agents:
python app.py
1. Install Package
Install the PraisonAI package:
pip install praisonai opencv-python moviepy
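To confirm that the install succeeded, one option is to check that each package can be located without importing it. This is a small sketch using the standard library; the helper name `missing_modules` is made up for illustration. Note that the pip name and the import name can differ: `opencv-python` is imported as `cv2`.

```python
from importlib.util import find_spec

# Import names for the packages installed above (opencv-python -> cv2).
REQUIRED_MODULES = ("praisonai", "cv2", "moviepy")


def missing_modules(names=REQUIRED_MODULES) -> list[str]:
    """Return the module names that cannot be found on the current Python path."""
    return [name for name in names if find_spec(name) is None]
```

An empty list from `missing_modules()` suggests that all three packages are visible to the interpreter that will run the agents.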
2. Set API Key
Set your OpenAI API key as an environment variable in your terminal:
export OPENAI_API_KEY=xxxxxxxxxxxxxxxxxxxxxx
3. Create a file
Create a new file agents.yaml with the basic setup:
framework: praisonai
process: sequential
topic: analyze landmark image
agents:  # Canonical: use 'agents' instead of 'roles'
  vision_analyst:
    name: VisionAnalyst
    role: Computer Vision Specialist
    goal: Analyze images and videos to extract meaningful information
    # Canonical: use 'instructions' instead of 'backstory'
    instructions: |
      You are an expert in computer vision and image analysis.
      You excel at describing images, detecting objects, and understanding visual content.
    llm: gpt-4o-mini
    self_reflect: false
    tasks:
      analyze_landmark:
        description: Describe this famous landmark and its architectural features.
        expected_output: Detailed description of the landmark's architecture and significance
        images:
          - https://upload.wikimedia.org/wikipedia/commons/b/bf/Krakow_-_Kosciol_Mariacki.jpg
from praisonaiagents import Agent, Task, Agents

# Create first agent for image analysis
vision_agent = Agent(
    role="Image Analyst",
    goal="Analyze visual content and extract key information",
    backstory="Expert in visual analysis and image understanding",
    llm="gpt-4o-mini",
    reflection=False
)

# Create second agent for content writing
writer_agent = Agent(
    role="Content Writer",
    goal="Create engaging content based on image analysis",
    backstory="Expert in creating compelling content from visual insights",
    llm="gpt-4o-mini"
)

# Create tasks for different media types
document_task = Task(
    description="Extract and summarize text from this document image",
    expected_output="Structured text content with key information highlighted",
    agent=vision_agent,
    images=["document.jpg"]
)

writing_task = Task(
    description="Create engaging content based on image analysis",
    expected_output="Compelling article incorporating visual insights",
    agent=writer_agent
)

# Create and start the agents
agents = Agents(
    agents=[vision_agent, writer_agent],
    tasks=[document_task, writing_task],
    process="sequential"
)

result = agents.start()
framework: praisonai
process: sequential
topic: document analysis and content creation
agents:  # Canonical
  vision_analyst:
    role: Image Analyst
    goal: Analyze visual content and extract key information
    # Canonical: use 'instructions' instead of 'backstory'
    instructions: Expert in visual analysis and image understanding
    llm: gpt-4o-mini
    self_reflect: false
    tasks:
      document_task:
        description: Extract and summarize text from this document image
        expected_output: Structured text content with key information highlighted
        images:
          - document.jpg
  content_writer:
    role: Content Writer
    goal: Create engaging content based on image analysis
    # Canonical: use 'instructions' instead of 'backstory'
    instructions: Expert in creating compelling content from visual insights
    llm: gpt-4o-mini
    tasks:
      writing_task:
        description: Create engaging content based on image analysis
        expected_output: Compelling article incorporating visual insights
Send images to the agent for analysis without storing them in chat history. This is essential for preventing context window overflow when processing multiple images.
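To get a feel for why stored image data adds up, here is a rough back-of-the-envelope sketch. It assumes local images are inlined as base64 text and carried along with the history on every turn; how a given provider actually counts image input is a separate matter and is not modelled here. The photo size and count are made-up figures.

```python
import base64


def base64_chars(raw_bytes: int) -> int:
    """Length of the base64 text for a payload of raw_bytes bytes (with padding)."""
    return 4 * ((raw_bytes + 2) // 3)


# Cross-check the formula against the real encoder on a small payload.
assert base64_chars(3000) == len(base64.b64encode(b"\x00" * 3000))

# Ten hypothetical 500 KB photos kept in history as inline base64.
photo_bytes = 500 * 1024
total_chars = 10 * base64_chars(photo_bytes)
print(f"{total_chars:,} characters of image data carried in history")
```

Base64 inflates the raw size by about one third, so even a handful of photos can dwarf the text of the conversation itself.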
Attachments Parameter
from praisonaiagents import Agent

agent = Agent(
    instructions="You analyze images and remember context",
    memory=True
)

# Image is analyzed but NOT stored in history
response = agent.chat(
    prompt="What's in this image?",  # ← Stored in history
    attachments=["photo.jpg"],       # ← NOT stored (ephemeral)
)

# Agent remembers the question, not the image data
response = agent.chat("What did I ask about earlier?")
# Agent: "You asked 'What's in this image?' and I told you..."

Ephemeral Context Manager
from praisonaiagents import Agent

agent = Agent(instructions="Analyze images", memory=True)

# Pre-image conversation
agent.chat("Hello, I have some photos to show you")

# Ephemeral block - nothing stored permanently
with agent.ephemeral():
    response = agent.chat(
        "Analyze this",
        attachments=["image1.jpg"]
    )
    followup = agent.chat("What about the colors?")

# After block, history is restored - images NOT persisted
agent.chat("What have we discussed?")  # Only remembers pre-image chat
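The snapshot-and-restore behaviour described for `ephemeral()` can be pictured with a toy model. The class below is an illustrative stand-in written with `contextlib`, not PraisonAI's implementation; `ToyChat` and its plain list of strings are assumptions made for the sketch.

```python
from contextlib import contextmanager


class ToyChat:
    """Illustrative stand-in for an agent's chat history; not PraisonAI code."""

    def __init__(self):
        self.history: list[str] = []

    def chat(self, prompt: str) -> None:
        # A real agent would call a model; here we only record the prompt.
        self.history.append(prompt)

    @contextmanager
    def ephemeral(self):
        snapshot = list(self.history)  # copy, so later appends do not leak in
        try:
            yield self
        finally:
            self.history = snapshot    # restore even if the block raises


chat = ToyChat()
chat.chat("Hello, I have some photos to show you")
with chat.ephemeral():
    chat.chat("Analyze this")
    chat.chat("What about the colors?")
print(chat.history)  # ['Hello, I have some photos to show you']
```

Inside the block both turns see each other, which is what makes multi-turn follow-ups possible; once the block exits, only the pre-block history remains.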
Clean up chat history after image analysis sessions:
from praisonaiagents import Agent

agent = Agent(instructions="Image analyst", memory=True)

# After image analysis, clean up history
agent.prune_history(keep_last=5)           # Keep only last 5 messages
agent.delete_history(-1)                   # Delete last message
agent.delete_history_matching("[IMAGE]")   # Delete all image-related messages

# Check history size
print(f"History size: {agent.get_history_size()}")
| Method | Description |
| --- | --- |
| prune_history(keep_last=N) | Keep only last N messages |
| delete_history(index) | Delete message by index |
| delete_history_matching(pattern) | Delete messages containing pattern |
| get_history_size() | Get current history length |
| ephemeral() | Context manager for temporary conversations |
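The effect of the first three methods can be sketched on a plain list of strings. These are standalone functions that borrow the method names for readability; they are not the library's code, and treating `pattern` as a simple substring is an assumption.

```python
def prune_history(history: list[str], keep_last: int) -> list[str]:
    """Keep only the last keep_last messages."""
    return history[-keep_last:] if keep_last > 0 else []


def delete_history(history: list[str], index: int) -> list[str]:
    """Drop the message at index; negative indexes count from the end."""
    result = list(history)
    del result[index]
    return result


def delete_history_matching(history: list[str], pattern: str) -> list[str]:
    """Drop every message that contains pattern as a substring (assumption)."""
    return [msg for msg in history if pattern not in msg]


history = ["hi", "[IMAGE] photo.jpg", "describe it", "a cat", "thanks", "bye"]
history = prune_history(history, keep_last=5)         # drops "hi"
history = delete_history(history, -1)                 # drops "bye"
history = delete_history_matching(history, "[IMAGE]") # drops the image entry
print(history, len(history))
```

Each function returns a new list rather than mutating its input, which keeps the sketch easy to reason about; the real methods act on the agent's own history.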
Use attachments= for one-time image analysis, or ephemeral() for multi-turn image conversations that shouldn’t persist.