VisionAgent
Defined in the Vision Agent module.AI Agent A specialized agent for image analysis and understanding. Provides:
- Image analysis and description
- Multi-image comparison
- Text extraction from images
- OpenAI:
gpt-4o,gpt-4o-mini,gpt-4-turbo - Anthropic:
claude-3-5-sonnet-20241022,claude-3-opus-20240229 - Google:
gemini/gemini-1.5-pro,gemini/gemini-1.5-flash
Constructor
No description available.
No description available.
No description available.
No description available.
No description available.
No description available.
No description available.
No description available.
Methods
console()
Lazily initialize Rich Console.
litellm()
Lazy load litellm module when needed.
analyze()
Analyze an image and return analysis.
describe()
Generate a detailed description of an image.
compare()
Compare multiple images.
extract_text()
Extract text from an image (OCR-like functionality).
aanalyze()
Async version of analyze().
adescribe()
Async version of describe().
acompare()
Async version of compare().
aextract_text()
Async version of extract_text().
Usage
Source
View on GitHub
praisonaiagents/agent/vision_agent.py at line 48
