# Multi-Agent Media Pipelines

> Chain specialized agents together for complex media processing workflows

Build media processing workflows by chaining specialized agents (AudioAgent, VideoAgent, ImageAgent, OCRAgent) with standard agents. Each step can reference the output of the step before it through the `{{previous_output}}` placeholder.

## Overview

Multi-agent pipelines allow you to:

* Chain different agent types in sequence
* Pass context between agents automatically
* Process media through multiple transformation stages
* Combine AI capabilities (transcription → research → generation)

## Example: 5-Agent Media Pipeline

This example demonstrates a complete pipeline: **STT → Research → Image → Video → TTS**

```yaml
name: Media Pipeline
description: Complete media pipeline from audio to video
process: sequential

agents:
  # Agent 1: Speech-to-Text
  transcriber:
    agent: AudioAgent
    llm: openai/whisper-1
    role: Audio Transcriber
    goal: Convert audio to text

  # Agent 2: Research (standard Agent with tools)
  researcher:
    role: Research Specialist
    goal: Research the topic
    tools:
      - tavily_search

  # Agent 3: Image Generation
  image_creator:
    agent: ImageAgent
    llm: openai/dall-e-3
    role: Visual Artist
    goal: Create images

  # Agent 4: Video Generation
  video_creator:
    agent: VideoAgent
    llm: openai/sora-2
    role: Video Producer
    goal: Create videos

  # Agent 5: Text-to-Speech (Voiceover)
  narrator:
    agent: AudioAgent
    llm: openai/tts-1-hd
    role: Voice Narrator
    goal: Create voiceovers

steps:
  - agent: transcriber
    action: transcribe
    input: "{{audio_file}}"

  - agent: researcher
    action: "Research based on: {{previous_output}}"

  - agent: image_creator
    action: generate
    prompt: "{{previous_output}}"

  - agent: video_creator
    action: generate
    prompt: "{{previous_output}}"

  - agent: narrator
    action: speech
    text: "{{previous_output}}"
    output: "voiceover.mp3"

variables:
  audio_file: input.mp3
```

## Context Passing

Use `{{previous_output}}` to pass the output from one agent to the next:

```yaml
steps:
  - agent: transcriber
    action: transcribe
    input: "audio.mp3"
  
  # The transcription text is available as {{previous_output}}
  - agent: researcher
    action: "Research this topic: {{previous_output}}"
  
  # The research summary is now {{previous_output}}
  - agent: artist
    action: generate
    prompt: "Create an image for: {{previous_output}}"
```
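
To make the substitution concrete, here is a minimal sketch in plain Python of how `{{previous_output}}` threading could work in a sequential pipeline. It is not the library's implementation: `render`, `run_sequential`, and the lambda handlers are hypothetical names used only for illustration, and real agents would call models where the lambdas are.

```python
import re

# Matches {{name}} placeholders, allowing optional spaces inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, context: dict) -> str:
    """Replace each {{name}} with context[name]; unknown names are left as-is."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        return str(context.get(name, match.group(0)))
    return PLACEHOLDER.sub(substitute, template)


def run_sequential(steps, variables):
    """Run (template, handler) pairs in order, threading each output forward."""
    context = dict(variables)
    output = ""
    for template, handler in steps:
        # The previous step's result becomes visible to the next template.
        context["previous_output"] = output
        output = handler(render(template, context))
    return output


# Stand-in handlers (hypothetical); real agents would do the work here.
steps = [
    ("{{audio_file}}", lambda text: f"transcript of {text}"),
    ("Research this topic: {{previous_output}}", lambda text: text.upper()),
]
final = run_sequential(steps, {"audio_file": "audio.mp3"})
print(final)
```

Each step only sees the immediately preceding output, which is why the order of steps determines what `{{previous_output}}` contains.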

## Mixed Agent Types

Combine specialized agents with standard agents:

```yaml
agents:
  # Specialized agent for transcription
  transcriber:
    agent: AudioAgent
    llm: openai/whisper-1
    role: Transcriber
    goal: Transcribe audio

  # Standard agent for analysis
  analyzer:
    role: Content Analyst
    goal: Analyze and summarize content
    instructions: You analyze content and provide insights.

  # Specialized agent for image generation
  visualizer:
    agent: ImageAgent
    llm: openai/dall-e-3
    role: Visualizer
    goal: Create visual representations

steps:
  - agent: transcriber
    action: transcribe
    input: "meeting.mp3"
  
  - agent: analyzer
    action: "Analyze this transcript and identify key themes: {{previous_output}}"
  
  - agent: visualizer
    action: generate
    prompt: "Create an infographic showing: {{previous_output}}"
```
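
In the configuration above, agents that omit the `agent:` key are treated as standard agents. The sketch below illustrates that defaulting rule on a plain dictionary. It assumes the default type is named `Agent`, as in the recipes table; `resolve_agent_types` is a hypothetical helper, not part of the library.

```python
def resolve_agent_types(agents: dict) -> dict:
    """Map each agent name to its declared type, defaulting to "Agent"."""
    return {name: config.get("agent", "Agent") for name, config in agents.items()}


# Mirrors the YAML above, reduced to the keys that matter here.
agents = {
    "transcriber": {"agent": "AudioAgent", "llm": "openai/whisper-1"},
    "analyzer": {"role": "Content Analyst"},
    "visualizer": {"agent": "ImageAgent", "llm": "openai/dall-e-3"},
}

types = resolve_agent_types(agents)
for name, agent_type in types.items():
    print(f"{name}: {agent_type}")
```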

## CLI Usage

Run the multi-agent pipeline recipe:

```bash
# Run the complete media pipeline
praisonai recipe run ai-media-pipeline --var audio_file=input.mp3

# With custom output directory
praisonai recipe run ai-media-pipeline --var audio_file=podcast.mp3 --var output_dir=./output
```
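
When the recipe needs to be launched from a script, the same command line can be assembled in Python. This is a small sketch that only builds the argument list from the commands shown above; `build_recipe_command` is a hypothetical helper, and it assumes every variable is passed as a separate `--var key=value` pair.

```python
import shlex


def build_recipe_command(recipe: str, variables: dict) -> list:
    """Build the argv list for `praisonai recipe run` with --var pairs."""
    command = ["praisonai", "recipe", "run", recipe]
    for key, value in variables.items():
        command += ["--var", f"{key}={value}"]
    return command


command = build_recipe_command(
    "ai-media-pipeline",
    {"audio_file": "podcast.mp3", "output_dir": "./output"},
)
# Print a shell-safe rendering for inspection before running anything.
print(shlex.join(command))
# To execute it: subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems when file names contain spaces.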

## Python API

Create multi-agent pipelines programmatically:

```python
from praisonaiagents.workflows.yaml_parser import YAMLWorkflowParser

yaml_content = """
name: Custom Pipeline
process: sequential

agents:
  transcriber:
    agent: AudioAgent
    llm: openai/whisper-1
    role: Transcriber
    goal: Transcribe audio
  
  summarizer:
    role: Summarizer
    goal: Summarize content

steps:
  - agent: transcriber
    action: transcribe
    input: "{{audio_file}}"
  
  - agent: summarizer
    action: "Summarize: {{previous_output}}"

variables:
  audio_file: recording.mp3
"""

parser = YAMLWorkflowParser()
workflow = parser.parse_string(yaml_content)

# Check agent types (`_agents` is a private attribute, so it may change between versions)
for name, agent in parser._agents.items():
    print(f"{name}: {agent.__class__.__name__}")

# Run the workflow
result = workflow.start()
```

## Available Recipes

| Recipe              | Description                 | Agents                                         |
| ------------------- | --------------------------- | ---------------------------------------------- |
| `ai-text-to-speech` | Convert text to speech      | AudioAgent                                     |
| `ai-speech-to-text` | Transcribe audio            | AudioAgent                                     |
| `ai-generate-image` | Generate images             | ImageAgent                                     |
| `ai-generate-video` | Generate videos             | VideoAgent                                     |
| `ai-document-ocr`   | Extract text from documents | OCRAgent                                       |
| `ai-media-pipeline` | Complete 5-agent pipeline   | AudioAgent (×2), Agent, ImageAgent, VideoAgent |

## Best Practices

1. **Order matters** - Place agents in logical sequence (input → processing → output)
2. **Use appropriate models** - Match model capabilities to task requirements
3. **Handle file outputs** - Ensure output paths are specified for media files
4. **Test incrementally** - Test each agent individually before combining
5. **Monitor context size** - Large outputs may need summarization between steps
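
As an illustration of the last point, the sketch below caps the size of a step's output before it is handed to the next step. Plain truncation is used here as a stand-in for summarization; a real pipeline might insert a summarizer agent instead. `clip_context` and the 4000-character default are assumptions for the example, not library settings.

```python
def clip_context(text: str, max_chars: int = 4000) -> str:
    """Return text unchanged if short enough, else cut it at a word boundary."""
    if len(text) <= max_chars:
        return text
    clipped = text[:max_chars]
    # Prefer cutting at the last space so a word is not split in half.
    cut = clipped.rfind(" ")
    if cut > 0:
        clipped = clipped[:cut]
    return clipped + " [truncated]"


long_transcript = "word " * 2000  # 10,000 characters
safe_input = clip_context(long_transcript)
print(len(long_transcript), len(safe_input))
```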

## Error Handling

Add error handling to individual steps with `max_retries` and `guardrail`:

```yaml
steps:
  - agent: transcriber
    action: transcribe
    input: "{{audio_file}}"
    max_retries: 3
  
  - agent: researcher
    action: "Research: {{previous_output}}"
    guardrail: validate_research_output
```
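
The general idea behind these two keys can be sketched in plain Python: a guardrail inspects a step's output, and a failed check or an exception triggers another attempt until the retry budget is used up. This is an illustration under assumptions, not the library's behavior. The `(passed, result_or_reason)` return shape, the 20-character threshold, and `run_with_retries` are all hypothetical, and the sketch combines retries and a guardrail on one step for brevity.

```python
def validate_research_output(text: str):
    """Hypothetical guardrail: accept only non-trivial output."""
    if len(text.strip()) < 20:
        return False, "Research output is too short"
    return True, text


def run_with_retries(step, guardrail, max_retries: int = 3):
    """Call step() until the guardrail passes or retries are exhausted."""
    last_error = None
    for _ in range(max_retries):
        try:
            ok, result = guardrail(step())
        except Exception as exc:
            # Treat an exception like a failed attempt and try again.
            last_error = str(exc)
            continue
        if ok:
            return result
        last_error = result
    raise RuntimeError(f"Step failed after {max_retries} attempts: {last_error}")


# Stand-in step: the first attempt is rejected, the second passes.
attempts = iter(["too short", "A sufficiently detailed research summary."])
result = run_with_retries(lambda: next(attempts), validate_research_output)
print(result)
```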

## Related

* [Specialized Agents](/docs/features/specialized-agents) - Individual agent type documentation
* [Workflow Patterns](/docs/features/workflows) - General workflow patterns
* [YAML Workflows](/docs/features/yaml-workflows) - YAML workflow syntax
* [Context Passing](/docs/concepts/context) - How context works between agents
