A workflow optimization pattern where agents handle repetitive tasks through automated loops, processing multiple instances efficiently while maintaining consistency.

Quick Start

1. Install Package

First, install the PraisonAI Agents package:
pip install praisonaiagents
2. Set API Key

Set your OpenAI API key as an environment variable in your terminal:
export OPENAI_API_KEY=your_api_key_here
3. Create a File

Create a new file repetitive_agent.py with the basic setup:
from praisonaiagents import Agent, Workflow
from praisonaiagents import loop

# Create processor agent
processor = Agent(
    name="Processor",
    role="Task Processor",
    goal="Process each task item thoroughly",
    instructions="Process the given task. Provide a detailed response for each item."
)

# Create summarizer agent
summarizer = Agent(
    name="Summarizer",
    role="Results Summarizer",
    goal="Summarize all processed results",
    instructions="Summarize all the processed results into a final report."
)

# Create workflow with loop - processor handles each item
workflow = Workflow(
    steps=[
        loop(processor, over="topics"),  # Agent processes each topic
        summarizer  # Summarize all results
    ],
    variables={"topics": ["AI Ethics", "Machine Learning", "Neural Networks"]}
)

result = workflow.start("Research and summarize these AI topics")
print(f"Final Summary: {result['output'][:500]}...")
4. Start Workflow

Type this in your terminal to run your workflow:
python repetitive_agent.py
Requirements
  • Python 3.10 or higher
  • OpenAI API key (generate one in the OpenAI dashboard; other model providers are also supported)

Understanding Repetitive Agents

What are Repetitive Agents?

Repetitive agents enable:
  • Automated task loops
  • Batch processing
  • Consistent task execution
  • Efficient handling of multiple similar tasks

Features

Task Looping

Process multiple tasks through automated loops.

Batch Processing

Handle multiple similar tasks efficiently.

Input Management

Process tasks from structured input files.

Progress Tracking

Monitor task completion and progress.

Loop Tasks with File Processing

Loop tasks can automatically process CSV and text files to create dynamic subtasks, enabling batch processing of data without manual task creation.

How File Processing Works

When a loop task has an input_file parameter:
  1. The system reads the CSV file using Python’s csv.reader
  2. Each row becomes a separate subtask
  3. Subtasks inherit properties from the parent loop task
  4. Tasks are executed sequentially with proper context passing (the sketch below illustrates the row-to-item mapping)
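
Conceptually, the row-to-item mapping works like this standalone sketch. It uses only the standard library and is an illustration of the idea, not the package's actual implementation; rows_to_items is a hypothetical helper:

import csv

def rows_to_items(path):
    # Read a CSV and return one dict per data row, mirroring how
    # each row becomes the "item" a subtask sees in its context.
    with open(path, newline="", encoding="utf-8") as f:
        return [dict(row) for row in csv.DictReader(f)]

for item in rows_to_items("customers.csv"):
    print(item)  # e.g. {'name': 'John', 'issue': 'Billing problem'}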

CSV File Format

Loop tasks support multiple CSV formats:

Simple Format

name,issue
John,Billing problem
Jane,Technical issue
Sarah,Password reset

Q&A Format

question,answer
What is 2+2?,4
What is the capital of France?,Paris

Single Column Format

task
Analyze customer feedback
Process refund request
Update user profile

Complete Example: Customer Support System

from praisonaiagents import Workflow, WorkflowContext, StepResult, Agent
from praisonaiagents import loop

# Create a CSV file with customer issues
with open("customers.csv", "w") as f:
    f.write("name,issue\n")
    f.write("John,Billing problem with subscription\n") 
    f.write("Jane,Technical issue with login\n")
    f.write("Sarah,Request for feature enhancement\n")

# Create specialized support agent
support_agent = Agent(
    name="Support Agent",
    role="Customer support specialist",
    goal="Resolve customer issues efficiently",
    llm="gpt-4o-mini"
)

# Process each customer using the agent
def handle_customer(ctx: WorkflowContext) -> StepResult:
    row = ctx.variables.get("item", {})
    name = row.get("name", "unknown")
    issue = row.get("issue", "unknown")
    
    # Use agent to handle the issue
    response = support_agent.chat(f"Help {name} with: {issue}")
    return StepResult(output=f"{name}: {response}")

# Create workflow with loop over CSV
workflow = Workflow(
    steps=[loop(handle_customer, from_csv="customers.csv")]
)

# Start processing
result = workflow.start("Process all customer issues")

# Print results
print("Customer Support Results:")
for output in result["variables"].get("loop_outputs", []):
    print(f"  {output}")

Processing Text Files

Loop workflows can also process text files line by line:
from praisonaiagents import Workflow, WorkflowContext, StepResult, Agent
from praisonaiagents import loop

# Create a text file with URLs
with open("urls.txt", "w") as f:
    f.write("https://example.com\n")
    f.write("https://test.com\n")
    f.write("https://demo.com\n")

# Create URL analyzer agent
url_agent = Agent(
    name="URL Analyzer",
    role="Website analyzer",
    goal="Analyze websites for SEO and performance"
)

# Process each URL
def analyze_url(ctx: WorkflowContext) -> StepResult:
    url = ctx.variables.get("item", "")
    result = url_agent.chat(f"Analyze this website: {url}")
    return StepResult(output=f"{url}: {result}")

# Create workflow with loop
workflow = Workflow(
    steps=[loop(analyze_url, from_file="urls.txt")]
)

result = workflow.start("Analyze all URLs")

Advanced Features

Subtask Inheritance

Subtasks automatically inherit from the parent loop task (a sketch follows the list):
  • Agent assignment
  • Expected output format
  • Callbacks and hooks
  • Task configuration
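
For instance, in the Task-based API used elsewhere on this page, every subtask generated from a row of data.csv would reuse the parent's agent and expected output. A minimal sketch, assuming an Agent named analyst:

from praisonaiagents import Agent, Task

analyst = Agent(
    name="Analyst",
    role="Data analyst",
    goal="Analyze each data entry"
)

# Each subtask spawned from a row of data.csv inherits this agent
# and expected_output from the parent loop task.
parent_loop = Task(
    name="analyze_rows",
    description="Analyze each data entry",
    expected_output="One analysis per row",
    agent=analyst,
    task_type="loop",
    input_file="data.csv"
)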

Context Passing

Each subtask receives:
  • The specific row data
  • Parent task context
  • Previous subtask results (when sequential) — see the sketch below
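
A minimal step-function sketch showing how that context can be read. The variable names follow the examples above; treating loop_outputs as visible mid-loop is an assumption:

from praisonaiagents import WorkflowContext, StepResult

def process_row(ctx: WorkflowContext) -> StepResult:
    row = ctx.variables.get("item", {})               # the current row's data
    earlier = ctx.variables.get("loop_outputs", [])   # prior results (assumed visible mid-loop)
    return StepResult(output=f"{row} (after {len(earlier)} earlier items)")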

Error Handling

# Loop tasks handle errors gracefully
from praisonaiagents import Task

loop_task = Task(
    name="Process data",
    description="Process each data entry",
    expected_output="Processed result",
    agent=agent,  # any previously defined Agent
    task_type="loop",
    input_file="data.csv",
    on_failure="continue"  # Continue processing even if one row fails
)

Best Practices

File Preparation

  • Ensure CSV files are properly formatted
  • Use quotes for fields with commas (the sketch below shows one way to write such files safely)
  • Handle empty rows appropriately
  • Validate data before processing
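
Writing input files with the standard library's csv module takes care of quoting, so fields containing commas survive intact (an illustrative sketch):

import csv

rows = [
    {"name": "John", "issue": "Billing problem, overdue invoice"},  # embedded comma
    {"name": "Jane", "issue": "Technical issue with login"},
]

with open("customers.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["name", "issue"])
    writer.writeheader()
    writer.writerows(rows)  # the comma-containing field is quoted automatically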

Performance

  • Set appropriate max_iter values
  • Consider batch size for large files
  • Monitor memory usage
  • Use efficient agents for repetitive tasks

Common Use Cases

  1. Customer Support: Process support tickets from CSV
  2. Data Analysis: Analyze multiple datasets sequentially
  3. Content Generation: Create content for multiple topics
  4. URL Processing: Analyze or scrape multiple websites
  5. Bulk Operations: Update multiple records or entities

Important Notes

  • Use Workflow class with loop() helper for loop tasks
  • The CSV file must exist before starting the workflow
  • Loop outputs are stored in result["variables"]["loop_outputs"]

Loop Tasks with File Input

Process batches of tasks from CSV or other structured files:
from praisonaiagents import Workflow, WorkflowContext, StepResult, Agent
from praisonaiagents import loop

# Create agent for processing questions
qa_agent = Agent(
    name="QA Bot",
    role="Answer questions",
    goal="Provide accurate answers to user questions"
)

# Process each question using the agent
def answer_question(ctx: WorkflowContext) -> StepResult:
    row = ctx.variables.get("item", {})
    question = row.get("question", "")
    answer = qa_agent.chat(question)
    return StepResult(output=f"Q: {question}\nA: {answer}")

# Create workflow with loop over CSV
workflow = Workflow(
    steps=[loop(answer_question, from_csv="questions.csv")]  # questions.csv must exist first
)

# Run the batch processing
result = workflow.start("Process all questions")
print(result["variables"]["loop_outputs"])

CSV File Format

The input CSV file should have headers that correspond to task parameters:
question,context,priority
"What is Python?","Programming language context","high"
"Explain machine learning","AI and ML context","medium"
"How does Docker work?","Container technology context","high"

Advanced File Processing

Processing with Multiple Columns

from praisonaiagents import Agent, Task

# Agent that uses multiple CSV columns
analyzer = Agent(
    name="Data Analyzer",
    role="Analyze data entries",
    goal="Process and analyze each data entry"
)

# Task that uses multiple columns from CSV
analysis_task = Task(
    name="analyze_entries",
    description="Analyze data: {title} with context: {context}",
    expected_output="Analysis report for each entry",
    agent=analyzer,
    task_type="loop",
    input_file="data_entries.csv",
    # Map CSV columns to task parameters
    column_mapping={
        "title": "title",
        "context": "context",
        "category": "metadata.category"
    }
)

Processing Different File Types

from praisonaiagents import Agent, Task

# Define the processor agent
processor = Agent(
    name="DataProcessor",
    role="Data processing specialist",
    goal="Process various data formats efficiently"
)

# JSON file processing
json_task = Task(
    name="process_json_data",
    description="Process JSON entries",
    expected_output="Processed results",
    agent=processor,
    task_type="loop",
    input_file="data.json",
    file_format="json"  # Specify file format
)

# Text file processing (one task per line)
text_task = Task(
    name="process_lines",
    description="Process text: {line}",
    expected_output="Processed line",
    agent=processor,
    task_type="loop",
    input_file="tasks.txt",
    file_format="text"
)

Batch Processing Patterns

Parallel Processing

# Configure workflow with loop for batch processing
from praisonaiagents import Workflow
from praisonaiagents import loop

# process_item is a step function like those defined above
workflow = Workflow(
    steps=[loop(process_item, from_csv="data.csv")]
)

result = workflow.start("Process all items")

Sequential Processing with Dependencies

from praisonaiagents import Agent, Task

# Define the required agents
extractor = Agent(
    name="DataExtractor",
    role="Data extraction specialist",
    goal="Extract data from various sources"
)

transformer = Agent(
    name="DataTransformer",
    role="Data transformation expert",
    goal="Transform data to required format"
)

# First loop task processes data
extract_task = Task(
    name="extract_data",
    description="Extract data from {source}",
    expected_output="Extracted data",
    agent=extractor,
    task_type="loop",
    input_file="sources.csv"
)

# Second loop task uses results from first
transform_task = Task(
    name="transform_data",
    description="Transform extracted data",
    expected_output="Transformed data",
    agent=transformer,
    task_type="loop",
    depends_on=["extract_data"]  # Uses output from extract_data
)

Error Handling and Recovery

from praisonaiagents import Agent, Task

# Define processor agent
processor = Agent(
    name="SafeProcessor",
    role="Error-tolerant processor",
    goal="Process items with error recovery"
)

# Configure error handling for batch processing
loop_task = Task(
    name="process_with_recovery",
    description="Process item safely",
    expected_output="Processed result",
    agent=processor,
    task_type="loop",
    input_file="items.csv",
    error_handling={
        "continue_on_error": True,  # Don't stop on errors
        "max_retries": 3,          # Retry failed items
        "log_errors": True         # Log all errors
    }
)

Progress Tracking

Monitor batch processing progress:
from praisonaiagents import Agents
from praisonaiagents.callbacks import Callback

class BatchProgressTracker(Callback):
    def __init__(self):
        self.processed = 0
        self.total = 0
        
    def on_task_start(self, task, **kwargs):
        if task.task_type == "loop" and self.total == 0:
            try:
                with open(task.input_file, 'r', encoding='utf-8') as f:
                    # Count data rows (subtract 1 for the header line)
                    self.total = sum(1 for _ in f) - 1
            except FileNotFoundError:
                print(f"Warning: Input file not found at {task.input_file}. Progress will not be shown.")
                self.total = 0
    
    def on_subtask_complete(self, subtask, result, **kwargs):
        self.processed += 1
        if self.total > 0:  # Guard against division by zero when the total is unknown
            print(f"Progress: {self.processed}/{self.total} ({self.processed/self.total*100:.1f}%)")

# Use progress tracker
agents = Agents(
    agents=[qa_agent],
    tasks=[loop_task],
    callbacks=[BatchProgressTracker()]
)

Output Aggregation

Collect and aggregate results from loop tasks:
# Define the summarizer agent
summarizer = Agent(
    name="Summarizer",
    role="Results aggregator",
    goal="Create comprehensive summaries from processed data"
)

# Task that aggregates all loop results
summary_task = Task(
    name="summarize_results",
    description="Create summary of all processed items",
    expected_output="Comprehensive summary report",
    agent=summarizer,
    depends_on=["process_questions"],  # Depends on loop task
    aggregate_results=True  # Receives all loop results
)

# Complete workflow with loop and summary
from praisonaiagents import Workflow, WorkflowContext, StepResult
from praisonaiagents import loop

def summarize_results(ctx: WorkflowContext) -> StepResult:
    outputs = ctx.variables.get("loop_outputs", [])
    summary = summarizer.chat(f"Summarize: {outputs}")
    return StepResult(output=summary)

# process_item is a step function like those defined above
workflow = Workflow(
    steps=[
        loop(process_item, from_csv="data.csv"),
        summarize_results
    ]
)

Best Practices

  1. File Validation: Always validate input files before processing
import os
import csv

def validate_input_file(filepath):
    if not os.path.exists(filepath):
        raise FileNotFoundError(f"Input file not found: {filepath}")
    
    with open(filepath, 'r') as f:
        reader = csv.reader(f)
        headers = next(reader, None)
        if not headers:
            raise ValueError("CSV file is empty or has no headers")
    
    return True
  2. Memory Management: For large files, use streaming
loop_task = Task(
    name="process_large_file",
    description="Process item",
    expected_output="Result",
    agent=processor,
    task_type="loop",
    input_file="large_data.csv",
    streaming=True,  # Process one item at a time
    chunk_size=100   # Read 100 rows at a time
)
  3. Result Storage: Save results progressively
loop_task = Task(
    name="process_and_save",
    description="Process and save",
    expected_output="Saved result",
    agent=processor,
    task_type="loop",
    input_file="data.csv",
    output_file="results.csv",  # Save results to file
    append_mode=True  # Append results as processed
)

Troubleshooting

Loop Issues

If loops aren’t working as expected:
  • Verify input file format
  • Check task configurations
  • Enable verbose mode for debugging (see the snippet below)
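
For example, a minimal way to turn on verbose output, assuming the Agent constructor accepts a verbose flag as in other PraisonAI examples:

from praisonaiagents import Agent

debug_agent = Agent(
    name="Debugger",
    role="Task processor",
    goal="Process items with detailed logging",
    verbose=True  # assumed flag: logs each step's prompts and responses
)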

Performance Issues

If processing is slow:
  • Check batch sizes
  • Verify resource allocation
  • Monitor memory usage

Next Steps

For optimal results, ensure your input files are properly formatted and your task configurations are appropriate for your use case.