Remote Agent Deployment
Deploy your PraisonAI agents as HTTP services to enable distributed architectures, scalable deployments, and remote connectivity.
Overview
PraisonAI supports deploying agents as HTTP services that can be accessed remotely. This enables:
- Distributed agent architectures
- Scalable microservice deployments
- Cross-network agent communication
- Load balancing and failover
- API gateway integration
Basic Agent Deployment
Single Agent Server
Deploy an agent as an HTTP service:
from praisonaiagents import Agent

# Create an agent
agent = Agent(
    name="Assistant",
    instructions="You are a helpful assistant.",
    llm="gpt-4o-mini"
)

# Launch as HTTP server
agent.launch(
    path="/agent",
    port=8000,
    host="0.0.0.0"  # Accept connections from any IP
)

# Agent is now accessible at http://localhost:8000/agent
Multiple Agents on Different Endpoints
from praisonaiagents import Agent

# Create specialised agents
math_agent = Agent(
    name="MathExpert",
    instructions="You solve mathematical problems"
)

code_agent = Agent(
    name="CodeExpert",
    instructions="You help with programming"
)

# Deploy on different endpoints
math_agent.launch(path="/math", port=8000)
code_agent.launch(path="/code", port=8001)
Remote Agent Connectivity
Connecting to Remote Agents
Use the Session class to connect to remote agents:
from praisonaiagents import Session

# Connect to remote agent
session = Session(agent_url="192.168.1.10:8000/agent")

# Interact with remote agent
response = session.chat("Hello, how are you?")
print(response)

# Send structured messages
result = session.send_message({
    "content": "Analyse this data",
    "data": {"sales": [100, 200, 300]}
})
Error Handling
try:
    session = Session(agent_url="remote-server:8000/agent")
    response = session.chat("Process this request")
except ConnectionError as e:
    print(f"Failed to connect to remote agent: {e}")
    # Fallback to local agent or retry
except TimeoutError as e:
    print(f"Request timed out: {e}")
Advanced Deployment Patterns
Multi-Agent Server
Deploy multiple agents together:
from praisonaiagents import Agent, Agents

# Create agents
agents = Agents(
    agents=[
        Agent(name="Research", instructions="Research topics"),
        Agent(name="Writer", instructions="Write content"),
        Agent(name="Editor", instructions="Edit and improve text")
    ]
)

# Launch all agents
agents.launch(port=8080)

# Agents accessible at:
# http://localhost:8080/research
# http://localhost:8080/writer
# http://localhost:8080/editor
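Clients can then address each agent by its path using the Session class from the connectivity section above; a minimal sketch:

from praisonaiagents import Session

# Talk to the Research agent specifically
research = Session(agent_url="localhost:8080/research")
print(research.chat("Find recent papers on agent orchestration"))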
MCP Protocol Support
Deploy agents with Model Context Protocol:
# Single agent with MCP
agent = Agent(
    name="DataAnalyst",
    instructions="Analyse data and create visualisations"
)

agent.launch(
    port=8080,
    protocol="mcp"  # Enable MCP protocol
)

# Multi-agent MCP server (agent1, agent2, agent3 defined elsewhere)
agents = Agents(agents=[agent1, agent2, agent3])
agents.launch(port=8080, protocol="mcp")
FastAPI Integration
Integrate agents with existing FastAPI applications:
from fastapi import FastAPI, HTTPException
from praisonaiagents import Agent
from pydantic import BaseModel

app = FastAPI()

# Create agent
agent = Agent(
    name="APIAssistant",
    instructions="Process API requests"
)

class ChatRequest(BaseModel):
    message: str
    context: dict = {}

class ChatResponse(BaseModel):
    response: str
    metadata: dict = {}

@app.post("/chat", response_model=ChatResponse)
async def chat_endpoint(request: ChatRequest):
    try:
        # Process with agent
        response = agent.chat(
            request.message,
            context=request.context
        )
        return ChatResponse(
            response=response,
            metadata={"agent": agent.name}
        )
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

# Additional endpoints
@app.get("/health")
async def health_check():
    return {"status": "healthy", "agent": agent.name}

# Run with: uvicorn main:app --host 0.0.0.0 --port 8000
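Once the server is running, any HTTP client can exercise the endpoint. A minimal sketch using the requests library; the JSON payload mirrors the ChatRequest model defined above:

import requests

resp = requests.post(
    "http://localhost:8000/chat",
    json={"message": "Summarise this quarter's results", "context": {}},
    timeout=30
)
resp.raise_for_status()
print(resp.json()["response"])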
Production Deployment
Docker Deployment
# Dockerfile
FROM python:3.11-slim

WORKDIR /app

# Install dependencies
COPY requirements.txt .
RUN pip install -r requirements.txt

# Copy agent code
COPY agent.py .

# Expose port
EXPOSE 8000

# Run agent server
CMD ["python", "agent.py"]

# agent.py
from praisonaiagents import Agent
import os

agent = Agent(
    name="ProductionAgent",
    instructions=os.getenv("AGENT_INSTRUCTIONS", "Default instructions"),
    llm=os.getenv("LLM_MODEL", "gpt-4o-mini")
)

agent.launch(
    port=int(os.getenv("PORT", 8000)),
    host="0.0.0.0"
)
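Build and run the image as usual; the -e flags feed the os.getenv calls in agent.py (passing OPENAI_API_KEY through is an assumption for OpenAI-backed models):

docker build -t praison-agent .
docker run -p 8000:8000 \
  -e AGENT_INSTRUCTIONS="Answer customer questions" \
  -e OPENAI_API_KEY=$OPENAI_API_KEY \
  praison-agent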
Load Balancing
Deploy multiple instances behind a load balancer:
# docker-compose.yml
version: '3.8'

services:
  agent1:
    build: .
    environment:
      - AGENT_INSTRUCTIONS=Process customer queries
      - PORT=8001
  agent2:
    build: .
    environment:
      - AGENT_INSTRUCTIONS=Process customer queries
      - PORT=8002
  nginx:
    image: nginx
    ports:
      - "80:80"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf
    depends_on:
      - agent1
      - agent2
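The compose file mounts an nginx.conf that is not shown above. A minimal sketch that round-robins requests across the two agent containers; adjust the upstream ports to match the PORT values in the compose file:

# nginx.conf (minimal sketch)
events {}

http {
    upstream agents {
        server agent1:8001;
        server agent2:8002;
    }

    server {
        listen 80;

        location / {
            proxy_pass http://agents;
        }
    }
}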
Security Considerations
Authentication
Add authentication to your agent endpoints:
from fastapi import Depends, HTTPException, Security
from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials

security = HTTPBearer()

async def verify_token(credentials: HTTPAuthorizationCredentials = Security(security)):
    token = credentials.credentials
    # Verify token logic
    if not is_valid_token(token):
        raise HTTPException(status_code=403, detail="Invalid token")
    return token

@app.post("/chat", dependencies=[Depends(verify_token)])
async def secure_chat(request: ChatRequest):
    # Process request with agent
    pass
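The is_valid_token helper is left undefined above. A minimal sketch that compares the bearer token against a shared secret from the environment; production systems would more likely verify a signed JWT:

import hmac
import os

def is_valid_token(token: str) -> bool:
    # Constant-time comparison to avoid timing attacks
    expected = os.environ.get("API_TOKEN", "")
    return expected != "" and hmac.compare_digest(token, expected)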
HTTPS/TLS
Deploy with SSL certificates:
agent.launch(
    port=8443,
    host="0.0.0.0",
    ssl_keyfile="key.pem",
    ssl_certfile="cert.pem"
)
Rate Limiting
Implement rate limiting for production:
from fastapi_limiter import FastAPILimiter
from fastapi_limiter.depends import RateLimiter

@app.post("/chat", dependencies=[Depends(RateLimiter(times=10, seconds=60))])
async def rate_limited_chat(request: ChatRequest):
    # Process with rate limiting
    pass
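fastapi-limiter requires a Redis connection to be initialised at startup before the RateLimiter dependency will work. A minimal sketch, assuming Redis is reachable on localhost:

import redis.asyncio as redis
from fastapi_limiter import FastAPILimiter

@app.on_event("startup")
async def init_rate_limiter():
    connection = redis.from_url("redis://localhost:6379", encoding="utf-8", decode_responses=True)
    await FastAPILimiter.init(connection)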
Monitoring and Observability
Health Checks
@app.get("/health")
async def health():
return {
"status": "healthy",
"agent": agent.name,
"uptime": get_uptime(),
"memory_usage": get_memory_usage()
}
@app.get("/ready")
async def readiness():
# Check if agent is ready to serve
try:
test_response = agent.chat("test")
return {"ready": True}
except Exception:
return {"ready": False}, 503
Metrics Collection
from fastapi import Response
from prometheus_client import Counter, Histogram, generate_latest

# Metrics
request_count = Counter('agent_requests_total', 'Total requests')
request_duration = Histogram('agent_request_duration_seconds', 'Request duration')

@app.post("/chat")
async def chat_with_metrics(request: ChatRequest):
    request_count.inc()
    with request_duration.time():
        response = agent.chat(request.message)
    return {"response": response}

@app.get("/metrics")
async def metrics():
    return Response(generate_latest(), media_type="text/plain")
Common Patterns
API Gateway Pattern
# Deploy specialised agents
agents = {
    "billing": Agent(name="Billing", instructions="Handle billing"),
    "support": Agent(name="Support", instructions="Handle support"),
    "sales": Agent(name="Sales", instructions="Handle sales")
}

# Give each agent its own port (8000, 8001, 8002)
for i, (name, agent) in enumerate(agents.items()):
    agent.launch(path=f"/{name}", port=8000 + i)

# Use an API gateway to route requests to the right agent
Circuit Breaker Pattern
# Uses the third-party 'circuitbreaker' package (pip install circuitbreaker)
from circuitbreaker import circuit

@circuit(failure_threshold=5, recovery_timeout=30)
def call_remote_agent(message):
    session = Session(agent_url="remote:8000/agent")
    return session.chat(message)

# After 5 consecutive failures the circuit opens and calls fail fast
# until the 30-second recovery timeout elapses
Failover Pattern
primary_url = "primary-server:8000/agent"
backup_url = "backup-server:8000/agent"

try:
    session = Session(agent_url=primary_url)
    response = session.chat(message)
except Exception:
    # Failover to backup
    session = Session(agent_url=backup_url)
    response = session.chat(message)
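The same idea generalises to any number of candidates. A small helper that tries each URL in order and raises only if all of them fail, assuming the Session API shown earlier:

def chat_with_failover(urls, message):
    last_error = None
    for url in urls:
        try:
            return Session(agent_url=url).chat(message)
        except Exception as e:
            last_error = e
    raise last_error

response = chat_with_failover([primary_url, backup_url], "Hello")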
Best Practices
- Use Environment Variables: Configure agents via environment variables for flexibility
- Implement Health Checks: Always include health and readiness endpoints
- Add Monitoring: Use metrics and logging for observability
- Secure Endpoints: Implement authentication and rate limiting
- Handle Errors Gracefully: Provide meaningful error responses
- Document APIs: Use OpenAPI/Swagger for API documentation
- Version Your APIs: Include the version in paths (e.g., /v1/agent)