Recipe Serve Advanced CLI
Advanced CLI options for the recipe server, including rate limiting, metrics, admin endpoints, workers, and OpenTelemetry tracing.
Quick Reference
Command Options
| Option | Description | Default |
|---|---|---|
| `--workers <num>` | Number of worker processes | 1 |
| `--rate_limit <num>` | Requests per minute per client | disabled |
| `--max_request_size <bytes>` | Maximum request body size | 10485760 (10MB) |
| `--enable_metrics` | Enable /metrics endpoint | false |
| `--enable_admin` | Enable /admin/* endpoints | false |
| `--trace_exporter <type>` | Tracing: none, otlp, jaeger, zipkin | none |
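As a rough illustration of how the options in the table combine, the following sketch assembles a flag list from a mapping of option names to values. The flag names are taken from the table; the `build_flags` helper is hypothetical and not part of the CLI, and the server command itself is deliberately left out.

```python
from typing import Dict, List, Union

# Flag names come from the options table; the helper itself is hypothetical.
VALUE_FLAGS = ("workers", "rate_limit", "max_request_size", "trace_exporter")
SWITCH_FLAGS = ("enable_metrics", "enable_admin")


def build_flags(options: Dict[str, Union[int, str, bool]]) -> List[str]:
    """Turn an options mapping into a list of CLI arguments."""
    args: List[str] = []
    for name, value in options.items():
        if name in VALUE_FLAGS:
            # Options that take a value: --name <value>
            args += [f"--{name}", str(value)]
        elif name in SWITCH_FLAGS:
            # Boolean switches are only emitted when enabled
            if value:
                args.append(f"--{name}")
        else:
            raise ValueError(f"unknown option: {name}")
    return args


flags = build_flags(
    {"workers": 4, "rate_limit": 100, "enable_metrics": True, "enable_admin": False}
)
print(" ".join(flags))
```

Switches that default to false are simply omitted when disabled, which mirrors the defaults listed in the table.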
Rate Limiting
Protect your server from abuse.
Test Rate Limiting
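To reason about what a rate-limit test should observe, here is a minimal simulation of a "requests per minute per client" limit. It is a sketch under assumptions, not the server's actual implementation: the sliding-window strategy, the `PerClientRateLimiter` class, and the client identifiers are all hypothetical.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class PerClientRateLimiter:
    """Sliding window: at most `limit` requests per client per window."""

    def __init__(self, limit: int, window_seconds: float = 60.0) -> None:
        self.limit = limit
        self.window = window_seconds
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_id: str, now: float) -> bool:
        hits = self._hits[client_id]
        # Drop timestamps that have fallen out of the window
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # a real server would typically answer HTTP 429 here
        hits.append(now)
        return True


limiter = PerClientRateLimiter(limit=5)
# Six requests from one client, one second apart: the sixth is rejected
results = [limiter.allow("client-a", now=float(i)) for i in range(6)]
# A different client has its own budget
other_client = limiter.allow("client-b", now=5.0)
# After the window slides past the earliest requests, client-a is allowed again
after_window = limiter.allow("client-a", now=61.0)
print(results, other_client, after_window)
```

A test against a running server would follow the same pattern: send one request more than the configured limit within a minute and expect the last one to be rejected.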
Request Size Limits
Prevent oversized payloads.
Test Size Limit
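The default limit of 10485760 bytes is exactly 10 * 1024 * 1024. The sketch below shows the boundary a size-limit test would probe. It assumes the limit is inclusive and that an oversized body is rejected with HTTP 413; neither detail is confirmed here, and `check_body_size` is a hypothetical helper.

```python
MAX_REQUEST_SIZE = 10485760  # default from the options table (10 * 1024 * 1024)


def check_body_size(body: bytes, max_size: int = MAX_REQUEST_SIZE) -> int:
    """Return an assumed HTTP status: 200 within the limit, 413 above it."""
    return 413 if len(body) > max_size else 200


# A body exactly at the limit passes; one extra byte is rejected
at_limit = check_body_size(b"x" * MAX_REQUEST_SIZE)
over_limit = check_body_size(b"x" * (MAX_REQUEST_SIZE + 1))
print(at_limit, over_limit)
```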
Metrics Endpoint
Expose Prometheus metrics.
Sample Output
Prometheus Integration
Admin Endpoints
Hot-reload recipes without restart.
Response
Workers
Scale with multiple processes.
Notes
- Workers > 1 automatically disables --reload
- Each worker has independent rate limiter state
- For distributed rate limiting, use an external store (Redis)
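Because each worker keeps its own rate limiter state, the limit a single client actually experiences can exceed the configured value. The arithmetic below is a sketch of the worst case, assuming the client's requests are spread across all workers; `effective_rate_limit` is a hypothetical helper.

```python
def effective_rate_limit(workers: int, rate_limit: int) -> int:
    """Upper bound on requests per minute one client can make across all workers."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    # Each worker counts independently, so the budgets add up
    return workers * rate_limit


# With 4 workers and a limit of 100, a client may get up to 400 requests per minute
worst_case = effective_rate_limit(workers=4, rate_limit=100)
print(worst_case)
```

This is why a shared external store is the suggested route when the limit must hold across processes.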
OpenTelemetry Tracing
Distributed tracing support.
Install Dependencies
OpenAPI Specification
Get the API specification.
Configuration File
All CLI options can be set in serve.yaml:
Production Examples
Basic Production
Full Production
Docker
Kubernetes
Environment Variables
| Variable | Description |
|---|---|
| `PRAISONAI_API_KEY` | API key for authentication |
| `PRAISONAI_SERVE_HOST` | Default host |
| `PRAISONAI_SERVE_PORT` | Default port |
| `OTEL_EXPORTER_OTLP_ENDPOINT` | OTLP collector endpoint |
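For scripts that launch or check the server, the variables in the table can be collected in one place. This is a minimal sketch: the `load_serve_env` helper and its key names are hypothetical, and unset variables are returned as `None` rather than guessing at the server's built-in defaults.

```python
import os
from typing import Dict, Mapping, Optional, Union


def load_serve_env(
    env: Mapping[str, str] = os.environ,
) -> Dict[str, Optional[Union[str, int]]]:
    """Collect the documented environment variables; unset ones become None."""
    port = env.get("PRAISONAI_SERVE_PORT")
    return {
        "api_key": env.get("PRAISONAI_API_KEY"),
        "host": env.get("PRAISONAI_SERVE_HOST"),
        # Convert the port to an integer only when it is set
        "port": int(port) if port is not None else None,
        "otlp_endpoint": env.get("OTEL_EXPORTER_OTLP_ENDPOINT"),
    }


# Example with an explicit mapping instead of the real process environment
settings = load_serve_env(
    {"PRAISONAI_SERVE_HOST": "0.0.0.0", "PRAISONAI_SERVE_PORT": "8765"}
)
print(settings)
```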
Troubleshooting
Rate Limit Not Working
Check if path is exempt:
Metrics Endpoint 404
Enable metrics:
Admin Endpoint 401
Provide authentication:
Workers with Reload
Cannot use both:
Next Steps
- See Python Usage for programmatic configuration
- Review Recipe Serve Basics
- Explore Endpoints CLI

