Bot Rate Limiting - PraisonAI

Bot Rate Limiting prevents HTTP 429 (Too Many Requests) errors from messaging platforms by throttling outbound messages. It enforces both a platform-specific global rate and a minimum delay between sends to the same channel.

Quick Start

Simple Usage

Use the default rate limiter for general messaging bots.

import asyncio

from praisonai.bots._rate_limit import RateLimiter

async def main():
    # Default: 1 message/sec, burst 5, 1s per-channel delay
    limiter = RateLimiter()

    # Bot adapters call this before sending
    await limiter.acquire(channel_id="telegram-chat-12345")

asyncio.run(main())

Platform-Specific Configuration

Use platform presets for optimal compliance with API limits.

from praisonai.bots._rate_limit import RateLimiter

# Telegram: 25 msg/sec, 30 burst, 0.05s per-channel
telegram_limiter = RateLimiter.for_platform("telegram")

# Discord: 1 msg/sec, 5 burst, 1s per-channel  
discord_limiter = RateLimiter.for_platform("discord")

# Slack: 1 msg/sec, 1 burst, 1s per-channel
slack_limiter = RateLimiter.for_platform("slack")

# WhatsApp: 50 msg/sec, 80 burst, 0.1s per-channel
whatsapp_limiter = RateLimiter.for_platform("whatsapp")

Custom Configuration

Fine-tune rate limits for specific platform policies or custom requirements.

import asyncio

from praisonai.bots._rate_limit import RateLimiter, RateLimitConfig

async def main():
    limiter = RateLimiter(RateLimitConfig(
        messages_per_second=2.0,  # Global rate
        burst_size=10,            # Burst capacity
        per_channel_delay=1.5,    # Min delay per channel
    ))

    await limiter.acquire(channel_id="custom-channel-456")

asyncio.run(main())

How It Works

The rate limiter uses a two-phase approach:

Phase	Description	Benefits
Reserve	Under global lock: check tokens, reserve capacity, update channel tracking	Thread-safe bookkeeping
Sleep	Outside lock: actual delay based on computed wait time	Multiple channels can sleep concurrently
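The two phases can be sketched as follows. This is a minimal illustration under stated assumptions, not PraisonAI's actual implementation: `TwoPhaseLimiter`, `_reserve`, and all field names are hypothetical, and the bucket is allowed to go negative to represent capacity booked in the future.

```python
import asyncio
import time


class TwoPhaseLimiter:
    """Hypothetical sketch of a reserve-then-sleep rate limiter."""

    def __init__(self, rate: float = 1.0, burst: int = 5, per_channel_delay: float = 1.0):
        self.rate = rate
        self.burst = burst
        self.per_channel_delay = per_channel_delay
        self._tokens = float(burst)
        self._last_refill = time.monotonic()
        self._next_send = {}  # channel_id -> earliest allowed send time
        self._lock = asyncio.Lock()

    def _reserve(self, channel_id: str, now: float) -> float:
        """Phase 1 (call under the lock): book capacity, return seconds to wait."""
        # Refill the global bucket for the time elapsed since the last call.
        self._tokens = min(self.burst, self._tokens + (now - self._last_refill) * self.rate)
        self._last_refill = now
        # Take one token; a negative balance means capacity booked in the future.
        self._tokens -= 1.0
        global_wait = max(0.0, -self._tokens / self.rate)
        channel_wait = max(0.0, self._next_send.get(channel_id, now) - now)
        wait = max(global_wait, channel_wait)
        # The channel may send again per_channel_delay after its booked send time.
        self._next_send[channel_id] = now + wait + self.per_channel_delay
        return wait

    async def acquire(self, channel_id: str) -> float:
        async with self._lock:
            wait = self._reserve(channel_id, time.monotonic())
        # Phase 2: sleep outside the lock so other callers can reserve meanwhile.
        if wait > 0:
            await asyncio.sleep(wait)
        return wait
```

Because the wait is computed and booked while the lock is held, two callers can never claim the same slot, yet neither blocks the other while sleeping.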

Configuration Options

Option	Type	Default	Description
`messages_per_second`	`float`	`1.0`	Token refill rate for the global token bucket.
`burst_size`	`int`	`5`	Max tokens that can accumulate (burst capacity).
`per_channel_delay`	`float`	`1.0`	Minimum seconds between two sends to the same channel.
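To see how the three options interact, the following back-of-the-envelope sketch estimates how long a batch of messages takes. It assumes an idealized token bucket that starts full and ignores network latency; the function names are hypothetical and not part of the library.

```python
def bucket_time(n: int, messages_per_second: float, burst_size: int) -> float:
    """Seconds the global bucket needs to release n messages, starting full."""
    return max(0, n - burst_size) / messages_per_second


def channel_time(n: int, per_channel_delay: float) -> float:
    """Seconds needed to send n messages to one channel with the minimum gap."""
    return max(0, n - 1) * per_channel_delay


def single_channel_estimate(n, messages_per_second=1.0, burst_size=5, per_channel_delay=1.0):
    """Lower bound for n messages to a single channel: the slower constraint wins."""
    return max(bucket_time(n, messages_per_second, burst_size),
               channel_time(n, per_channel_delay))


print(bucket_time(10, 1.0, 5))       # 5.0
print(single_channel_estimate(10))   # 9.0
```

With the defaults, ten messages to one channel are bounded by the per-channel delay (9 seconds), while ten messages spread over ten different channels are bounded only by the global bucket (5 seconds).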

Platform Presets

Platform	messages_per_second	burst_size	per_channel_delay	Notes
Telegram	25.0	30	0.05	~30 msg/sec to different users
Discord	1.0	5	1.0	5 messages per 5 seconds per channel
Slack	1.0	1	1.0	1 message per second per channel
WhatsApp	50.0	80	0.1	~80 msg/sec Cloud API limit

Memory Management

Per-channel state is tracked in an LRU cache capped at 4096 channels. If a bot serves more channels than that, the least-recently-used channels fall out of the cache and their `per_channel_delay` window resets. This bounds memory for long-running bots.

# Memory usage stays bounded even with many channels
async def touch_many_channels(limiter):
    for channel_id in range(10000):  # 10k distinct channels
        await limiter.acquire(channel_id=f"channel-{channel_id}")
        # Internal cache evicts least-recently-used entries beyond the 4096 limit
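The eviction behavior can be illustrated with a small bounded store built on `collections.OrderedDict`. This is a sketch under assumptions: `ChannelTimes` and its methods are hypothetical names, and the library's internal cache may be structured differently.

```python
from collections import OrderedDict


class ChannelTimes:
    """Hypothetical bounded store of last-send timestamps with LRU eviction."""

    def __init__(self, max_channels: int = 4096):
        self.max_channels = max_channels
        self._last_send = OrderedDict()  # channel_id -> last send timestamp

    def get(self, channel_id: str):
        """Return the last send time, or None if the channel is unknown or evicted."""
        if channel_id not in self._last_send:
            return None
        self._last_send.move_to_end(channel_id)  # mark as most recently used
        return self._last_send[channel_id]

    def record(self, channel_id: str, sent_at: float) -> None:
        """Store a send time and evict the oldest entries beyond the cap."""
        self._last_send[channel_id] = sent_at
        self._last_send.move_to_end(channel_id)
        while len(self._last_send) > self.max_channels:
            self._last_send.popitem(last=False)  # drop the least recently used

    def __len__(self) -> int:
        return len(self._last_send)
```

An evicted channel looks like a channel that has never been seen, which is why its delay window starts fresh on the next send.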

Concurrency Design

The global lock is held only long enough to reserve a token and update bookkeeping; the actual sleep happens outside the lock. Multiple channels can therefore be rate-limited concurrently without serializing on one mutex.

Before PR #1870 (serialized):

# Old behavior: sleep INSIDE lock - channels wait in line
async with self._lock:
    # check tokens, check channel timing
    await asyncio.sleep(delay)  # BLOCKS other channels

After PR #1870 (concurrent):

# New behavior: sleep OUTSIDE lock - channels sleep in parallel
async with self._lock:
    # check tokens, check channel timing, compute delay
    pass
# lock is released when the block exits, before sleeping
await asyncio.sleep(delay)  # Multiple channels sleep concurrently
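The difference is easy to measure with a small standalone demo. The sketch below is not taken from the library: it uses a fixed, hypothetical `DELAY` for every channel and compares the total time for five tasks when the sleep is inside versus outside the lock.

```python
import asyncio
import time

DELAY = 0.05  # seconds each channel must wait in this demo


async def sleep_inside_lock(lock: asyncio.Lock) -> None:
    async with lock:
        await asyncio.sleep(DELAY)  # other tasks queue behind the lock


async def sleep_outside_lock(lock: asyncio.Lock) -> None:
    async with lock:
        delay = DELAY  # bookkeeping only, no waiting while the lock is held
    await asyncio.sleep(delay)  # tasks sleep at the same time


async def timed(worker, n_channels: int = 5) -> float:
    """Run n_channels workers concurrently and return the elapsed seconds."""
    lock = asyncio.Lock()
    start = time.perf_counter()
    await asyncio.gather(*(worker(lock) for _ in range(n_channels)))
    return time.perf_counter() - start


async def main():
    serialized = await timed(sleep_inside_lock)
    concurrent = await timed(sleep_outside_lock)
    return serialized, concurrent


if __name__ == "__main__":
    serialized, concurrent = asyncio.run(main())
    print(f"inside lock:  {serialized:.2f}s")  # roughly n_channels * DELAY
    print(f"outside lock: {concurrent:.2f}s")  # roughly DELAY
```

With the sleep inside the lock, total time grows linearly with the number of waiting channels; with the sleep outside, it stays close to the longest single delay.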

Best Practices

Use Platform Presets

Start with `RateLimiter.for_platform()` instead of custom configs. Platform presets are tuned for each API’s documented limits and real-world behavior.

# Good: use tested platform preset
limiter = RateLimiter.for_platform("discord")

# Risky: custom config might hit undocumented limits
limiter = RateLimiter(RateLimitConfig(messages_per_second=10.0))

Share Limiters Across Bot Instances

Create one rate limiter per platform and share it across all bot instances to respect global rate limits.

# Good: shared limiter
telegram_limiter = RateLimiter.for_platform("telegram")

bot1 = TelegramBot(rate_limiter=telegram_limiter)
bot2 = TelegramBot(rate_limiter=telegram_limiter)

# Bad: separate limiters bypass global limits
bot1 = TelegramBot(rate_limiter=RateLimiter.for_platform("telegram"))
bot2 = TelegramBot(rate_limiter=RateLimiter.for_platform("telegram"))
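Why separate limiters bypass the global limit can be shown with a deterministic toy model. The `TokenBucket` class below is a hypothetical stand-in driven by an explicit clock, not the library's `RateLimiter`; it only counts how many messages pass immediately when two bots each attempt a large burst.

```python
class TokenBucket:
    """Hypothetical non-blocking token bucket driven by an explicit clock."""

    def __init__(self, rate: float, burst: int):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)
        self.last = 0.0

    def try_send(self, now: float) -> bool:
        """Consume one token if available at time `now`; report whether it was."""
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def sent_at_time_zero(buckets, attempts_per_bot: int = 100) -> int:
    """Count messages that pass immediately when each bot attempts a burst."""
    return sum(bucket.try_send(0.0) for bucket in buckets for _ in range(attempts_per_bot))


# Telegram preset values from the Platform Presets table: 25 msg/sec, burst 30.
shared = TokenBucket(rate=25.0, burst=30)
shared_total = sent_at_time_zero([shared, shared])  # both bots use one bucket
separate_total = sent_at_time_zero([TokenBucket(25.0, 30), TokenBucket(25.0, 30)])
print(shared_total, separate_total)  # 30 60
```

Two independent buckets let through twice the intended burst, which is exactly the traffic pattern that tends to trigger a platform's 429 responses.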

Monitor Rate Limit Logs

The rate limiter logs debug messages when applying delays. Monitor these to tune your configuration.

import logging
logging.getLogger("praisonai.bots._rate_limit").setLevel(logging.DEBUG)

# Log output:
# DEBUG:praisonai.bots._rate_limit:Rate limit: waiting 0.750s for channel telegram-chat-123

Handle Platform-Specific Burst Patterns

Some platforms allow bursts followed by longer delays. The `burst_size` parameter accommodates this pattern.

# WhatsApp allows rapid bursts then enforces stricter limits
whatsapp_limiter = RateLimiter(RateLimitConfig(
    messages_per_second=10.0,  # Sustained rate
    burst_size=50,             # Initial burst capacity
    per_channel_delay=0.1      # Quick per-channel recovery
))

Related

Rate Limiter (LLM)

Rate limiting for LLM API calls (different from bot message rate limiting).

Messaging Bots

Build bots for Telegram, Discord, Slack, and WhatsApp platforms.
