Knowledge Backends
PraisonAI supports multiple knowledge storage backends through a protocol-driven architecture. This allows you to choose the best backend for your use case while maintaining a consistent API.
Available Backends
| Backend | Description | Best For |
|---|---|---|
| mem0 (default) | Long-term memory with semantic search | Multi-user apps, persistent memory |
| chroma | Local vector database | Development, single-user apps |
| internal | Built-in lightweight storage | Simple use cases |
Agent-First Usage
The recommended way to use knowledge is through the Agent API:
from praisonaiagents import Agent
# Create agent with knowledge (uses mem0 by default)
agent = Agent(
    name="ResearchAssistant",
    instructions="You are a research assistant.",
    knowledge=["./documents/"],  # Add documents
    memory={"user_id": "user123"},  # Required for mem0 backend
)
# Chat automatically retrieves relevant context
response = agent.chat("What are the main findings?")
Scope Identifiers
Knowledge backends support three scope identifiers for multi-tenant isolation:
| Identifier | Purpose | Example |
|---|---|---|
| user_id | Isolate per user | "user_alice" |
| agent_id | Isolate per agent type | "research_agent_v1" |
| run_id | Isolate per session | "session_abc123" |
The mem0 backend requires at least one scope identifier. If none is provided, operations will fail with a ScopeRequiredError.
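The scope requirement can be pictured as a guard that runs before any backend call. The sketch below is illustrative only: `require_scope` and the local `ScopeRequiredError` class are hypothetical stand-ins defined here so the snippet runs without PraisonAI installed, and the library's own check may work differently.

```python
class ScopeRequiredError(ValueError):
    """Local stand-in for the library's error, so this sketch is self-contained."""


def require_scope(user_id=None, agent_id=None, run_id=None):
    """Return the scope identifiers that were provided, or raise if none were."""
    candidates = (("user_id", user_id), ("agent_id", agent_id), ("run_id", run_id))
    scope = {key: value for key, value in candidates if value is not None}
    if not scope:
        raise ScopeRequiredError("Provide at least one of user_id, agent_id, or run_id")
    return scope
```

Calling `require_scope(user_id="alice")` returns only the identifiers that were set, while calling it with no arguments raises the stand-in error.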
Example with Scope
from praisonaiagents import Agent
# User-scoped knowledge
agent = Agent(
    name="PersonalAssistant",
    instructions="You are a personal assistant.",
    knowledge=["./user_docs/"],
    memory={"user_id": "alice"},  # Knowledge scoped to Alice
)
# Agent-scoped knowledge (shared across users)
shared_agent = Agent(
    name="CompanyBot",
    instructions="You answer company policy questions.",
    knowledge=["./policies/"],
    agent_id="company_bot_v1",  # Shared knowledge
)
Combining Multiple Scopes
Combine user_id, agent_id, and run_id to isolate knowledge down to a specific session for a specific agent and user.
from praisonaiagents import Agent
agent = Agent(
    name="SupportBot",
    instructions="Answer using the customer's session history.",
    knowledge=["./support_docs/"],
    user_id="customer_42",
    agent_id="support_bot_v1",
    run_id="session_2026_05_30",
)
agent.start("What did we discuss about my refund?")
You can also use the direct API for more control. Here, knowledge is a Knowledge instance (see Direct Knowledge API):
results = knowledge.search(
    "refund discussion",
    user_id="customer_42",
    agent_id="support_bot_v1",
    run_id="session_2026_05_30",
)
When you pass more than one scope identifier, PraisonAI automatically combines them using ChromaDB’s $and operator. A single identifier is passed through unchanged. You don’t need to write the $and yourself.
All provided identifiers are required to match (logical AND). Omit an identifier to broaden the scope on that dimension.
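To make the combination rule concrete, here is a minimal sketch of how a metadata filter could be assembled from the scope identifiers. The helper name `build_scope_filter` is hypothetical and not part of PraisonAI's public API; only the shape of the resulting dictionary follows ChromaDB's `where` filter syntax.

```python
def build_scope_filter(user_id=None, agent_id=None, run_id=None):
    """Build a ChromaDB-style `where` filter from the provided scope identifiers."""
    candidates = (("user_id", user_id), ("agent_id", agent_id), ("run_id", run_id))
    # One single-key clause per identifier that was actually provided.
    clauses = [{key: value} for key, value in candidates if value is not None]
    if not clauses:
        return None  # No scope given: nothing to filter on.
    if len(clauses) == 1:
        return clauses[0]  # A single identifier is passed through unchanged.
    return {"$and": clauses}  # Several identifiers must all match.


single = build_scope_filter(user_id="customer_42")
combined = build_scope_filter(user_id="customer_42", agent_id="support_bot_v1")
```

With one identifier the result is a plain equality clause; with two or more, every clause is wrapped in a single `$and` list, which is why omitting an identifier broadens the match on that dimension.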
Multi-tenant SaaS application flow:
- Per-customer isolation → set user_id
- Per-agent isolation (e.g. SupportBot vs. SalesBot share infra but not data) → also set agent_id
- Per-conversation isolation (e.g. ephemeral session memory) → also set run_id
Direct Knowledge API
For advanced use cases, you can use the Knowledge class directly:
from praisonaiagents.knowledge import Knowledge
# Initialize with config
knowledge = Knowledge(config={
    "vector_store": {
        "provider": "chroma",
        "config": {
            "collection_name": "my_docs",
            "path": "./.praison/knowledge/my_docs",
        }
    }
})
# Add documents
knowledge.add("./documents/", memory={"user_id": "user123"})
# Search
results = knowledge.search("query", user_id="user123", limit=10)
Normalization Guarantees
PraisonAI normalizes all backend results to ensure consistent behavior:
- metadata is always a dict (never None)
- text field is always present (mapped from memory for mem0)
- score is always a float (defaults to 0.0)
This means you can safely access metadata without null checks:
# Safe - metadata is guaranteed to be a dict
for result in results['results']:
    source = result.get('metadata', {}).get('source', 'unknown')
    # This works even if the backend returns metadata=None
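The three guarantees above can be sketched as a small normalization step applied to each raw backend result. This is an illustration under assumptions, not PraisonAI's actual code: the function name `normalize_result` is hypothetical, and raw results are assumed to be plain dictionaries.

```python
def normalize_result(raw):
    """Return a copy of `raw` with metadata, text, and score in a predictable shape."""
    # metadata: replace None (or any non-dict value) with an empty dict.
    metadata = raw.get("metadata")
    if not isinstance(metadata, dict):
        metadata = {}

    # text: fall back to the mem0-style "memory" field when "text" is absent.
    text = raw.get("text")
    if text is None:
        text = raw.get("memory", "")

    # score: coerce to float, defaulting to 0.0 when missing or not numeric.
    try:
        score = float(raw.get("score"))
    except (TypeError, ValueError):
        score = 0.0

    return {**raw, "metadata": metadata, "text": text, "score": score}


normalized = normalize_result({"memory": "Refunds take 5 days", "metadata": None})
```

After this step, a result that arrived with `metadata=None` and no `text` or `score` can be read with plain key access.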
Protocol-Driven Architecture
All backends implement the KnowledgeStoreProtocol:
from praisonaiagents.knowledge import KnowledgeStoreProtocol
class MyCustomBackend:
    """Custom backend implementing the protocol."""

    def search(self, query, *, user_id=None, agent_id=None, run_id=None, **kwargs):
        # Your implementation
        pass

    def add(self, content, *, user_id=None, agent_id=None, run_id=None, **kwargs):
        # Your implementation
        pass

    # ... other methods
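As a rough illustration of what a filled-in backend might look like, the sketch below stores items in a Python list and matches by case-insensitive substring instead of semantic search. The class name `InMemoryBackend`, the `limit` parameter, and the `{"results": [...]}` return shape are assumptions made for this example; the real protocol may require additional methods and a different return type.

```python
class InMemoryBackend:
    """Toy backend: list storage, substring matching, scope-aware filtering."""

    def __init__(self):
        self._items = []

    def add(self, content, *, user_id=None, agent_id=None, run_id=None, **kwargs):
        # Record the scope alongside the content so search can filter on it.
        scope = {"user_id": user_id, "agent_id": agent_id, "run_id": run_id}
        self._items.append({
            "text": content,
            "scope": scope,
            "metadata": kwargs.get("metadata") or {},
        })
        return len(self._items) - 1  # Index of the stored item.

    def search(self, query, *, user_id=None, agent_id=None, run_id=None, limit=10, **kwargs):
        requested = {"user_id": user_id, "agent_id": agent_id, "run_id": run_id}
        wanted = {key: value for key, value in requested.items() if value is not None}
        hits = []
        for item in self._items:
            # Every provided identifier must match (logical AND).
            if any(item["scope"].get(key) != value for key, value in wanted.items()):
                continue
            if query.lower() in item["text"].lower():
                hits.append({"text": item["text"], "metadata": item["metadata"], "score": 1.0})
        return {"results": hits[:limit]}


backend = InMemoryBackend()
backend.add("Refund policy: 30 days", user_id="alice")
backend.add("Refund issued yesterday", user_id="bob")
alice_hits = backend.search("refund", user_id="alice")
```

Because the scope identifiers are keyword-only in both methods, call sites stay explicit about which dimension they are isolating on.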
Configuration Options
mem0 Backend (Default)
config = {
    "vector_store": {
        "provider": "qdrant",  # mem0 uses qdrant by default
        "config": {
            "collection_name": "my_collection",
        }
    }
}
Chroma Backend
config = {
    "vector_store": {
        "provider": "chroma",
        "config": {
            "collection_name": "my_collection",
            "path": "./.praison/knowledge/my_collection",
        }
    }
}
Error Handling
from praisonaiagents.knowledge import (
    ScopeRequiredError,
    BackendNotAvailableError,
)
try:
    results = knowledge.search("query")  # Missing scope!
except ScopeRequiredError as e:
    print(f"Please provide user_id, agent_id, or run_id: {e}")
except BackendNotAvailableError as e:
    print(f"Backend not available: {e}")
Collection Naming Rules
Enhanced Security (PR #1597): Knowledge stores now validate collection names to prevent SQL injection attacks.
Knowledge stores that interpolate collection names into DDL/DML now require collection names to match ^[A-Za-z0-9_]+$. Affected backends:
- Cassandra
- pgvector
- SingleStore vector
Invalid names raise: ValueError("collection_name must be non-empty and contain only alphanumerics and underscores")
Valid examples:
my_collection
UserData123
agent_v2_docs
Invalid examples:
my-collection (contains hyphen)
user.docs (contains dot)
data collection (contains space)
../../etc (path traversal attempt)
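If you want to check names before they reach a backend, the rule above is easy to reproduce locally. The following is a minimal sketch using the pattern and error message quoted in this section; the function name `validate_collection_name` is hypothetical and is not claimed to be the library's internal helper.

```python
import re

# Same pattern as the documented rule: letters, digits, and underscores only.
_COLLECTION_NAME_RE = re.compile(r"^[A-Za-z0-9_]+$")


def validate_collection_name(name):
    """Return `name` unchanged if it is valid, otherwise raise ValueError."""
    # fullmatch also rejects a trailing newline, which `$` alone would tolerate.
    if not isinstance(name, str) or not _COLLECTION_NAME_RE.fullmatch(name):
        raise ValueError(
            "collection_name must be non-empty and contain only alphanumerics and underscores"
        )
    return name


def is_valid_collection_name(name):
    """Boolean convenience wrapper around validate_collection_name."""
    try:
        validate_collection_name(name)
    except ValueError:
        return False
    return True
```

Running the valid and invalid examples listed above through `is_valid_collection_name` is a quick way to confirm a naming scheme before deploying against Cassandra, pgvector, or SingleStore.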
Best Practices
- Always provide at least one scope identifier when using the mem0 backend
- Use user_id for user-specific data (multi-tenant apps)
- Use agent_id for shared agent knowledge (company policies, FAQs)
- Use run_id for ephemeral session data (conversation context)
- Prefer the Agent API over the direct Knowledge API for most use cases
- Use collection names containing only letters, digits, and underscores to ensure compatibility across all backends