Retrieval Strategies Module
The retrieval module provides strategies for finding relevant documents in the knowledge base, along with utility functions for fusing and merging results.
Quick Start
from praisonaiagents.knowledge.retrieval import (
    RetrievalStrategy,
    RetrievalResult,
    RetrieverProtocol,
    get_retriever_registry,
    reciprocal_rank_fusion,
    merge_adjacent_chunks,
)
# Use built-in RRF fusion
results_list = [
    [{"id": "1", "score": 0.9}, {"id": "2", "score": 0.8}],
    [{"id": "2", "score": 0.95}, {"id": "3", "score": 0.7}],
]
fused = reciprocal_rank_fusion(results_list, k=60)
Retrieval Strategies
RetrievalStrategy Enum
from praisonaiagents.knowledge.retrieval import RetrievalStrategy
class RetrievalStrategy(Enum):
    BASIC = "basic"            # Simple vector similarity
    FUSION = "fusion"          # Multi-query with RRF
    RECURSIVE = "recursive"    # Depth-limited recursive
    AUTO_MERGE = "auto_merge"  # Parent-child merging
Strategy Descriptions
| Strategy | Description | Use Case |
|---|---|---|
| basic | Simple vector similarity search | General queries |
| fusion | Multiple queries + Reciprocal Rank Fusion | Complex queries |
| recursive | Follows references between chunks | Hierarchical docs |
| auto_merge | Merges child chunks into parents | Long documents |
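Since strategies are configured by their string values, it can help to see how such a string maps back to an enum member. The following is a minimal sketch using a local mirror of the enum shown above; `parse_strategy` is a hypothetical helper for illustration, not part of the library.

```python
from enum import Enum


class RetrievalStrategy(Enum):
    """Local mirror of the documented enum, so this sketch runs standalone."""
    BASIC = "basic"
    FUSION = "fusion"
    RECURSIVE = "recursive"
    AUTO_MERGE = "auto_merge"


def parse_strategy(value: str) -> RetrievalStrategy:
    """Hypothetical helper: map a config string to a strategy member."""
    try:
        # Enum lookup by value: RetrievalStrategy("fusion") -> RetrievalStrategy.FUSION
        return RetrievalStrategy(value.strip().lower())
    except ValueError:
        valid = ", ".join(member.value for member in RetrievalStrategy)
        raise ValueError(
            f"Unknown retrieval strategy {value!r}; expected one of: {valid}"
        ) from None


print(parse_strategy("fusion"))
```

Validating the string early like this surfaces typos at configuration time rather than at query time.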
Classes
RetrievalResult
Dataclass for retrieval results.
from dataclasses import dataclass, field
from typing import Any, Dict, Optional

@dataclass
class RetrievalResult:
    text: str
    score: float
    metadata: Dict[str, Any] = field(default_factory=dict)
    doc_id: Optional[str] = None
RetrieverProtocol
Protocol for retriever implementations.
class RetrieverProtocol(Protocol):
    name: str
    strategy: RetrievalStrategy

    def retrieve(
        self,
        query: str,
        top_k: int = 10,
        **kwargs
    ) -> List[RetrievalResult]:
        """Retrieve relevant documents."""
        ...
Utility Functions
reciprocal_rank_fusion
Combine results from multiple retrievers using RRF.
from praisonaiagents.knowledge.retrieval import reciprocal_rank_fusion
# Results from multiple queries/retrievers
results_a = [{"id": "1", "score": 0.9}, {"id": "2", "score": 0.8}]
results_b = [{"id": "2", "score": 0.95}, {"id": "3", "score": 0.7}]
# Fuse with RRF (k=60 is standard)
fused = reciprocal_rank_fusion([results_a, results_b], k=60)
# Returns: [{"id": "2", "rrf_score": ...}, {"id": "1", ...}, {"id": "3", ...}]
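To make the scoring concrete, here is a minimal standalone sketch of the standard RRF formula, where each item scores the sum of `1 / (k + rank)` over every list it appears in. This is an illustration of the general technique, not the library's actual implementation; `rrf_sketch` is a hypothetical name, and the assumption that items are keyed by `"id"` is taken from the example above.

```python
from collections import defaultdict
from typing import Any, Dict, List


def rrf_sketch(results_list: List[List[Dict[str, Any]]], k: int = 60) -> List[Dict[str, Any]]:
    """Fuse ranked lists with Reciprocal Rank Fusion (illustrative sketch)."""
    scores: Dict[str, float] = defaultdict(float)
    for results in results_list:
        # Ranks are 1-based; the input "score" values are ignored,
        # only each item's position in its list matters.
        for rank, item in enumerate(results, start=1):
            scores[item["id"]] += 1.0 / (k + rank)
    # The sort dominates the cost, giving O(n log n) over n total results.
    ranked = sorted(scores.items(), key=lambda pair: pair[1], reverse=True)
    return [{"id": doc_id, "rrf_score": score} for doc_id, score in ranked]


results_a = [{"id": "1", "score": 0.9}, {"id": "2", "score": 0.8}]
results_b = [{"id": "2", "score": 0.95}, {"id": "3", "score": 0.7}]
fused = rrf_sketch([results_a, results_b], k=60)
print([item["id"] for item in fused])  # ['2', '1', '3']
```

Document "2" ranks first because it appears in both lists (1/62 + 1/61), while "1" and "3" each appear once (1/61 and 1/62 respectively).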
merge_adjacent_chunks
Merge consecutive chunks from the same document.
from praisonaiagents.knowledge.retrieval import merge_adjacent_chunks
chunks = [
    {"text": "Part 1", "doc_id": "doc1", "chunk_idx": 0},
    {"text": "Part 2", "doc_id": "doc1", "chunk_idx": 1},
    {"text": "Other", "doc_id": "doc2", "chunk_idx": 0},
]
merged = merge_adjacent_chunks(chunks)
# Merges adjacent chunks from same document
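The single-pass idea behind chunk merging can be sketched as follows. This is an assumption-laden illustration, not the library's code: `merge_adjacent_sketch`, the `separator` parameter, and the `end_idx` bookkeeping field are all hypothetical, and the input is assumed to be already ordered by document and chunk index.

```python
from typing import Any, Dict, List


def merge_adjacent_sketch(chunks: List[Dict[str, Any]], separator: str = " ") -> List[Dict[str, Any]]:
    """Merge consecutive chunks of the same document in one pass (sketch)."""
    merged: List[Dict[str, Any]] = []
    for chunk in chunks:
        last = merged[-1] if merged else None
        if (
            last is not None
            and last["doc_id"] == chunk["doc_id"]
            and last["end_idx"] + 1 == chunk["chunk_idx"]
        ):
            # Same document and the next index in sequence: extend the previous entry.
            last["text"] += separator + chunk["text"]
            last["end_idx"] = chunk["chunk_idx"]
        else:
            # Different document or a gap in indices: start a new entry.
            merged.append({
                "text": chunk["text"],
                "doc_id": chunk["doc_id"],
                "chunk_idx": chunk["chunk_idx"],
                "end_idx": chunk["chunk_idx"],  # hypothetical field tracking the last merged index
            })
    return merged


chunks = [
    {"text": "Part 1", "doc_id": "doc1", "chunk_idx": 0},
    {"text": "Part 2", "doc_id": "doc1", "chunk_idx": 1},
    {"text": "Other", "doc_id": "doc2", "chunk_idx": 0},
]
result = merge_adjacent_sketch(chunks)
print([entry["text"] for entry in result])  # ['Part 1 Part 2', 'Other']
```

Each chunk is compared only with the most recent merged entry, which is what keeps the pass linear in the number of chunks.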
Using with Knowledge
from praisonaiagents import Agent, Knowledge
# Configure retrieval strategy
agent = Agent(
    instructions="You are a helpful assistant",
    knowledge={
        "sources": ["docs/"],
        "retrieval_strategy": "fusion",  # Use fusion retrieval
        "top_k": 10,
    },
)
response = agent.chat("What is the architecture?")
Creating Custom Retrievers
from typing import List

from praisonaiagents.knowledge.retrieval import (
    RetrievalStrategy,
    RetrievalResult,
    get_retriever_registry,
)

class MyRetriever:
    name = "my_retriever"
    strategy = RetrievalStrategy.BASIC

    def __init__(self, vector_store, **config):
        self.store = vector_store

    def retrieve(
        self,
        query: str,
        top_k: int = 10,
        **kwargs
    ) -> List[RetrievalResult]:
        # Custom retrieval logic
        ...
# Register
registry = get_retriever_registry()
registry.register("my_retriever", MyRetriever)
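For a sense of what the `retrieve` body might contain, here is a toy retriever that runs without the library. Everything in it is an illustrative assumption: `KeywordRetriever` and its term-overlap scoring are hypothetical, `RetrievalResult` is a local stand-in mirroring the dataclass documented above, and the `strategy` attribute and registry call are left out so the sketch stays standalone.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class RetrievalResult:
    """Local stand-in for the documented dataclass."""
    text: str
    score: float
    metadata: Dict[str, Any] = field(default_factory=dict)
    doc_id: Optional[str] = None


class KeywordRetriever:
    """Hypothetical retriever: scores documents by query-term overlap."""

    name = "keyword_retriever"

    def __init__(self, documents: Dict[str, str]):
        self.documents = documents  # maps doc_id -> text

    def retrieve(self, query: str, top_k: int = 10, **kwargs) -> List[RetrievalResult]:
        terms = set(query.lower().split())
        results: List[RetrievalResult] = []
        for doc_id, text in self.documents.items():
            overlap = len(terms & set(text.lower().split()))
            if overlap:
                # Score is the fraction of query terms found in the document.
                results.append(
                    RetrievalResult(text=text, score=overlap / len(terms), doc_id=doc_id)
                )
        results.sort(key=lambda r: r.score, reverse=True)
        return results[:top_k]


retriever = KeywordRetriever({
    "a": "the system architecture uses a vector store",
    "b": "installation steps for the package",
})
hits = retriever.retrieve("system architecture", top_k=1)
print(hits[0].doc_id, hits[0].score)  # a 1.0
```

A real retriever would typically query the vector store passed to `__init__` instead of scanning documents, but the shape of the return value is the same.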
- All utility functions are pure Python (no external dependencies)
- RRF fusion is O(n log n), where n is the total number of results
- Chunk merging is O(n) in a single pass