RAG Module

The RAG module provides a thin orchestration layer over Knowledge for retrieval-augmented generation.

Core Classes

RAG

The main pipeline class that orchestrates retrieval and generation.

from praisonaiagents.rag import RAG, RAGConfig

rag = RAG(
    knowledge=knowledge,           # Required: Knowledge instance
    llm=None,                      # Optional: LLM instance
    config=RAGConfig(),            # Optional: Configuration
    reranker=None,                 # Optional: Reranker instance
    context_builder=None,          # Optional: Custom context builder
    citation_formatter=None,       # Optional: Custom citation formatter
)

Methods

query(question, **kwargs) -> RAGResult

Execute a RAG query and return a result with citations.

result = rag.query("What is the main finding?")
print(result.answer)
print(result.citations)

aquery(question, **kwargs) -> RAGResult

Async version of query.

result = await rag.aquery("What is the conclusion?")

stream(question, **kwargs) -> Iterator[str]

Stream response tokens.

for chunk in rag.stream("Summarize the document"):
    print(chunk, end="", flush=True)

astream(question, **kwargs) -> AsyncIterator[str]

Async streaming.

async for chunk in rag.astream("Explain the methodology"):
    print(chunk, end="", flush=True)

get_citations(question, **kwargs) -> List[Citation]

Get citations without generating an answer.

citations = rag.get_citations("What sources mention X?")
for c in citations:
    print(f"[{c.id}] {c.source}: {c.text[:100]}")

RAGConfig

Configuration for the RAG pipeline.

from praisonaiagents.rag import RAGConfig, RetrievalStrategy

config = RAGConfig(
    top_k=5,                                    # Chunks to retrieve
    min_score=0.0,                              # Minimum relevance score
    max_context_tokens=4000,                    # Context token limit
    include_citations=True,                     # Include citations
    retrieval_strategy=RetrievalStrategy.BASIC, # Retrieval strategy
    rerank=False,                               # Enable reranking
    rerank_top_k=3,                             # Results after rerank
    template="...",                             # Prompt template
    system_prompt=None,                         # System prompt
    stream=False,                               # Stream by default
)

Retrieval Strategies

from praisonaiagents.rag import RetrievalStrategy

RetrievalStrategy.BASIC   # Simple vector search
RetrievalStrategy.FUSION  # Reciprocal rank fusion
RetrievalStrategy.HYBRID  # Dense + sparse retrieval
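For intuition, RetrievalStrategy.FUSION refers to reciprocal rank fusion (RRF), which merges several ranked result lists by scoring each document as the sum of 1/(k + rank) across lists, with k=60 the conventional constant. The following is a self-contained sketch of the general technique, not the library's implementation:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists of doc ids; score(d) = sum of 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

dense = ["d1", "d2", "d3"]   # e.g. a vector-search ranking
sparse = ["d2", "d4", "d1"]  # e.g. a keyword-search ranking
print(reciprocal_rank_fusion([dense, sparse]))  # ['d2', 'd1', 'd4', 'd3']
```

Documents that rank near the top of several lists dominate the fused ranking, which is why fusion helps when no single retriever is reliable on its own.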

RAGResult

Result from a RAG query.

@dataclass
class RAGResult:
    answer: str                    # Generated answer
    citations: List[Citation]      # Source citations
    context_used: str              # Context passed to LLM
    query: str                     # Original query
    metadata: Dict[str, Any]       # Timing, stats, etc.

Properties

  • has_citations - Boolean indicating if citations exist
  • format_answer_with_citations() - Format answer with source references
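The exact output of format_answer_with_citations() is not specified above. As a rough, hypothetical illustration of the usual pattern (answer text followed by numbered source lines), a standalone helper might look like this; the function name and layout here are assumptions, not the library's actual format:

```python
# Hypothetical stand-in for format_answer_with_citations();
# the real method's layout may differ.
def format_with_citations(answer: str, citations) -> str:
    """citations: (id, source, text) tuples, mirroring Citation fields."""
    lines = [answer, "", "Sources:"]
    for cid, source, text in citations:
        lines.append(f"[{cid}] {source}: {text[:80]}")  # truncate long snippets
    return "\n".join(lines)

print(format_with_citations(
    "The study reports finding X.",
    [("1", "paper.pdf", "Our main result shows X under condition Y.")],
))
```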

Citation

Source citation for RAG answers.

@dataclass
class Citation:
    id: str                        # Citation ID (e.g., "1")
    source: str                    # Source document
    text: str                      # Text snippet
    score: float                   # Relevance score
    doc_id: Optional[str]          # Document identifier
    chunk_id: Optional[str]        # Chunk identifier
    offset: Optional[int]          # Character offset
    metadata: Dict[str, Any]       # Additional metadata

Protocols

The RAG module uses protocols for extensibility.

ContextBuilderProtocol

Custom context assembly logic. The body below is a minimal example implementation; any object with a matching build method satisfies the protocol.

from typing import Any, Dict, List

from praisonaiagents.rag.protocols import ContextBuilderProtocol

class MyContextBuilder:
    def build(
        self,
        results: List[Dict[str, Any]],
        max_tokens: int = 4000,
        deduplicate: bool = True,
    ) -> str:
        # Join unique chunk texts into one context string
        # (budgeting against max_tokens omitted for brevity)
        seen, parts = set(), []
        for r in results:
            text = r.get("text", "")
            if deduplicate and text in seen:
                continue
            seen.add(text)
            parts.append(text)
        return "\n\n".join(parts)

CitationFormatterProtocol

Custom citation formatting. The body below is a minimal example implementation using only the documented Citation fields.

from typing import Any, Dict, List

from praisonaiagents.rag import Citation  # assuming Citation is exported here
from praisonaiagents.rag.protocols import CitationFormatterProtocol

class MyCitationFormatter:
    def format(
        self,
        results: List[Dict[str, Any]],
        start_id: int = 1,
    ) -> List[Citation]:
        # Assign sequential ids and map each retrieval result to a Citation
        return [
            Citation(
                id=str(start_id + i),
                source=r.get("source", ""),
                text=r.get("text", ""),
                score=r.get("score", 0.0),
            )
            for i, r in enumerate(results)
        ]

Context Utilities

Helper functions for context building.

from praisonaiagents.rag import build_context, truncate_context, deduplicate_chunks

# Build context from results
context, used_results = build_context(
    results=search_results,
    max_tokens=4000,
    deduplicate=True,
)

# Truncate long context
truncated = truncate_context(text, max_tokens=2000)

# Remove duplicate chunks
unique = deduplicate_chunks(results)
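To get a sense of what token-budgeted truncation involves, here is a standalone sketch using the rough four-characters-per-token heuristic; how truncate_context actually counts tokens is not specified here:

```python
# Standalone truncation sketch (not the library's truncate_context):
# budget by a crude 4-chars-per-token estimate and cut on a word boundary.
def truncate_to_tokens(text: str, max_tokens: int) -> str:
    max_chars = max_tokens * 4  # rough chars-per-token approximation
    if len(text) <= max_chars:
        return text
    cut = text.rfind(" ", 0, max_chars)  # avoid splitting a word
    return text[: cut if cut > 0 else max_chars]

long_text = "chunk " * 1000
short = truncate_to_tokens(long_text, max_tokens=100)
print(len(short) <= 400)  # True
```

A production implementation would use the model's real tokenizer; the heuristic only illustrates the budgeting step.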

Integration with Knowledge

RAG uses Knowledge for all retrieval operations:

from praisonaiagents import Knowledge
from praisonaiagents.rag import RAG

# Knowledge handles indexing
knowledge = Knowledge()
knowledge.add("documents/")

# RAG handles answering
rag = RAG(knowledge=knowledge)
result = rag.query("What is discussed?")

Error Handling

from praisonaiagents.rag import RAG

rag = RAG(knowledge=knowledge)

try:
    result = rag.query("Question")
except Exception as e:
    print(f"RAG error: {e}")

Performance Tips

  1. Batch indexing: Add multiple documents at once
  2. Tune top_k: Start with 5, adjust based on quality
  3. Use min_score: Filter low-relevance results
  4. Enable reranking: For higher precision (costs latency)
  5. Stream responses: Better UX for long answers
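Taken together, the tips above might translate into a starting configuration like this; the numbers (especially min_score) are illustrative assumptions to tune for your corpus and latency budget:

```python
from praisonaiagents.rag import RAG, RAGConfig

config = RAGConfig(
    top_k=5,         # tip 2: start at 5, adjust on answer quality
    min_score=0.2,   # tip 3: assumed threshold to drop low-relevance chunks
    rerank=True,     # tip 4: trade some latency for precision
    rerank_top_k=3,
    stream=True,     # tip 5: stream long answers by default
)
rag = RAG(knowledge=knowledge, config=config)  # knowledge: an indexed Knowledge instance
```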