Late chunking embeds the entire document first, then splits it into chunks. Each chunk's embedding therefore retains context from the rest of the document, not just from the chunk's own text.

Quick Start

from praisonaiagents import Agent

agent = Agent(
    instructions="Answer questions with high precision.",
    knowledge={
        "sources": ["technical_docs/"],
        "chunker": {
            "type": "late",
            "chunk_size": 512,
            "embedding_model": "all-MiniLM-L6-v2"
        }
    }
)

response = agent.start("Explain the architecture")

When to Use

  • High-precision retrieval needed
  • Quality matters more than speed
  • Complex technical documents
  • Semantic similarity search critical

Parameters

| Parameter       | Type | Default | Description          |
|-----------------|------|---------|----------------------|
| chunk_size      | int  | 512     | Max tokens per chunk |
| embedding_model | str  | auto    | Embedding model      |
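As an illustration of how these two parameters map onto the "chunker" dictionary from the Quick Start, here is a minimal sketch. The helper `late_chunker_config` is hypothetical and not part of praisonaiagents; it only builds a plain dictionary using the keys shown above, and it assumes that leaving out "embedding_model" lets the library fall back to its automatic default.

```python
from typing import Any, Dict, Optional


def late_chunker_config(
    chunk_size: int = 512,
    embedding_model: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a late-chunker config dict (hypothetical helper, not a library API)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be a positive number of tokens")
    config: Dict[str, Any] = {"type": "late", "chunk_size": chunk_size}
    # Assumption: omitting the key leaves the embedding model on its default.
    if embedding_model is not None:
        config["embedding_model"] = embedding_model
    return config


default_config = late_chunker_config()
custom_config = late_chunker_config(chunk_size=256, embedding_model="all-MiniLM-L6-v2")
print(default_config)
print(custom_config)
```

The resulting dictionary could then be passed as the "chunker" value inside `knowledge`, in the same position as in the Quick Start.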

How It Works

  • Traditional chunking: Split → Embed each chunk
  • Late chunking: Embed full doc → Split with context awareness

This preserves document-level context in each chunk’s embedding.
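The late-chunking order described above can be sketched in a few lines. This is a simplified illustration, not the library's implementation. It assumes the embedding model has already produced one contextual vector per token for the whole document, and that each chunk embedding is the mean of the token vectors in its span. The function names and the toy vectors are made up for the example.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def chunk_spans(num_tokens: int, chunk_size: int) -> List[Tuple[int, int]]:
    """Return [start, end) token spans of at most chunk_size tokens each."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (start, min(start + chunk_size, num_tokens))
        for start in range(0, num_tokens, chunk_size)
    ]


def mean_pool(vectors: Sequence[Vector]) -> Vector:
    """Average a non-empty sequence of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def late_chunk(token_embeddings: Sequence[Vector], chunk_size: int) -> List[Vector]:
    """Pool document-level token embeddings into one vector per chunk.

    The token embeddings are assumed to come from a single pass over the
    full document, so each vector already reflects surrounding context.
    """
    spans = chunk_spans(len(token_embeddings), chunk_size)
    return [mean_pool(token_embeddings[start:end]) for start, end in spans]


# Hypothetical token embeddings for a 5-token document (2 dimensions each).
token_embeddings = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0], [5.0, 5.0]]
chunks = late_chunk(token_embeddings, chunk_size=2)
print(chunks)
```

In traditional chunking the split would happen before the model runs, so each chunk's vectors would be computed without seeing the other chunks. Here the split only decides which already-contextualized token vectors are pooled together.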