PraisonAI integrates chonkie for high-performance document chunking.

Quick Start

from praisonaiagents import Agent

# Default chunking (token-based)
agent = Agent(
    instructions="Answer questions from documents.",
    knowledge=["research.pdf", "docs/"]
)

response = agent.start("What are the key findings?")

Available Strategies

| Strategy | Best For | Speed |
| --- | --- | --- |
| token | Fixed-size chunks | ⚡ Fastest |
| sentence | Natural boundaries | ⚡ Fast |
| recursive | Structured docs (markdown) | ⚡ Fast |
| semantic | Topic segmentation | 🔄 Medium |
| sdpm | Research papers | 🔄 Medium |
| late | Best embeddings | 🔄 Medium |

Token Chunking

Fixed-size token chunks. Fast and predictable.
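
To make the idea concrete, here is a minimal sketch of fixed-size chunking with overlap. It is not chonkie's implementation: the function name `token_chunks` is hypothetical, and whitespace-separated words stand in for real tokenizer output.

```python
def token_chunks(tokens, chunk_size=512, chunk_overlap=128):
    """Split a token list into fixed-size windows that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reached the end of the input
    return chunks


# Assumption: words stand in for tokens; a real tokenizer would produce IDs.
words = "the quick brown fox jumps over the lazy dog again".split()
chunks = token_chunks(words, chunk_size=4, chunk_overlap=1)
for chunk in chunks:
    print(chunk)
```

Each window shares its first token with the end of the previous one, which is the role `chunk_overlap` plays in the configuration.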

Sentence Chunking

Split at sentence boundaries. Natural flow.
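
A rough sketch of the same idea at sentence granularity: sentences are packed into a chunk until the size budget would be exceeded. The helper `sentence_chunks` is hypothetical, the regex is a naive sentence splitter, and word counts stand in for token counts.

```python
import re


def sentence_chunks(text, chunk_size=512):
    """Pack whole sentences into chunks of at most chunk_size words."""
    # Naive boundary detection: split after ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

    chunks, current, count = [], [], 0
    for sentence in sentences:
        n = len(sentence.split())
        if current and count + n > chunk_size:
            chunks.append(" ".join(current))  # close the full chunk
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks


text = "Chunking splits documents. Each chunk is embedded. Retrieval uses the embeddings."
chunks = sentence_chunks(text, chunk_size=8)
for chunk in chunks:
    print(chunk)
```

No sentence is cut in half, so a single sentence longer than `chunk_size` becomes its own oversized chunk in this sketch.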

Recursive Chunking

Hierarchical splitting. Great for markdown.
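
The hierarchy can be pictured as a list of separators tried from coarsest to finest. The sketch below is a simplified illustration, not chonkie's algorithm: `recursive_chunks` and its separator list are assumptions, word counts stand in for tokens, separators are dropped from the output, and small pieces are not merged back together.

```python
def recursive_chunks(text, chunk_size=512, separators=("\n## ", "\n\n", ". ")):
    """Split on the coarsest separator first; recurse into oversized pieces."""
    stripped = text.strip()
    if len(stripped.split()) <= chunk_size:
        return [stripped] if stripped else []

    if not separators:
        # No finer separator left: fall back to fixed-size word windows.
        words = stripped.split()
        return [" ".join(words[i:i + chunk_size])
                for i in range(0, len(words), chunk_size)]

    head, rest = separators[0], separators[1:]
    chunks = []
    for piece in text.split(head):
        chunks.extend(recursive_chunks(piece, chunk_size, rest))
    return chunks


doc = (
    "# Guide\n\nIntro paragraph here."
    "\n## Setup\n\nInstall the package. Then import it."
    "\n## Usage\n\nCall the agent."
)
chunks = recursive_chunks(doc, chunk_size=6)
for chunk in chunks:
    print(repr(chunk))
```

Sections that already fit the budget stay whole; only the oversized "Setup" section is split further at its paragraph break.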

Semantic Chunking

Similarity-based splits. Topic coherence.
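
As a toy illustration of similarity-based splitting, the sketch below starts a new chunk whenever adjacent sentences stop resembling each other. Everything here is an assumption for demonstration: `semantic_chunks` is a hypothetical helper, and bag-of-words cosine similarity stands in for the embedding model a real semantic chunker would use.

```python
import math
import re
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def semantic_chunks(sentences, threshold=0.2):
    """Group consecutive sentences; split where similarity drops below threshold."""
    chunks = []
    previous = None
    for sentence in sentences:
        vector = Counter(re.findall(r"\w+", sentence.lower()))
        if previous is not None and cosine(previous, vector) >= threshold:
            chunks[-1].append(sentence)  # same topic: extend the current chunk
        else:
            chunks.append([sentence])    # topic shift: start a new chunk
        previous = vector
    return chunks


sentences = [
    "Cats chase mice at night.",
    "Mice hide from cats in walls.",
    "Stock prices rose on Monday.",
    "Analysts expect stock prices to fall.",
]
chunks = semantic_chunks(sentences)
for chunk in chunks:
    print(chunk)
```

The two animal sentences share vocabulary and stay together, while the shift to finance opens a second chunk.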

Chunker Configuration

All Parameters

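| Parameter | Type | Default | Applies To |
| --- | --- | --- | --- |
| type | str | "token" | All |
| chunk_size | int | 512 | All |
| chunk_overlap | int | 128 | token, sentence |
| tokenizer_or_token_counter | str | "gpt2" | token, sentence, recursive |
| embedding_model | str | auto | semantic, sdpm, late |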
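
The "Applies To" column can be turned into a quick sanity check before a config is passed to an agent. This is a hypothetical helper, not part of PraisonAI or chonkie; it only encodes the table above and makes no claim about how the library itself treats mismatched parameters.

```python
STRATEGIES = {"token", "sentence", "recursive", "semantic", "sdpm", "late"}

# Mirrors the "Applies To" column of the parameter table.
APPLIES_TO = {
    "chunk_size": STRATEGIES,
    "chunk_overlap": {"token", "sentence"},
    "tokenizer_or_token_counter": {"token", "sentence", "recursive"},
    "embedding_model": {"semantic", "sdpm", "late"},
}


def check_chunker_config(config):
    """Return a list of problems found in a chunker config dict."""
    strategy = config.get("type", "token")  # "token" is the documented default
    if strategy not in STRATEGIES:
        return [f"unknown strategy: {strategy}"]

    problems = []
    for key in config:
        if key == "type":
            continue
        if key not in APPLIES_TO:
            problems.append(f"unknown parameter: {key}")
        elif strategy not in APPLIES_TO[key]:
            problems.append(f"{key} does not apply to {strategy}")
    return problems


print(check_chunker_config({"type": "token", "chunk_size": 256, "chunk_overlap": 50}))
print(check_chunker_config({"type": "semantic", "chunk_overlap": 50}))
```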

Strategy Examples

agent = Agent(
    instructions="Process documents.",
    knowledge={
        "sources": ["docs/"],
        "chunker": {
            "type": "token",
            "chunk_size": 256,
            "chunk_overlap": 50
        }
    }
)
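
Other strategies would presumably follow the same shape, changing only the `chunker` dict. The variations below are a sketch built from the parameter table; the key names come from that table, but the specific values and combinations are assumptions rather than tested configurations.

```python
# Sentence strategy: chunk_overlap applies, per the parameter table.
sentence_knowledge = {
    "sources": ["docs/"],
    "chunker": {
        "type": "sentence",
        "chunk_size": 512,
        "chunk_overlap": 128,
    },
}

# Semantic strategy: chunk_overlap is omitted because the table lists it
# only for token and sentence; embedding_model is left at its default.
semantic_knowledge = {
    "sources": ["docs/"],
    "chunker": {
        "type": "semantic",
        "chunk_size": 512,
    },
}

# Either dict would be passed as Agent(knowledge=...), as in the token example.
```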

Choosing a Strategy

Installation

pip install "praisonaiagents[knowledge]"
This installs the chonkie library automatically.
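
To confirm the dependency is present, a quick check like the following can help. It only tests that a module named `chonkie` is importable in the current environment and says nothing about its version.

```python
import importlib.util

# find_spec returns None when the top-level module cannot be located.
chonkie_available = importlib.util.find_spec("chonkie") is not None
print("chonkie installed:", chonkie_available)
```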

Knowledge Base

Configure knowledge sources and retrieval

RAG Agents

Build retrieval-augmented agents