The sentence chunker splits text at sentence boundaries while keeping each chunk within a token limit. Because no sentence is cut in half, chunks preserve natural reading flow.

Quick Start

from praisonaiagents import Agent

agent = Agent(
    instructions="Answer questions from documents.",
    knowledge={
        "sources": ["article.pdf"],
        "chunker": {
            "type": "sentence",
            "chunk_size": 512,
            "chunk_overlap": 64
        }
    }
)

response = agent.start("What are the main arguments?")

When to Use

Good For

  • Articles and blog posts
  • Natural language content
  • Content where readability matters
  • Question-answering tasks

Consider Alternatives

  • Code or technical docs
  • Very long sentences
  • Structured data
  • Markdown with headers

Parameters

Parameter                   Type  Default  Description
chunk_size                  int   512      Max tokens per chunk
chunk_overlap               int   128      Token overlap between chunks
tokenizer_or_token_counter  str   "gpt2"   Tokenizer for counting
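
As a rough illustration of how chunk_size and chunk_overlap interact, the sketch below estimates how many chunks a document of a given token length might produce. It assumes a plain sliding window in which each new chunk advances by chunk_size - chunk_overlap tokens. Sentence-aligned chunks usually end a little short of the limit, so real counts may be somewhat higher. The function estimate_chunk_count is a hypothetical helper and is not part of praisonaiagents.

```python
import math


def estimate_chunk_count(total_tokens, chunk_size=512, chunk_overlap=128):
    """Estimate chunk count for a sliding window (hypothetical helper)."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    if total_tokens <= 0:
        return 0
    if total_tokens <= chunk_size:
        return 1
    # Each chunk after the first contributes this many new tokens.
    stride = chunk_size - chunk_overlap
    return 1 + math.ceil((total_tokens - chunk_size) / stride)


print(estimate_chunk_count(10_000))           # 26 with the table defaults
print(estimate_chunk_count(10_000, 512, 64))  # 23 with a smaller overlap
```

Under this assumption, a larger overlap repeats more context across chunk boundaries at the cost of more chunks to embed and store.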

Examples

News Articles

from praisonaiagents import Agent

agent = Agent(
    instructions="Summarize news articles.",
    knowledge={
        "sources": ["news/"],
        "chunker": {
            "type": "sentence",
            "chunk_size": 256  # Short chunks for news
        }
    }
)

Long-form Content

from praisonaiagents import Agent

agent = Agent(
    instructions="Analyze essays and papers.",
    knowledge={
        "sources": ["essays/"],
        "chunker": {
            "type": "sentence",
            "chunk_size": 1024,
            "chunk_overlap": 128
        }
    }
)

How It Works

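Sentences are added to the current chunk until the next sentence would push it past chunk_size tokens. At that point the chunk is closed and a new one starts, carrying over up to chunk_overlap tokens from the end of the previous chunk.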
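
The grouping idea can be sketched in a few lines of plain Python. This is a minimal illustration, not the library's implementation. It makes three assumptions: whitespace-separated words stand in for real tokens (the default tokenizer listed above is "gpt2"), sentences are detected with a simple regular expression, and overlap is carried in whole sentences. The names count_tokens and sentence_chunks are hypothetical.

```python
import re


def count_tokens(text):
    # Assumption: one whitespace-separated word counts as one token.
    return len(text.split())


def sentence_chunks(text, chunk_size=512, chunk_overlap=0):
    """Greedily group sentences into chunks of at most chunk_size tokens."""
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks, current, current_tokens = [], [], 0
    for sentence in sentences:
        n = count_tokens(sentence)
        if current and current_tokens + n > chunk_size:
            chunks.append(" ".join(current))
            # Carry trailing sentences forward, up to chunk_overlap tokens.
            overlap, overlap_tokens = [], 0
            for prev in reversed(current):
                t = count_tokens(prev)
                if overlap_tokens + t > chunk_overlap:
                    break
                overlap.insert(0, prev)
                overlap_tokens += t
            # Drop overlap that would push the new chunk past the limit.
            while overlap and overlap_tokens + n > chunk_size:
                overlap_tokens -= count_tokens(overlap.pop(0))
            current, current_tokens = overlap, overlap_tokens
        # A single sentence longer than chunk_size becomes its own chunk.
        current.append(sentence)
        current_tokens += n
    if current:
        chunks.append(" ".join(current))
    return chunks


text = "One two three. Four five six. Seven eight nine. Ten eleven twelve."
for chunk in sentence_chunks(text, chunk_size=6, chunk_overlap=3):
    print(chunk)
```

With chunk_size=6 and chunk_overlap=3, each chunk in this sketch holds two three-word sentences, and every chunk after the first begins with the last sentence of the previous one.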