The sentence chunker splits text at sentence boundaries while respecting token limits, so each chunk preserves natural reading flow.
Quick Start
from praisonaiagents import Agent

agent = Agent(
    instructions="Answer questions from documents.",
    knowledge={
        "sources": ["article.pdf"],
        "chunker": {
            "type": "sentence",
            "chunk_size": 512,
            "chunk_overlap": 64
        }
    }
)

response = agent.start("What are the main arguments?")
When to Use
Good For
- Articles and blog posts
- Natural language content
- Content where readability matters
- Question-answering tasks
Consider Alternatives
- Code or technical docs
- Very long sentences
- Structured data
- Markdown with headers
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| chunk_size | int | 512 | Max tokens per chunk |
| chunk_overlap | int | 128 | Token overlap between chunks |
| tokenizer_or_token_counter | str | "gpt2" | Tokenizer for counting |
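To make chunk_overlap more concrete, here is a minimal sketch of one plausible interpretation: the trailing sentences of a finished chunk, up to the overlap budget, are repeated at the start of the next chunk. The helper `overlap_tail` is hypothetical and not part of praisonaiagents, and it counts whitespace-separated words rather than real tokenizer tokens.

```python
def overlap_tail(sentences: list[str], chunk_overlap: int) -> list[str]:
    """Return the trailing sentences whose combined size fits in chunk_overlap.

    Assumption: words approximate tokens; a real tokenizer would count differently.
    """
    tail: list[str] = []
    total = 0
    # Walk backwards from the end of the previous chunk.
    for sentence in reversed(sentences):
        n = len(sentence.split())
        if total + n > chunk_overlap:
            break
        tail.insert(0, sentence)  # keep original sentence order
        total += n
    return tail


previous_chunk = ["Alpha beta gamma.", "Delta epsilon.", "Zeta."]
carried = overlap_tail(previous_chunk, chunk_overlap=3)
print(carried)
```

With a budget of 3, the last two sentences fit and would be carried into the next chunk; the first sentence would not. The library's actual overlap behavior may differ from this sketch.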
Examples
News Articles
from praisonaiagents import Agent

agent = Agent(
    instructions="Summarize news articles.",
    knowledge={
        "sources": ["news/"],
        "chunker": {
            "type": "sentence",
            "chunk_size": 256  # Short chunks for news
        }
    }
)
Long-form Content
from praisonaiagents import Agent

agent = Agent(
    instructions="Analyze essays and papers.",
    knowledge={
        "sources": ["essays/"],
        "chunker": {
            "type": "sentence",
            "chunk_size": 1024,
            "chunk_overlap": 128
        }
    }
)
How It Works
Sentences are grouped together until the token limit is reached, then a new chunk starts.
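The grouping described above can be sketched in a few lines. This is an illustration under stated assumptions, not the library's implementation: `split_sentences`, `count_tokens`, and `sentence_chunks` are hypothetical helpers, sentences are detected with a naive regex, and whitespace-separated words stand in for tokenizer tokens.

```python
import re


def count_tokens(text: str) -> int:
    # Assumption: words approximate tokens; a real tokenizer would differ.
    return len(text.split())


def split_sentences(text: str) -> list[str]:
    # Naive splitter: break after ., ! or ? when followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def sentence_chunks(text: str, chunk_size: int) -> list[str]:
    chunks: list[str] = []
    current: list[str] = []
    current_tokens = 0
    for sentence in split_sentences(text):
        n = count_tokens(sentence)
        # Start a new chunk when adding this sentence would exceed the limit.
        if current and current_tokens + n > chunk_size:
            chunks.append(" ".join(current))
            current, current_tokens = [], 0
        current.append(sentence)
        current_tokens += n
    if current:
        chunks.append(" ".join(current))
    return chunks


text = "One two three. Four five. Six seven eight nine. Ten."
chunks = sentence_chunks(text, chunk_size=5)
for chunk in chunks:
    print(chunk)
```

Note that in this sketch a single sentence longer than chunk_size still becomes its own oversized chunk, which is one reason very long sentences are listed under Consider Alternatives.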