The simplest and fastest chunking strategy: it splits text into fixed-size token chunks, with optional overlap between consecutive chunks.

Quick Start

from praisonaiagents import Agent

agent = Agent(
    instructions="Answer questions from documents.",
    knowledge={
        "sources": ["document.pdf"],
        "chunker": {
            "type": "token",
            "chunk_size": 256,
            "chunk_overlap": 50
        }
    }
)

response = agent.start("Summarize the main points")

When to Use

Good For

  • Speed-critical applications
  • Workloads that need uniform chunk sizes
  • Simple documents without structure
  • High-volume processing

Consider Alternatives

  • Documents with natural sections
  • Topic-dependent content
  • Content that needs semantic coherence
  • Complex nested structures

Parameters

Parameter        Type   Default   Description
chunk_size       int    512       Number of tokens per chunk
chunk_overlap    int    128       Token overlap between chunks
tokenizer        str    "gpt2"    Tokenizer for counting tokens
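
All three settings go in the chunker config. As a sketch, a config that makes the defaults explicit, assuming the tokenizer name is passed straight through as the table above suggests:

from praisonaiagents import Agent

agent = Agent(
    instructions="Answer questions from documents.",
    knowledge={
        "sources": ["document.pdf"],
        "chunker": {
            "type": "token",
            "chunk_size": 512,     # default: tokens per chunk
            "chunk_overlap": 128,  # default: tokens shared with the previous chunk
            "tokenizer": "gpt2"    # default; assumed to accept other tokenizer names
        }
    }
)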

Examples

Small Chunks for Precision

agent = Agent(
    instructions="Find specific details.",
    knowledge={
        "sources": ["technical_spec.pdf"],
        "chunker": {
            "type": "token",
            "chunk_size": 128,   # Small for precision
            "chunk_overlap": 32
        }
    }
)

Large Chunks for Context

agent = Agent(
    instructions="Understand overall themes.",
    knowledge={
        "sources": ["novel.txt"],
        "chunker": {
            "type": "token",
            "chunk_size": 1024,  # Large for context
            "chunk_overlap": 256
        }
    }
)

How It Works

Token chunking is deterministic: the same document always produces the same chunks. The text is first encoded with the configured tokenizer, then sliced into windows of chunk_size tokens. Each window starts chunk_size - chunk_overlap tokens after the previous one, so consecutive chunks share exactly chunk_overlap tokens; chunk_overlap must therefore be smaller than chunk_size.
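
A minimal sketch of this windowing, using tiktoken's "gpt2" encoding as a stand-in for the configured tokenizer (the library's actual implementation may differ):

import tiktoken

def token_chunks(text: str, chunk_size: int = 512, chunk_overlap: int = 128) -> list[str]:
    # Encode the whole document into token ids.
    enc = tiktoken.get_encoding("gpt2")
    tokens = enc.encode(text)
    # Each chunk starts (chunk_size - chunk_overlap) tokens after the last,
    # so neighboring chunks share chunk_overlap tokens.
    stride = chunk_size - chunk_overlap
    return [
        enc.decode(tokens[start:start + chunk_size])
        for start in range(0, len(tokens), stride)
    ]

Running this twice on the same text yields identical chunk lists, which is what makes the strategy deterministic.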