Quick Start
Available Strategies
Choosing a Strategy
Agent Configuration
Simplest (Default Strategy)
With Chunking Config
All Chunker Options
| Option | Type | Default | Description |
|---|---|---|---|
| type | str | "token" | Chunker type: token, sentence, recursive, semantic, sdpm, late |
| chunk_size | int | 512 | Target tokens per chunk |
| chunk_overlap | int | 128 | Overlap between chunks, in tokens |
| tokenizer_or_token_counter | str | "gpt2" | Tokenizer used for counting tokens |
| embedding_model | str | auto | Embedding model (semantic/sdpm/late only) |
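
To make the defaults and constraints in the table concrete, here is a minimal sketch that mirrors the options as a plain dataclass. The `ChunkerOptions` class and its `validate` method are hypothetical names used only for illustration and are not part of the library's API. Using `None` for "auto" and the specific validation rules are assumptions as well.

```python
from dataclasses import dataclass
from typing import Optional

# Chunker types listed in the options table.
VALID_TYPES = {"token", "sentence", "recursive", "semantic", "sdpm", "late"}
# Types for which the table says embedding_model applies.
EMBEDDING_TYPES = {"semantic", "sdpm", "late"}


@dataclass
class ChunkerOptions:
    """Hypothetical container mirroring the documented options and defaults."""
    type: str = "token"
    chunk_size: int = 512
    chunk_overlap: int = 128
    tokenizer_or_token_counter: str = "gpt2"
    embedding_model: Optional[str] = None  # None stands in for "auto" (assumption)

    def validate(self) -> None:
        # Assumed sanity checks; the library may enforce different rules.
        if self.type not in VALID_TYPES:
            raise ValueError(f"unknown chunker type: {self.type!r}")
        if self.chunk_size <= 0:
            raise ValueError("chunk_size must be positive")
        if not 0 <= self.chunk_overlap < self.chunk_size:
            raise ValueError("chunk_overlap must be in [0, chunk_size)")
        if self.embedding_model is not None and self.type not in EMBEDDING_TYPES:
            raise ValueError("embedding_model applies to semantic/sdpm/late only")


options = ChunkerOptions(type="semantic", chunk_size=256, chunk_overlap=32)
options.validate()
```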
Strategy Details
Token Chunking
Fixed-size token chunks. Fast and predictable.
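The idea can be sketched in a few lines of plain Python. This is an illustration of fixed-size windows with overlap, not the library's implementation. The `token_chunks` helper is a hypothetical name, and it works on any pre-tokenized list rather than calling a real tokenizer.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def token_chunks(tokens: Sequence[T], chunk_size: int = 512,
                 chunk_overlap: int = 128) -> List[Sequence[T]]:
    """Slice tokens into windows of chunk_size that share chunk_overlap tokens."""
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reached the end of the input
    return chunks


# Ten stand-in tokens, windows of 4 with 1 token shared between neighbours.
print(token_chunks(list(range(10)), chunk_size=4, chunk_overlap=1))
# → [[0, 1, 2, 3], [3, 4, 5, 6], [6, 7, 8, 9]]
```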
Sentence Chunking
Split at sentence boundaries. Natural flow.
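A rough sketch of the approach, under simplifying assumptions: sentences are detected with a naive regex on `.`, `!` and `?`, and size is measured in whitespace-separated words rather than real tokens. The `sentence_chunks` helper is a hypothetical name, not a library function.

```python
import re
from typing import List


def sentence_chunks(text: str, chunk_size: int = 512) -> List[str]:
    """Greedily pack whole sentences into chunks of at most chunk_size words."""
    # Naive sentence boundary: whitespace that follows ., ! or ?
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks, current, count = [], [], 0
    for sentence in sentences:
        n = len(sentence.split())
        # Start a new chunk when adding this sentence would exceed the budget.
        if current and count + n > chunk_size:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks


text = "One two three. Four five. Six seven eight nine. Ten."
print(sentence_chunks(text, chunk_size=5))
# → ['One two three. Four five.', 'Six seven eight nine. Ten.']
```

A sentence is never cut in half here, so a single sentence longer than `chunk_size` ends up as its own oversized chunk.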
Recursive Chunking
Hierarchical splitting. Great for markdown.
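The hierarchy can be illustrated as follows: try the coarsest separator first (blank lines between blocks), fall back to finer ones (single newlines), and only hard-split on words as a last resort. This is a simplified sketch, not the library's algorithm. The `recursive_chunks` helper, its separator list, and the word-based size measure are all assumptions, and it does not merge small neighbouring pieces back together.

```python
from typing import List, Tuple


def recursive_chunks(text: str, chunk_size: int = 512,
                     separators: Tuple[str, ...] = ("\n\n", "\n")) -> List[str]:
    """Split on coarse separators first, recursing only into oversized pieces."""
    words = text.split()
    if len(words) <= chunk_size:
        return [text] if words else []  # small enough: keep as one chunk
    if not separators:
        # No separators left: fall back to fixed-size word windows.
        return [" ".join(words[i:i + chunk_size])
                for i in range(0, len(words), chunk_size)]
    head, rest = separators[0], separators[1:]
    chunks = []
    for piece in text.split(head):
        chunks.extend(recursive_chunks(piece, chunk_size, rest))
    return chunks


doc = "# A\n\none two three four\nfive six\n\n# B"
print(recursive_chunks(doc, chunk_size=4))
# → ['# A', 'one two three four', 'five six', '# B']
```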
Semantic Chunking
Similarity-based splits. Topic coherence.
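As a toy illustration of similarity-based splitting, the sketch below groups consecutive sentences while they stay similar to the chunk built so far, and starts a new chunk when similarity drops below a threshold. It substitutes bag-of-words cosine similarity for a real embedding model, so it only captures word overlap. The `semantic_chunks` helper and the `threshold` value are hypothetical and are not taken from the library.

```python
import math
import re
from collections import Counter
from typing import List


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def semantic_chunks(sentences: List[str], threshold: float = 0.2) -> List[str]:
    """Append each sentence to the current chunk if it is similar enough."""
    groups = []  # list of (sentences in chunk, running word counts)
    for sentence in sentences:
        vec = Counter(re.findall(r"\w+", sentence.lower()))
        if groups and cosine(groups[-1][1], vec) >= threshold:
            groups[-1][0].append(sentence)
            groups[-1][1].update(vec)  # fold the sentence into the chunk vector
        else:
            groups.append(([sentence], Counter(vec)))  # topic shift: new chunk
    return [" ".join(group) for group, _ in groups]


sentences = ["Cats chase mice.", "Mice fear cats.",
             "Stocks fell today.", "Stocks rose later."]
print(semantic_chunks(sentences))
# → ['Cats chase mice. Mice fear cats.', 'Stocks fell today. Stocks rose later.']
```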
Installation
Chunking requires the knowledge extra, which installs the chonkie library automatically.
