Uses embeddings to split text at semantic boundaries. Groups related content together for better retrieval.
Quick Start
When to Use
Good For
- Research papers
- Topic-dense content
- Multi-subject documents
- Quality over speed
Consider Alternatives
- Speed-critical pipelines
- Uniform chunk sizes needed
- Simple structured content
- Very short documents
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| chunk_size | int | 512 | Max tokens per chunk |
| embedding_model | str | auto | Model for semantic similarity |
Examples
Research Analysis
Knowledge Base
How It Works
Semantic chunking:
- Splits the document into sentences
- Generates an embedding for each sentence
- Groups consecutive similar sentences
- Creates a new chunk when the topic changes
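The four steps above can be sketched in plain Python. This is an illustration of the idea only, not the library's implementation: `embed` is a toy bag-of-words stand-in for a real embedding model, the `threshold` value is an assumed similarity cutoff, and `chunk_size` is counted in whitespace-separated words rather than real tokens. All function names here are hypothetical.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def embed(sentence):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z']+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_chunks(text, threshold=0.2, chunk_size=512):
    """Group consecutive similar sentences; start a new chunk on a topic change.

    threshold (assumed) is the minimum similarity between neighbouring
    sentences; chunk_size caps the chunk length, approximated in words.
    """
    sentences = split_sentences(text)
    if not sentences:
        return []

    chunks = []
    current = [sentences[0]]
    for prev, sent in zip(sentences, sentences[1:]):
        similar = cosine(embed(prev), embed(sent)) >= threshold
        size_if_added = sum(len(s.split()) for s in current) + len(sent.split())
        if similar and size_if_added <= chunk_size:
            current.append(sent)  # same topic and still within the size cap
        else:
            chunks.append(" ".join(current))  # topic changed or chunk is full
            current = [sent]
    chunks.append(" ".join(current))
    return chunks


text = (
    "Cats are small pets. Cats like to sleep. "
    "Rust is a systems language. Rust has a borrow checker."
)
for chunk in semantic_chunks(text):
    print(chunk)
```

With this sample text, the two sentences about cats end up in one chunk and the two about Rust in another, because the second and third sentences share no words. A real embedding model would detect topic shifts even when neighbouring sentences use different vocabulary.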
Performance Note
Semantic chunking has to compute an embedding for every sentence, so it is slower than token or sentence chunking. Use it for quality-sensitive applications where retrieval accuracy matters more than speed.
Embedding Models
The default embedding model is all-MiniLM-L6-v2. You can use any model supported by the chonkie library.

