Quick Start
When to Use
Good For
- Speed-critical applications
- Uniform chunk sizes needed
- Simple documents without structure
- High-volume processing
Consider Alternatives
- Documents with natural sections
- Topic-dependent content
- Content that needs semantic coherence
- Complex nested structures
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| chunk_size | int | 512 | Number of tokens per chunk |
| chunk_overlap | int | 128 | Token overlap between chunks |
| tokenizer | str | "gpt2" | Tokenizer for counting tokens |
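
To make the interaction between `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of fixed-size chunking over a sequence of token IDs. It is an illustration under stated assumptions, not this library's implementation: the function name `chunk_tokens` is hypothetical, the tokens are assumed to have already been produced by whatever tokenizer is configured, and the sketch assumes each new chunk starts `chunk_size - chunk_overlap` tokens after the previous one.

```python
from typing import List, Sequence


def chunk_tokens(
    tokens: Sequence[int],
    chunk_size: int = 512,
    chunk_overlap: int = 128,
) -> List[List[int]]:
    """Split token IDs into fixed-size windows that share `chunk_overlap` tokens.

    Hypothetical helper for illustration only; not part of any library API.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    # Each window starts this many tokens after the previous one.
    step = chunk_size - chunk_overlap

    chunks: List[List[int]] = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + chunk_size]))
        # Stop once a window reaches the end, so no chunk is pure overlap.
        if start + chunk_size >= len(tokens):
            break
    return chunks


if __name__ == "__main__":
    # Integers stand in for token IDs from a real tokenizer.
    token_ids = list(range(1000))
    for i, chunk in enumerate(chunk_tokens(token_ids)):
        print(i, len(chunk), chunk[0], chunk[-1])
```

With the defaults shown in the table, consecutive chunks share 128 tokens, and the final chunk may be shorter than `chunk_size` when the input length is not an exact fit.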

