Quick Start
When to Use
Good For
- Articles and blog posts
- Natural language content
- Readability matters
- Question-answering tasks
Consider Alternatives
- Code or technical docs
- Very long sentences
- Structured data
- Markdown with headers
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
chunk_size | int | 512 | Max tokens per chunk |
chunk_overlap | int | 128 | Token overlap between chunks |
tokenizer_or_token_counter | str | "gpt2" | Tokenizer for counting |

