Quick Start

1. Install
2. Create Agent with Knowledge
3. Get Answer with Citations
How RAG Works
The RAG Process
| Phase | What Happens | When |
|---|---|---|
| Indexing | Documents → Chunks → Embeddings → Vector DB | Once per document |
| Retrieval | Query → Embedding → Similarity Search → Top-K chunks | Every query |
| Generation | Query + Context → LLM → Answer + Citations | Every query |
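The three phases above can be sketched end-to-end in plain Python. This is a toy pipeline: a bag-of-words counter stands in for a real embedding model, and the final prompt is only assembled, not sent to an LLM. The document names and texts are made up for illustration.

```python
import math
from collections import Counter

# Toy "embedding": bag-of-words term counts (real systems use neural embeddings).
def embed(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Indexing: embed each document chunk (done once per document).
docs = {
    "faq.md": "Refunds are processed within 5 business days.",
    "guide.md": "To reset your password, open account settings.",
}
index = [(name, text, embed(text)) for name, text in docs.items()]

# Retrieval: embed the query, take the top-k most similar chunks (every query).
query = "how long do refunds take"
q_vec = embed(query)
top_k = sorted(index, key=lambda c: cosine(q_vec, c[2]), reverse=True)[:1]

# Generation: query + retrieved context go to the LLM, which answers with citations.
context = "\n".join(f"[{name}] {text}" for name, text, _ in top_k)
prompt = f"Context:\n{context}\n\nQuestion: {query}"
print(top_k[0][0])  # faq.md is the most relevant source
```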
Agent Methods for RAG
| Method | Returns | Use Case |
|---|---|---|
| `agent.start(prompt)` | String | Interactive terminal with RAG |
| `agent.chat(prompt)` | String | Programmatic RAG |
| `agent.query(prompt)` | RAGResult | Structured answer + citations |
| `agent.retrieve(prompt)` | ContextPack | Context only, no LLM generation |
Example: All Methods
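The original code sample was not preserved in this copy. As a stand-in, the stub below sketches the return shapes from the table above; the `RAGResult` and `ContextPack` field names and `DemoAgent` class are illustrative, not the real praisonaiagents API, and the interactive `start()` method is omitted.

```python
from dataclasses import dataclass, field

# Hypothetical stand-ins for the return types in the table above.
@dataclass
class ContextPack:
    chunks: list = field(default_factory=list)   # retrieved text chunks

@dataclass
class RAGResult:
    answer: str = ""
    citations: list = field(default_factory=list)

class DemoAgent:
    """Toy agent mirroring the method signatures from the table."""
    def chat(self, prompt: str) -> str:              # programmatic RAG
        return f"answer to: {prompt}"

    def query(self, prompt: str) -> RAGResult:       # structured answer + citations
        return RAGResult(answer=self.chat(prompt), citations=["doc.pdf"])

    def retrieve(self, prompt: str) -> ContextPack:  # context only, no LLM call
        return ContextPack(chunks=["relevant chunk"])

agent = DemoAgent()
assert isinstance(agent.chat("q"), str)
assert isinstance(agent.query("q"), RAGResult)
assert isinstance(agent.retrieve("q"), ContextPack)
```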
Knowledge Configuration
Basic: File List
Advanced: Full Configuration
Configuration Options
| Option | Type | Default | Description |
|---|---|---|---|
| `sources` | list | [] | Files, folders, or URLs |
| `retrieval_k` | int | 5 | Number of chunks to retrieve |
| `rerank` | bool | False | Enable reranking |
| `chunking_strategy` | str | "semantic" | How to split documents |
| `chunk_size` | int | 1000 | Max tokens per chunk |
| `chunk_overlap` | int | 200 | Overlap between chunks |
| `auto_retrieve` | bool | True | Auto-inject context |
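To make `chunk_size` and `chunk_overlap` concrete, here is a minimal sliding-window chunker. Whitespace-split words stand in for real tokens, and the windowing logic is a sketch; PraisonAI's own chunking (especially the default "semantic" strategy) works differently.

```python
def chunk(words, chunk_size=1000, chunk_overlap=200):
    """Split a token list into windows of chunk_size sharing chunk_overlap tokens."""
    step = chunk_size - chunk_overlap
    return [words[i:i + chunk_size]
            for i in range(0, max(len(words) - chunk_overlap, 1), step)]

tokens = [f"w{i}" for i in range(25)]
chunks = chunk(tokens, chunk_size=10, chunk_overlap=2)
# Windows start every 8 tokens: [0:10], [8:18], [16:25]
print([c[0] for c in chunks])  # ['w0', 'w8', 'w16']
```

Each consecutive pair of chunks shares `chunk_overlap` tokens, which is what preserves context across chunk boundaries.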
Retrieval Strategies
PraisonAI automatically selects the optimal strategy based on corpus size:

| Strategy | Files | Technique |
|---|---|---|
| DIRECT | < 10 | Load all content into context |
| BASIC | < 100 | Embedding-based semantic search |
| HYBRID | < 1000 | Keyword + semantic search |
| RERANKED | < 10000 | Hybrid + cross-encoder reranking |
| COMPRESSED | < 100000 | Reranked + contextual compression |
| HIERARCHICAL | ≥ 100000 | Summaries + top-down routing |
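The thresholds above read as a simple lookup by file count. A sketch (the library's real selection logic may weigh more than file count):

```python
# Thresholds copied from the table above: file count -> strategy name.
THRESHOLDS = [
    (10, "DIRECT"),
    (100, "BASIC"),
    (1_000, "HYBRID"),
    (10_000, "RERANKED"),
    (100_000, "COMPRESSED"),
]

def choose_strategy(num_files: int) -> str:
    for limit, name in THRESHOLDS:
        if num_files < limit:
            return name
    return "HIERARCHICAL"  # >= 100,000 files

print(choose_strategy(5), choose_strategy(500), choose_strategy(250_000))
# DIRECT HYBRID HIERARCHICAL
```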
Force a Strategy
RAG vs Knowledge vs Memory vs Context
Comparison Table
| Aspect | RAG | Knowledge | Memory | Context |
|---|---|---|---|---|
| What | Search technique | Pre-loaded docs | Persistent storage | Runtime data |
| When | Query time | Before execution | Across sessions | During execution |
| Lifetime | N/A | Permanent | Permanent | Session only |
| Direction | Read-only | Read-only | Read + Write | Read-only |
| Agent Param | Part of knowledge= | knowledge= | memory= | context= |
When to Use What
Agent with Knowledge (Simple)
For small document sets where you don’t need advanced retrieval.
Agent with RAG (Advanced)
For large document sets requiring sophisticated retrieval.
Multi-Agent RAG
Share knowledge across multiple agents.
Supported File Types
Documents
PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX
Text
TXT, CSV, JSON, XML, MD, HTML
Media
Images (with OCR), Audio (with transcription)
Vector Store Backends
Citations
RAG automatically provides source citations.
Citation Modes
| Mode | Description |
|---|---|
| APPEND | Add sources at the end of the response |
| INLINE | Insert citation markers in the text |
| HIDDEN | Include citations in metadata only |
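A sketch of how the three modes could shape the final output. This is illustrative, not the library's actual formatter, and the `render` helper is hypothetical.

```python
def render(answer: str, sources: list, mode: str = "APPEND"):
    """Return (text, metadata) for a given citation mode."""
    if mode == "APPEND":      # sources listed at the end of the response
        return answer + "\n\nSources: " + ", ".join(sources), {}
    if mode == "INLINE":      # numbered markers inserted in the text
        marks = "".join(f"[{i + 1}]" for i in range(len(sources)))
        return answer + " " + marks, {}
    if mode == "HIDDEN":      # citations carried only in metadata
        return answer, {"citations": sources}
    raise ValueError(f"unknown mode: {mode}")

text, meta = render("Refunds take 5 days.", ["faq.md"], mode="HIDDEN")
print(text, meta)  # Refunds take 5 days. {'citations': ['faq.md']}
```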
Best Practices
Chunk Size
Use 500-1000 tokens per chunk. Chunks that are too small lose context; chunks that are too large add noise.
Overlap
Use 10-20% overlap to preserve context across chunk boundaries.
Reranking
Enable reranking for large corpora to improve relevance.
Top-K
Start with k=5, increase if answers lack detail.
Troubleshooting
No relevant results

- Check that documents were indexed with `knowledge.stats()`
- Lower the similarity threshold
- Try a different chunking strategy
Slow retrieval

- Reduce `retrieval_k`
- Disable reranking for faster results
- Use a persistent vector store
Missing context

- Increase chunk overlap
- Use semantic chunking
- Increase `retrieval_k`

