Knowledge Module

The Knowledge module provides a unified interface for building knowledge bases: loading documents, storing them in a vector store, and retrieving them with optional reranking.

Import

from praisonai.adapters import (
    AutoReader,
    ChromaVectorStore,
    BasicRetriever,
    FusionRetriever,
    LLMReranker
)

Quick Example

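from praisonai.adapters import AutoReader, ChromaVectorStore, FusionRetriever

# get_embedding (one text) and get_embeddings (a list of texts) are not defined
# in this example; supply your own embedding functions.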

# 1. Load documents
reader = AutoReader()
docs = reader.load("./documents/")

# 2. Store in vector database
store = ChromaVectorStore(namespace="knowledge")
store.add(
    texts=[d.content for d in docs],
    embeddings=get_embeddings([d.content for d in docs])
)

# 3. Retrieve relevant documents
retriever = FusionRetriever(
    vector_store=store,
    embedding_fn=get_embedding,
    num_queries=3
)
results = retriever.retrieve("What is Python?", top_k=5)
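
The Quick Example calls get_embeddings and get_embedding without defining them. The sketch below is one possible stand-in, assuming the store and retriever only need a list of floats per text. It uses a toy hashing scheme rather than a real embedding model, and the names DIM, get_embedding, and get_embeddings are hypothetical helpers, not part of praisonai. In practice you would likely swap in a real embedding model with the same function shapes.

```python
import hashlib
import math
from typing import List

DIM = 64  # arbitrary vector size chosen for this toy example


def get_embedding(text: str) -> List[float]:
    """Hypothetical helper: hash each token into a fixed-size, L2-normalised vector."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        # md5 is used only as a stable hash so results are reproducible across runs
        digest = hashlib.md5(token.encode("utf-8")).digest()
        idx = int.from_bytes(digest[:4], "big") % DIM
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    # Empty input has zero norm; return the zero vector unchanged in that case
    return [v / norm for v in vec] if norm else vec


def get_embeddings(texts: List[str]) -> List[List[float]]:
    """Hypothetical batch helper: embed every text in the list."""
    return [get_embedding(t) for t in texts]
```

Hashed bag-of-words vectors carry no semantic meaning, so this is only suitable for wiring up and testing the pipeline end to end.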

Features

  • Document loading from files, directories, and URLs
  • Vector storage with ChromaDB and Pinecone
  • Multiple retrieval strategies (Basic, Fusion, Recursive, AutoMerge)
  • Reranking for improved relevance
  • CLI integration for easy management

Architecture

┌─────────────┐     ┌──────────────┐     ┌─────────────┐
│   Readers   │────▶│ Vector Store │────▶│  Retriever  │
│  (Load)     │     │   (Store)    │     │  (Search)   │
└─────────────┘     └──────────────┘     └──────┬──────┘
                                                │
                                                ▼
                                         ┌─────────────┐
                                         │  Reranker   │
                                         │  (Refine)   │
                                         └─────────────┘
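
To make the data flow in the diagram concrete, here is a minimal, dependency-free sketch of the store, search, and rerank stages. Everything in it (InMemoryStore, embed, cosine, rerank) is a hypothetical toy written for illustration and does not reflect how the praisonai adapters are implemented. The "embedding" is a plain bag-of-words count and the "reranker" is a term-overlap heuristic.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Dict[str, float]:
    """Toy bag-of-words 'embedding' (stand-in for a real model)."""
    return {term: float(n) for term, n in Counter(text.lower().split()).items()}


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


class InMemoryStore:
    """Store stage: keeps (text, vector) pairs in a list."""

    def __init__(self) -> None:
        self.items: List[Tuple[str, Dict[str, float]]] = []

    def add(self, texts: List[str]) -> None:
        for t in texts:
            self.items.append((t, embed(t)))

    def search(self, query: str, top_k: int) -> List[str]:
        """Search stage: return the top_k texts by cosine similarity."""
        q = embed(query)
        scored = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in scored[:top_k]]


def rerank(query: str, candidates: List[str], top_k: int) -> List[str]:
    """Refine stage: prefer candidates sharing more distinct query terms."""
    q_terms = set(query.lower().split())
    overlap = lambda c: len(q_terms & set(c.lower().split()))
    return sorted(candidates, key=overlap, reverse=True)[:top_k]


store = InMemoryStore()
store.add(["python is a language", "java is a language", "cats sleep a lot"])
candidates = store.search("python language", top_k=2)  # wide first pass
best = rerank("python language", candidates, top_k=1)  # narrow second pass
```

The point of the two passes is that the search stage can be cheap and over-fetch, while the refine stage spends more effort on a small candidate set.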

CLI Commands

Add Documents

# Add files to knowledge base
praisonai knowledge add document.pdf
praisonai knowledge add ./docs/
praisonai knowledge add "*.md"
praisonai knowledge add https://example.com/page.html

Query Knowledge Base

# Basic query
praisonai knowledge query "What is Python?"

# With options
praisonai knowledge query "Compare Python and Java" \
  --vector-store chroma \
  --retrieval fusion \
  --reranker llm \
  --top-k 5

Manage Knowledge Base

# List documents
praisonai knowledge list

# Clear knowledge base
praisonai knowledge clear

# Show stats
praisonai knowledge stats

CLI Options

Option          Description           Default
--vector-store  Vector store backend  chroma
--retrieval     Retrieval strategy    basic
--reranker      Reranking method      none
--top-k         Number of results     10
--workspace     Workspace directory   Current dir
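
If you drive the CLI from a script, it can help to assemble the argument list in one place. The sketch below builds a `praisonai knowledge query` command using only the flags listed in the table. The function name build_query_command is hypothetical, and a flag is emitted only when a value is passed, so the CLI's own defaults apply otherwise.

```python
import shlex
from typing import List, Optional


def build_query_command(
    question: str,
    vector_store: Optional[str] = None,
    retrieval: Optional[str] = None,
    reranker: Optional[str] = None,
    top_k: Optional[int] = None,
    workspace: Optional[str] = None,
) -> List[str]:
    """Hypothetical helper: build an argv list for `praisonai knowledge query`."""
    cmd = ["praisonai", "knowledge", "query", question]
    if vector_store is not None:
        cmd += ["--vector-store", vector_store]
    if retrieval is not None:
        cmd += ["--retrieval", retrieval]
    if reranker is not None:
        cmd += ["--reranker", reranker]
    if top_k is not None:
        cmd += ["--top-k", str(top_k)]
    if workspace is not None:
        cmd += ["--workspace", workspace]
    return cmd


cmd = build_query_command(
    "Compare Python and Java", retrieval="fusion", reranker="llm", top_k=5
)
print(shlex.join(cmd))  # shell-quoted form, handy for logging
```

Passing the list to subprocess.run(cmd, check=True) avoids shell quoting problems with questions that contain spaces or quotes.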

Example: Building a Documentation Assistant

from praisonai.adapters import (
    AutoReader,
    ChromaVectorStore,
    FusionRetriever,
    LLMReranker
)
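from praisonaiagents import Agent

# get_embedding (one text) and get_embeddings (a list of texts) are not defined
# in this example; supply your own embedding functions.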

# Load documentation
reader = AutoReader()
docs = reader.load("./docs/")

# Create vector store
store = ChromaVectorStore(
    namespace="documentation",
    persist_directory=".praison/docs_kb"
)

# Add documents with embeddings
store.add(
    texts=[d.content for d in docs],
    embeddings=get_embeddings([d.content for d in docs]),
    metadatas=[d.metadata for d in docs]
)

# Create retriever
retriever = FusionRetriever(
    vector_store=store,
    embedding_fn=get_embedding,
    num_queries=3,
    top_k=10
)

# Create reranker
reranker = LLMReranker(model="gpt-4o-mini")

# Query function
def query_docs(question: str) -> str:
    """Retrieve candidates, rerank them, and answer from the top passages."""
    # Retrieve a wide candidate set (20) so the reranker has room to choose
    results = retriever.retrieve(question, top_k=20)

    # Rerank and keep the 5 most relevant passages
    reranked = reranker.rerank(
        question,
        [r.text for r in results],
        top_k=5
    )

    # Format context
    context = "\n\n".join(r.text for r in reranked)

    # Generate answer with agent
    agent = Agent(instructions="Answer based on the context provided")
    return agent.chat(f"Context:\n{context}\n\nQuestion: {question}")

# Use
answer = query_docs("How do I deploy the application?")
print(answer)

Retrieval Strategies

Strategy    Use Case           CLI Flag
basic       Simple queries     --retrieval basic
fusion      Complex questions  --retrieval fusion
recursive   Hierarchical docs  --retrieval recursive
auto_merge  Long documents     --retrieval auto_merge
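
For intuition about the fusion strategy: a common way to combine the ranked results of several query variants is reciprocal rank fusion (RRF). The sketch below shows RRF in isolation. Whether FusionRetriever uses exactly this scoring is an assumption, and the function name, the constant k=60, and the document IDs are all illustrative.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(
    rankings: List[List[str]], k: int = 60, top_k: int = 5
) -> List[str]:
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Ties are broken by document ID for determinism.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))[:top_k]


# Three hypothetical result lists, e.g. one per generated query variant
rankings = [
    ["a", "b", "c"],
    ["b", "a", "d"],
    ["b", "c", "a"],
]
fused = reciprocal_rank_fusion(rankings, top_k=3)
```

Documents that rank well across several lists rise to the top even if no single list puts them first, which is why this style of fusion tends to suit broad or multi-part questions.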