## Documentation Index

Fetch the complete documentation index at: https://docs.praison.ai/llms.txt

Use this file to discover all available pages before exploring further.
## Overview

vLLM provides high-throughput embedding inference for self-hosted deployments.
## Quick Start

```python
from praisonaiagents import embedding

result = embedding(
    input="Hello world",
    model="hosted_vllm/intfloat/e5-mistral-7b-instruct",
    api_base="http://localhost:8000"
)

print(f"Dimensions: {len(result.embeddings[0])}")
```
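Once you have embedding vectors, the usual next step is comparing them. The helper below is not part of PraisonAI; it is a minimal, dependency-free sketch of cosine similarity over the vectors returned in `result.embeddings`:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Identical directions score 1.0; orthogonal vectors score 0.0.
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # → 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # → 0.0
```

In practice you would pass two entries of `result.embeddings` (each a list of floats) instead of the toy vectors shown here.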
## CLI Usage

```bash
praisonai embed "Hello world" --model hosted_vllm/intfloat/e5-mistral-7b-instruct
```
## Setup

- Start a vLLM server with an embedding model:

```bash
python -m vllm.entrypoints.openai.api_server \
    --model intfloat/e5-mistral-7b-instruct \
    --task embed
```

- Set the API base environment variable so PraisonAI can find the server:

```bash
export HOSTED_VLLM_API_BASE="http://localhost:8000"
```
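Since vLLM's `api_server` exposes an OpenAI-compatible HTTP API, you can also sanity-check the server without PraisonAI by posting to its `/v1/embeddings` endpoint. A stdlib-only sketch (the payload shape follows the OpenAI embeddings convention; the live request is commented out so it only runs when your server is up):

```python
import json
from urllib.request import Request, urlopen

API_BASE = "http://localhost:8000"  # same value as HOSTED_VLLM_API_BASE above

# OpenAI-style embeddings request body.
payload = {
    "model": "intfloat/e5-mistral-7b-instruct",
    "input": "Hello world",
}
req = Request(
    f"{API_BASE}/v1/embeddings",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)

# With the server from the Setup step running, this returns an
# OpenAI-style response: {"data": [{"embedding": [...]}], ...}
# resp = json.load(urlopen(req))
# print(len(resp["data"][0]["embedding"]))
```

If the request succeeds, the embedding dimension it reports should match the `Dimensions:` output from the Quick Start example.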