MCP Sampling

Sampling allows MCP servers to request LLM completions from clients, so a server can use the client’s model for text generation, with support for tool calling.

Protocol Version

This feature implements MCP Protocol Version 2025-11-25.

Tool Choice Modes

Mode   Description
auto   Model decides whether to use tools
none   Model should not use tools
any    Model must use at least one tool
tool   Model must use a specific tool
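
These modes surface in the Python API both as plain strings on create_sampling_request (for example tool_choice="auto") and via the ToolChoice factory methods shown below.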

Python API

Basic Sampling

import asyncio
from praisonai.mcp_server.sampling import (
    SamplingHandler,
    SamplingRequest,
    SamplingMessage,
    create_sampling_request,
)

async def main():
    handler = SamplingHandler(default_model="gpt-4o-mini")
    
    # Create simple request
    request = create_sampling_request(
        prompt="What is the capital of France?",
        system_prompt="You are a helpful geography assistant.",
        max_tokens=100,
    )
    
    response = await handler.create_message(request)
    print(f"Response: {response.content}")
    print(f"Model: {response.model}")

asyncio.run(main())

Sampling with Tools

from praisonai.mcp_server.sampling import SamplingHandler, create_sampling_request

handler = SamplingHandler(default_model="gpt-4o-mini")

request = create_sampling_request(
    prompt="Search for the latest AI news",
    tools=[{
        "name": "web_search",
        "description": "Search the web",
        "inputSchema": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"]
        }
    }],
    tool_choice="auto",  # or "none", "any", or specific tool name
)

response = await handler.create_message(request)
if response.tool_calls:
    print(f"Tool calls: {response.tool_calls}")

Tool Choice Factory Methods

from praisonai.mcp_server.sampling import ToolChoice

# Model decides
tc = ToolChoice.auto()
print(tc.to_dict())  # {"mode": "auto"}

# No tools
tc = ToolChoice.none()
print(tc.to_dict())  # {"mode": "none"}

# Must use any tool
tc = ToolChoice.any()
print(tc.to_dict())  # {"mode": "any"}

# Must use specific tool
tc = ToolChoice.tool("web_search")
print(tc.to_dict())  # {"mode": "tool", "name": "web_search"}
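
The dict returned by to_dict() is exactly the toolChoice parameter that goes on the wire in a sampling/createMessage request (see MCP Protocol Messages below).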

Model Preferences

from praisonai.mcp_server.sampling import ModelPreferences, SamplingMessage, SamplingRequest

prefs = ModelPreferences(
    hints=[{"name": "claude-3-sonnet"}, {"name": "gpt-4"}],
    cost_priority=0.3,          # 0-1, higher = cost matters more
    speed_priority=0.5,         # 0-1, higher = speed matters more
    intelligence_priority=0.8,  # 0-1, higher = capability matters more
)

request = SamplingRequest(
    messages=[SamplingMessage(role="user", content="Hello!")],
    model_preferences=prefs,
    max_tokens=500,
)
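
Clients decide for themselves how to weigh these values. As a toy illustration only (the catalog and capability scores below are invented), a client might resolve preferences like this:

def pick_model(prefs):
    """Toy resolver: honor hints first, then weight capabilities."""
    catalog = {
        "gpt-4o-mini": {"cost": 0.9, "speed": 0.9, "intelligence": 0.6},
        "gpt-4":       {"cost": 0.3, "speed": 0.4, "intelligence": 0.9},
    }
    # Hints take precedence, in order.
    for hint in prefs.hints or []:
        if hint["name"] in catalog:
            return hint["name"]
    # Otherwise score each model by the stated priorities.
    def score(caps):
        return (prefs.cost_priority * caps["cost"]
                + prefs.speed_priority * caps["speed"]
                + prefs.intelligence_priority * caps["intelligence"])
    return max(catalog, key=lambda name: score(catalog[name]))

print(pick_model(prefs))  # "gpt-4", matched via the second hint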

Custom Callback

from praisonai.mcp_server.sampling import SamplingHandler, SamplingResponse

async def my_llm_callback(request):
    """Custom LLM integration."""
    # Call your LLM here
    return SamplingResponse(
        role="assistant",
        content="Custom response",
        model="my-model",
        stop_reason="end_turn",
    )

handler = SamplingHandler()
handler.set_callback(my_llm_callback)
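
For a callback backed by a real model, translate the request into whatever client you use. Below is a minimal sketch with the openai package; the field names on request follow the examples above, and the finish-reason mapping is a deliberate simplification.

from openai import AsyncOpenAI
from praisonai.mcp_server.sampling import SamplingResponse

client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment

async def openai_callback(request):
    # Map the sampling request onto the chat completions format.
    messages = []
    if request.system_prompt:
        messages.append({"role": "system", "content": request.system_prompt})
    messages += [{"role": m.role, "content": m.content} for m in request.messages]
    completion = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages,
        max_tokens=request.max_tokens,
    )
    choice = completion.choices[0]
    return SamplingResponse(
        role="assistant",
        content=choice.message.content or "",
        model=completion.model,
        # Simplified mapping; handle "tool_calls" etc. as needed.
        stop_reason="end_turn" if choice.finish_reason == "stop" else "max_tokens",
    )

handler.set_callback(openai_callback)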

MCP Protocol Messages

Sampling Request

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "sampling/createMessage",
  "params": {
    "messages": [
      {
        "role": "user",
        "content": {"type": "text", "text": "Hello!"}
      }
    ],
    "maxTokens": 100,
    "systemPrompt": "You are helpful.",
    "modelPreferences": {
      "hints": [{"name": "claude-3-sonnet"}],
      "costPriority": 0.3,
      "speedPriority": 0.5,
      "intelligencePriority": 0.8
    },
    "tools": [
      {
        "name": "search",
        "description": "Search the web",
        "inputSchema": {"type": "object"}
      }
    ],
    "toolChoice": {"mode": "auto"}
  }
}
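
Note that the wire format uses camelCase keys (maxTokens, systemPrompt, toolChoice) where the Python API uses snake_case equivalents (max_tokens, system_prompt, tool_choice).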

Sampling Response

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "role": "assistant",
    "content": {"type": "text", "text": "Hello! How can I help?"},
    "model": "claude-3-sonnet",
    "stopReason": "end_turn"
  }
}

Response with Tool Use

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "role": "assistant",
    "content": {"type": "text", "text": ""},
    "model": "claude-3-sonnet",
    "stopReason": "toolUse",
    "toolCalls": [
      {
        "id": "call_123",
        "name": "search",
        "arguments": {"query": "AI news"}
      }
    ]
  }
}

Stop Reasons

Reason       Description
end_turn     Model finished naturally
max_tokens   Hit token limit
toolUse      Model wants to use a tool
error        An error occurred
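
A typical consumer branches on the stop reason; the field names below follow the Python examples earlier on this page (stop_reason, tool_calls):

response = await handler.create_message(request)

if response.stop_reason == "end_turn":
    print(response.content)                 # normal completion
elif response.stop_reason == "max_tokens":
    print("Truncated:", response.content)   # consider raising max_tokens
elif response.stop_reason == "toolUse":
    for call in response.tool_calls:
        ...                                 # execute tools, then continue
elif response.stop_reason == "error":
    raise RuntimeError("Sampling failed")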

Best Practices

  1. Set an appropriate max_tokens - Cap responses to what the task needs to avoid unnecessary token usage
  2. Use model preferences - Guide the client’s model selection with hints and priorities
  3. Handle tool calls - Execute requested tools and feed the results back (see the run_with_tools sketch above)
  4. Provide system prompts - Set context for better responses (a combined request is sketched below)
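
Putting these together, a fully specified request might look like the following sketch. One assumption to flag: system_prompt is confirmed above only as a create_sampling_request argument, so passing it to SamplingRequest directly is assumed here.

request = SamplingRequest(
    messages=[SamplingMessage(role="user", content="Summarize today's AI news")],
    system_prompt="You are a concise news analyst.",  # assumed field, mirrors create_sampling_request
    max_tokens=300,                                   # cap output to what the task needs
    model_preferences=ModelPreferences(
        hints=[{"name": "gpt-4o-mini"}],
        cost_priority=0.8,          # summarization: bias toward cheap ...
        speed_priority=0.7,         # ... and fast
        intelligence_priority=0.3,
    ),
)
response = await handler.create_message(request)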