Overview

Create realtime sessions for streaming text and audio with OpenAI’s Realtime API.

Python Usage

Create Session

from praisonai.capabilities import realtime_connect

session = realtime_connect(
    model="gpt-4o-realtime-preview",
    modalities=["text", "audio"],
    voice="alloy"
)

print(f"Session ID: {session.id}")
print(f"URL: {session.url}")
print(f"Status: {session.status}")

Session Configuration

from praisonai.capabilities import realtime_connect

session = realtime_connect(
    model="gpt-4o-realtime-preview",
    modalities=["text", "audio"],
    voice="shimmer",
    instructions="You are a helpful voice assistant."
)

# Connect via WebSocket to session.url
print(f"Connect to: {session.url}")

Async Usage

import asyncio
from praisonai.capabilities import arealtime_connect

async def main():
    session = await arealtime_connect(
        model="gpt-4o-realtime-preview"
    )
    print(f"Session: {session.id}")

asyncio.run(main())
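
When several sessions are needed at once (for example, one per voice), the async variant can be awaited concurrently. The following is a minimal sketch, not part of praisonai: `create_sessions` is a hypothetical helper that takes the connect coroutine as an argument, so it is assumed to work with `arealtime_connect` or with a stub in tests.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List


async def create_sessions(
    connect: Callable[..., Awaitable[Any]],
    configs: List[Dict[str, Any]],
) -> List[Any]:
    """Start one session per config concurrently.

    `connect` is any coroutine function accepting keyword arguments,
    e.g. arealtime_connect (assumption: it accepts the documented
    parameters as keywords). Results come back in the order of `configs`.
    """
    return await asyncio.gather(*(connect(**cfg) for cfg in configs))
```

A call such as `asyncio.run(create_sessions(arealtime_connect, [{"voice": "alloy"}, {"voice": "nova"}]))` would then return two sessions. Passing the connect function in keeps the helper easy to exercise without network access.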

Parameters

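| Parameter    | Type      | Default                   | Description            |
|--------------|-----------|---------------------------|------------------------|
| model        | str       | "gpt-4o-realtime-preview" | Realtime model         |
| modalities   | List[str] | ["text", "audio"]         | Supported modalities   |
| instructions | str       | None                      | System instructions    |
| voice        | str       | "alloy"                   | Voice for audio output |
| api_key      | str       | None                      | API key override       |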

Available Voices

  • alloy - Neutral
  • echo - Male
  • fable - British
  • onyx - Deep male
  • nova - Female
  • shimmer - Soft female
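
The parameters and voices listed above can be checked locally before a session is requested. This is a sketch under stated assumptions: `build_session_kwargs` is a hypothetical helper (not a praisonai function), and the allowed values are copied from this page rather than queried from the API.

```python
from typing import Any, Dict, List, Optional

# Values copied from the Parameters table and Available Voices list on this page.
KNOWN_VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}
KNOWN_MODALITIES = {"text", "audio"}


def build_session_kwargs(
    model: str = "gpt-4o-realtime-preview",
    modalities: Optional[List[str]] = None,
    voice: str = "alloy",
    instructions: Optional[str] = None,
) -> Dict[str, Any]:
    """Validate arguments and return keyword arguments for realtime_connect."""
    chosen = list(modalities) if modalities is not None else ["text", "audio"]
    unknown = set(chosen) - KNOWN_MODALITIES
    if unknown:
        raise ValueError(f"Unknown modalities: {sorted(unknown)}")
    if voice not in KNOWN_VOICES:
        raise ValueError(f"Unknown voice: {voice!r}")

    kwargs: Dict[str, Any] = {"model": model, "modalities": chosen, "voice": voice}
    # Only pass instructions when the caller supplied them (documented default is None).
    if instructions is not None:
        kwargs["instructions"] = instructions
    return kwargs
```

The result can be unpacked into the call, as in `realtime_connect(**build_session_kwargs(voice="nova"))`, so a typo in a voice name fails before any request is made.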

Result Object

The RealtimeSession object contains:
  • id: Session ID
  • status: Session status
  • model: Model being used
  • url: WebSocket URL to connect to
  • metadata: Session configuration
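
Code that consumes a session can be exercised without a network call by using a stand-in with the same attributes. The sketch below is an assumption-laden illustration: `FakeRealtimeSession` and `describe` are hypothetical names, and the field types are guesses based on the list above, not the library's actual class definition.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class FakeRealtimeSession:
    """Test stand-in mirroring the attributes listed for RealtimeSession."""
    id: str
    status: str
    model: str
    url: str
    metadata: Dict[str, Any] = field(default_factory=dict)  # assumed to be dict-like


def describe(session: Any) -> str:
    """Format the documented attributes of any session-like object."""
    return f"{session.id} [{session.status}] {session.model} -> {session.url}"
```

Because `describe` only reads attributes, it should accept either the stand-in or the object returned by `realtime_connect`.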

WebSocket Connection

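After creating a session, connect via WebSocket:

import base64
import json

import websockets

async def connect_realtime(session, api_key: str, audio_bytes: bytes):
    # Note: websockets >= 14 names this keyword `additional_headers`;
    # older releases use `extra_headers`.
    async with websockets.connect(
        session.url,
        extra_headers={"Authorization": f"Bearer {api_key}"}
    ) as ws:
        # Append base64-encoded audio to the input buffer
        await ws.send(json.dumps({
            "type": "input_audio_buffer.append",
            "audio": base64.b64encode(audio_bytes).decode("ascii")
        }))
        # Read the next server event
        event = json.loads(await ws.recv())
        print(event.get("type"))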
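
Longer recordings are usually sent as a series of smaller `input_audio_buffer.append` events rather than one large message. The following sketch shows one way to build those messages; `audio_append_events` is a hypothetical helper and the 4096-byte default chunk size is an arbitrary assumption, not a documented limit.

```python
import base64
import json
from typing import Iterator


def audio_append_events(audio: bytes, chunk_size: int = 4096) -> Iterator[str]:
    """Yield JSON-encoded input_audio_buffer.append events, one per chunk.

    Each chunk is base64-encoded independently, so the receiver can decode
    every event on its own and concatenate the results.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(audio), chunk_size):
        chunk = audio[start:start + chunk_size]
        yield json.dumps({
            "type": "input_audio_buffer.append",
            "audio": base64.b64encode(chunk).decode("ascii"),
        })
```

Inside the `async with` block, each message could then be sent with `await ws.send(message)` in a loop over `audio_append_events(audio_bytes)`.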