AudioAgent
Defined in the audio_agent module.
A specialized agent for audio processing using AI models. Provides:
- Text-to-Speech (TTS): Convert text to spoken audio
- Speech-to-Text (STT): Transcribe audio to text
Supported TTS models:
- OpenAI: openai/tts-1, openai/tts-1-hd
- Azure: azure/tts-1
- Gemini: gemini/gemini-2.5-flash-preview-tts
- Vertex AI: vertex_ai/gemini-2.5-flash-preview-tts
- ElevenLabs: elevenlabs/eleven_multilingual_v2
- MiniMax: minimax/speech-01
Supported STT models:
- OpenAI: openai/whisper-1
- Azure: azure/whisper
- Groq: groq/whisper-large-v3
- Deepgram: deepgram/nova-2
- Gemini: gemini/gemini-2.0-flash
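The identifiers above appear to follow a `provider/model` naming pattern. The following is a minimal sketch, assuming that pattern holds for every entry, of splitting an identifier into its two parts. The helper `split_model_id` is hypothetical and is not part of AudioAgent.

```python
def split_model_id(model_id: str) -> tuple[str, str]:
    """Split a 'provider/model' string into (provider, model).

    Hypothetical helper for illustration only; not an AudioAgent API.
    """
    # partition() splits on the first "/" only, so any later slashes
    # would stay inside the model part.
    provider, sep, model = model_id.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected 'provider/model', got {model_id!r}")
    return provider, model


print(split_model_id("openai/tts-1-hd"))
print(split_model_id("vertex_ai/gemini-2.5-flash-preview-tts"))
```

Validating the identifier early like this can surface a typo before any request is sent to a provider.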
Constructor
No description available.
Methods
console()
Lazily initialize Rich Console.
litellm()
Lazy load litellm module when needed.
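Both console() and litellm() are described as lazy: the underlying object is created or imported only on first use. The following is a rough sketch of that general pattern, not the actual AudioAgent implementation. The class `LazyModuleHolder` is hypothetical, and the standard-library `json` module stands in for a heavier dependency such as litellm.

```python
import importlib


class LazyModuleHolder:
    """Hypothetical stand-in showing lazy loading; not the real AudioAgent."""

    def __init__(self, module_name: str):
        self._module_name = module_name
        self._module = None  # nothing has been imported yet

    @property
    def module(self):
        # Import on first access, then reuse the cached module object.
        if self._module is None:
            self._module = importlib.import_module(self._module_name)
        return self._module


holder = LazyModuleHolder("json")  # "json" is a stand-in for a heavy dependency
print(holder._module is None)      # no import has happened at this point
print(holder.module.dumps({"ok": True}))
```

The appeal of this design is that constructing the object stays cheap, and the import cost is paid only by code paths that actually need the dependency.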
speech()
Convert text to speech.
aspeech()
Async version of speech().
transcribe()
Transcribe audio to text.
atranscribe()
Async version of transcribe().
say()
Quick TTS - convert text and save to file.
asay()
Async version of say().
listen()
Quick STT - transcribe audio file.
alisten()
Async version of listen().
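Each method above comes as a synchronous and asynchronous pair. The sketch below only illustrates how such a pair might be called; `FakeAudioAgent` is a hypothetical stand-in, its method signatures are assumptions, and the real async methods may well be implemented natively rather than through a worker thread.

```python
import asyncio


class FakeAudioAgent:
    """Hypothetical stand-in that mimics the sync/async method pairing."""

    def listen(self, path: str) -> str:
        # Placeholder for a blocking transcription call.
        return f"transcript of {path}"

    async def alisten(self, path: str) -> str:
        # Run the blocking call in a worker thread so the event loop stays free.
        return await asyncio.to_thread(self.listen, path)


async def main() -> list[str]:
    agent = FakeAudioAgent()
    # Start both transcriptions concurrently; results keep the input order.
    return await asyncio.gather(*(agent.alisten(p) for p in ["a.mp3", "b.mp3"]))


results = asyncio.run(main())
print(results)
```

The synchronous variants suit scripts and notebooks, while the asynchronous ones are useful when several audio files are processed concurrently inside an existing event loop.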

