# Voice in Mastra

Mastra's Voice system provides a unified interface for voice interactions, enabling text-to-speech (TTS), speech-to-text (STT), and real-time speech-to-speech (STS) capabilities in your applications.

## Adding Voice to Agents

To learn how to integrate voice capabilities into your agents, check out the [Adding Voice to Agents](https://mastra.ai/docs/agents/adding-voice) documentation. This section covers how to use both single and multiple voice providers, as well as real-time interactions.

```typescript
import { Agent } from "@mastra/core/agent";
import { OpenAIVoice } from "@mastra/voice-openai";

// Initialize OpenAI voice for TTS
const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions: "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new OpenAIVoice(),
});
```

You can then use the following voice capabilities:

### Text to Speech (TTS)

Turn your agent's responses into natural-sounding speech using Mastra's TTS capabilities. Choose from multiple providers like OpenAI, ElevenLabs, and more.

For detailed configuration options and advanced features, check out our [Text-to-Speech guide](https://mastra.ai/docs/voice/text-to-speech).

**OpenAI**:

```typescript
import { Agent } from "@mastra/core/agent";
import { OpenAIVoice } from "@mastra/voice-openai";
import { playAudio } from "@mastra/node-audio";

const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions: "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new OpenAIVoice(),
});

const { text } = await voiceAgent.generate("What color is the sky?");

// Convert the text response to an audio stream
const audioStream = await voiceAgent.voice.speak(text, {
  speaker: "default", // Optional: specify a speaker
  responseFormat: "wav", // Optional: specify a response format
});

playAudio(audioStream);
```

Visit the [OpenAI Voice Reference](https://mastra.ai/reference/voice/openai) for more information on the OpenAI voice provider.

**Azure**:

```typescript
import { Agent } from "@mastra/core/agent";
import { AzureVoice } from "@mastra/voice-azure";
import { playAudio } from "@mastra/node-audio";

const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions: "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new AzureVoice(),
});

const { text } = await voiceAgent.generate("What color is the sky?");

// Convert the text response to an audio stream
const audioStream = await voiceAgent.voice.speak(text, {
  speaker: "en-US-JennyNeural", // Optional: specify a speaker
});

playAudio(audioStream);
```

Visit the [Azure Voice Reference](https://mastra.ai/reference/voice/azure) for more information on the Azure voice provider.
**ElevenLabs**:

```typescript
import { Agent } from "@mastra/core/agent";
import { ElevenLabsVoice } from "@mastra/voice-elevenlabs";
import { playAudio } from "@mastra/node-audio";

const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions: "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new ElevenLabsVoice(),
});

const { text } = await voiceAgent.generate("What color is the sky?");

// Convert the text response to an audio stream
const audioStream = await voiceAgent.voice.speak(text, {
  speaker: "default", // Optional: specify a speaker
});

playAudio(audioStream);
```

Visit the [ElevenLabs Voice Reference](https://mastra.ai/reference/voice/elevenlabs) for more information on the ElevenLabs voice provider.

**PlayAI**:

```typescript
import { Agent } from "@mastra/core/agent";
import { PlayAIVoice } from "@mastra/voice-playai";
import { playAudio } from "@mastra/node-audio";

const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions: "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new PlayAIVoice(),
});

const { text } = await voiceAgent.generate("What color is the sky?");

// Convert the text response to an audio stream
const audioStream = await voiceAgent.voice.speak(text, {
  speaker: "default", // Optional: specify a speaker
});

playAudio(audioStream);
```

Visit the [PlayAI Voice Reference](https://mastra.ai/reference/voice/playai) for more information on the PlayAI voice provider.

**Google**:

```typescript
import { Agent } from "@mastra/core/agent";
import { GoogleVoice } from "@mastra/voice-google";
import { playAudio } from "@mastra/node-audio";

const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions: "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new GoogleVoice(),
});

const { text } = await voiceAgent.generate("What color is the sky?");

// Convert the text response to an audio stream
const audioStream = await voiceAgent.voice.speak(text, {
  speaker: "en-US-Studio-O", // Optional: specify a speaker
});

playAudio(audioStream);
```

Visit the [Google Voice Reference](https://mastra.ai/reference/voice/google) for more information on the Google voice provider.

**Cloudflare**:

```typescript
import { Agent } from "@mastra/core/agent";
import { CloudflareVoice } from "@mastra/voice-cloudflare";
import { playAudio } from "@mastra/node-audio";

const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions: "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new CloudflareVoice(),
});

const { text } = await voiceAgent.generate("What color is the sky?");

// Convert the text response to an audio stream
const audioStream = await voiceAgent.voice.speak(text, {
  speaker: "default", // Optional: specify a speaker
});

playAudio(audioStream);
```

Visit the [Cloudflare Voice Reference](https://mastra.ai/reference/voice/cloudflare) for more information on the Cloudflare voice provider.
**Deepgram**:

```typescript
import { Agent } from "@mastra/core/agent";
import { DeepgramVoice } from "@mastra/voice-deepgram";
import { playAudio } from "@mastra/node-audio";

const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions: "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new DeepgramVoice(),
});

const { text } = await voiceAgent.generate("What color is the sky?");

// Convert the text response to an audio stream
const audioStream = await voiceAgent.voice.speak(text, {
  speaker: "aura-english-us", // Optional: specify a speaker
});

playAudio(audioStream);
```

Visit the [Deepgram Voice Reference](https://mastra.ai/reference/voice/deepgram) for more information on the Deepgram voice provider.

**Speechify**:

```typescript
import { Agent } from "@mastra/core/agent";
import { SpeechifyVoice } from "@mastra/voice-speechify";
import { playAudio } from "@mastra/node-audio";

const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions: "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new SpeechifyVoice(),
});

const { text } = await voiceAgent.generate("What color is the sky?");

// Convert the text response to an audio stream
const audioStream = await voiceAgent.voice.speak(text, {
  speaker: "matthew", // Optional: specify a speaker
});

playAudio(audioStream);
```

Visit the [Speechify Voice Reference](https://mastra.ai/reference/voice/speechify) for more information on the Speechify voice provider.

**Sarvam**:

```typescript
import { Agent } from "@mastra/core/agent";
import { SarvamVoice } from "@mastra/voice-sarvam";
import { playAudio } from "@mastra/node-audio";

const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions: "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new SarvamVoice(),
});

const { text } = await voiceAgent.generate("What color is the sky?");

// Convert the text response to an audio stream
const audioStream = await voiceAgent.voice.speak(text, {
  speaker: "default", // Optional: specify a speaker
});

playAudio(audioStream);
```

Visit the [Sarvam Voice Reference](https://mastra.ai/reference/voice/sarvam) for more information on the Sarvam voice provider.

**Murf**:

```typescript
import { Agent } from "@mastra/core/agent";
import { MurfVoice } from "@mastra/voice-murf";
import { playAudio } from "@mastra/node-audio";

const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions: "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new MurfVoice(),
});

const { text } = await voiceAgent.generate("What color is the sky?");

// Convert the text response to an audio stream
const audioStream = await voiceAgent.voice.speak(text, {
  speaker: "default", // Optional: specify a speaker
});

playAudio(audioStream);
```

Visit the [Murf Voice Reference](https://mastra.ai/reference/voice/murf) for more information on the Murf voice provider.

### Speech to Text (STT)

Transcribe spoken content using various providers like OpenAI, ElevenLabs, and more. For detailed configuration options and advanced features, check out [Speech to Text](https://mastra.ai/docs/voice/speech-to-text).

You can download a sample audio file from [here](https://github.com/mastra-ai/realtime-voice-demo/raw/refs/heads/main/how_can_i_help_you.mp3).
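If you'd rather fetch the sample file in code, here is a minimal sketch, assuming Node 18+ (where `fetch` is available globally). It saves the clip under the same file name the examples below read with `createReadStream`:

```typescript
import { writeFile } from "fs/promises";

// Sample clip linked above
const url =
  "https://github.com/mastra-ai/realtime-voice-demo/raw/refs/heads/main/how_can_i_help_you.mp3";

// Download the audio and save it locally so the STT examples can read it
const response = await fetch(url);
await writeFile("./how_can_i_help_you.mp3", Buffer.from(await response.arrayBuffer()));
```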
**OpenAI**:

```typescript
import { Agent } from "@mastra/core/agent";
import { OpenAIVoice } from "@mastra/voice-openai";
import { createReadStream } from "fs";

const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions: "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new OpenAIVoice(),
});

// Read the downloaded audio file from disk
const audioStream = createReadStream("./how_can_i_help_you.mp3");

// Convert audio to text
const transcript = await voiceAgent.voice.listen(audioStream);
console.log(`User said: ${transcript}`);

// Generate a response based on the transcript
const { text } = await voiceAgent.generate(transcript);
```

Visit the [OpenAI Voice Reference](https://mastra.ai/reference/voice/openai) for more information on the OpenAI voice provider.

**Azure**:

```typescript
import { Agent } from "@mastra/core/agent";
import { AzureVoice } from "@mastra/voice-azure";
import { createReadStream } from "fs";

const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions: "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new AzureVoice(),
});

// Read the downloaded audio file from disk
const audioStream = createReadStream("./how_can_i_help_you.mp3");

// Convert audio to text
const transcript = await voiceAgent.voice.listen(audioStream);
console.log(`User said: ${transcript}`);

// Generate a response based on the transcript
const { text } = await voiceAgent.generate(transcript);
```

Visit the [Azure Voice Reference](https://mastra.ai/reference/voice/azure) for more information on the Azure voice provider.

**ElevenLabs**:

```typescript
import { Agent } from "@mastra/core/agent";
import { ElevenLabsVoice } from "@mastra/voice-elevenlabs";
import { createReadStream } from "fs";

const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions: "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new ElevenLabsVoice(),
});

// Read the downloaded audio file from disk
const audioStream = createReadStream("./how_can_i_help_you.mp3");

// Convert audio to text
const transcript = await voiceAgent.voice.listen(audioStream);
console.log(`User said: ${transcript}`);

// Generate a response based on the transcript
const { text } = await voiceAgent.generate(transcript);
```

Visit the [ElevenLabs Voice Reference](https://mastra.ai/reference/voice/elevenlabs) for more information on the ElevenLabs voice provider.

**Google**:

```typescript
import { Agent } from "@mastra/core/agent";
import { GoogleVoice } from "@mastra/voice-google";
import { createReadStream } from "fs";

const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions: "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new GoogleVoice(),
});

// Read the downloaded audio file from disk
const audioStream = createReadStream("./how_can_i_help_you.mp3");

// Convert audio to text
const transcript = await voiceAgent.voice.listen(audioStream);
console.log(`User said: ${transcript}`);

// Generate a response based on the transcript
const { text } = await voiceAgent.generate(transcript);
```

Visit the [Google Voice Reference](https://mastra.ai/reference/voice/google) for more information on the Google voice provider.
**Cloudflare**:

```typescript
import { Agent } from "@mastra/core/agent";
import { CloudflareVoice } from "@mastra/voice-cloudflare";
import { createReadStream } from "fs";

const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions: "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new CloudflareVoice(),
});

// Read the downloaded audio file from disk
const audioStream = createReadStream("./how_can_i_help_you.mp3");

// Convert audio to text
const transcript = await voiceAgent.voice.listen(audioStream);
console.log(`User said: ${transcript}`);

// Generate a response based on the transcript
const { text } = await voiceAgent.generate(transcript);
```

Visit the [Cloudflare Voice Reference](https://mastra.ai/reference/voice/cloudflare) for more information on the Cloudflare voice provider.

**Deepgram**:

```typescript
import { Agent } from "@mastra/core/agent";
import { DeepgramVoice } from "@mastra/voice-deepgram";
import { createReadStream } from "fs";

const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions: "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new DeepgramVoice(),
});

// Read the downloaded audio file from disk
const audioStream = createReadStream("./how_can_i_help_you.mp3");

// Convert audio to text
const transcript = await voiceAgent.voice.listen(audioStream);
console.log(`User said: ${transcript}`);

// Generate a response based on the transcript
const { text } = await voiceAgent.generate(transcript);
```

Visit the [Deepgram Voice Reference](https://mastra.ai/reference/voice/deepgram) for more information on the Deepgram voice provider.

**Sarvam**:

```typescript
import { Agent } from "@mastra/core/agent";
import { SarvamVoice } from "@mastra/voice-sarvam";
import { createReadStream } from "fs";

const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions: "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new SarvamVoice(),
});

// Read the downloaded audio file from disk
const audioStream = createReadStream("./how_can_i_help_you.mp3");

// Convert audio to text
const transcript = await voiceAgent.voice.listen(audioStream);
console.log(`User said: ${transcript}`);

// Generate a response based on the transcript
const { text } = await voiceAgent.generate(transcript);
```

Visit the [Sarvam Voice Reference](https://mastra.ai/reference/voice/sarvam) for more information on the Sarvam voice provider.

### Speech to Speech (STS)

Create conversational experiences with speech-to-speech capabilities. The unified API enables real-time voice interactions between users and AI agents. For detailed configuration options and advanced features, check out [Speech to Speech](https://mastra.ai/docs/voice/speech-to-speech).
**OpenAI**:

```typescript
import { Agent } from "@mastra/core/agent";
import { playAudio, getMicrophoneStream } from "@mastra/node-audio";
import { OpenAIRealtimeVoice } from "@mastra/voice-openai-realtime";

const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions: "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new OpenAIRealtimeVoice(),
});

// Connect before using speak/send
await voiceAgent.voice.connect();

// Listen for agent audio responses
voiceAgent.voice.on("speaker", ({ audio }) => {
  playAudio(audio);
});

// Initiate the conversation
await voiceAgent.voice.speak("How can I help you today?");

// Send continuous audio from the microphone
const micStream = getMicrophoneStream();
await voiceAgent.voice.send(micStream);
```

Visit the [OpenAI Realtime Voice Reference](https://mastra.ai/reference/voice/openai-realtime) for more information on the OpenAI realtime voice provider.

**Google**:

```typescript
import { Agent } from "@mastra/core/agent";
import { playAudio, getMicrophoneStream } from "@mastra/node-audio";
import { GeminiLiveVoice } from "@mastra/voice-google-gemini-live";

const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions: "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new GeminiLiveVoice({
    // Live API mode
    apiKey: process.env.GOOGLE_API_KEY,
    model: "gemini-2.0-flash-exp",
    speaker: "Puck",
    debug: true,
    // Vertex AI alternative:
    // vertexAI: true,
    // project: 'your-gcp-project',
    // location: 'us-central1',
    // serviceAccountKeyFile: '/path/to/service-account.json',
  }),
});

// Connect before using speak/send
await voiceAgent.voice.connect();

// Listen for agent audio responses
voiceAgent.voice.on("speaker", ({ audio }) => {
  playAudio(audio);
});

// Listen for text responses and transcriptions
voiceAgent.voice.on("writing", ({ text, role }) => {
  console.log(`${role}: ${text}`);
});

// Initiate the conversation
await voiceAgent.voice.speak("How can I help you today?");

// Send continuous audio from the microphone
const micStream = getMicrophoneStream();
await voiceAgent.voice.send(micStream);
```

Visit the [Google Gemini Live Reference](https://mastra.ai/reference/voice/google-gemini-live) for more information on the Google Gemini Live voice provider.

## Voice Configuration

Each voice provider can be configured with different models and options. Below are the detailed configuration options for all supported providers:

**OpenAI**:

```typescript
// OpenAI Voice Configuration
const voice = new OpenAIVoice({
  speechModel: {
    name: "tts-1", // Example model name
    apiKey: process.env.OPENAI_API_KEY,
    language: "en-US", // Language code
    voiceType: "neural", // Type of voice model
  },
  listeningModel: {
    name: "whisper-1", // Example model name
    apiKey: process.env.OPENAI_API_KEY,
    language: "en-US", // Language code
    format: "wav", // Audio format
  },
  speaker: "alloy", // Example speaker name
});
```

Visit the [OpenAI Voice Reference](https://mastra.ai/reference/voice/openai) for more information on the OpenAI voice provider.
**Azure**:

```typescript
// Azure Voice Configuration
const voice = new AzureVoice({
  speechModel: {
    name: "en-US-JennyNeural", // Example model name
    apiKey: process.env.AZURE_SPEECH_KEY,
    region: process.env.AZURE_SPEECH_REGION,
    language: "en-US", // Language code
    style: "cheerful", // Voice style
    pitch: "+0Hz", // Pitch adjustment
    rate: "1.0", // Speech rate
  },
  listeningModel: {
    name: "en-US", // Example model name
    apiKey: process.env.AZURE_SPEECH_KEY,
    region: process.env.AZURE_SPEECH_REGION,
    format: "simple", // Output format
  },
});
```

Visit the [Azure Voice Reference](https://mastra.ai/reference/voice/azure) for more information on the Azure voice provider.

**ElevenLabs**:

```typescript
// ElevenLabs Voice Configuration
const voice = new ElevenLabsVoice({
  speechModel: {
    voiceId: "your-voice-id", // Example voice ID
    model: "eleven_multilingual_v2", // Example model name
    apiKey: process.env.ELEVENLABS_API_KEY,
    language: "en", // Language code
    emotion: "neutral", // Emotion setting
  },
  // ElevenLabs may not have a separate listening model
});
```

Visit the [ElevenLabs Voice Reference](https://mastra.ai/reference/voice/elevenlabs) for more information on the ElevenLabs voice provider.

**PlayAI**:

```typescript
// PlayAI Voice Configuration
const voice = new PlayAIVoice({
  speechModel: {
    name: "playai-voice", // Example model name
    speaker: "emma", // Example speaker name
    apiKey: process.env.PLAYAI_API_KEY,
    language: "en-US", // Language code
    speed: 1.0, // Speech speed
  },
  // PlayAI may not have a separate listening model
});
```

Visit the [PlayAI Voice Reference](https://mastra.ai/reference/voice/playai) for more information on the PlayAI voice provider.

**Google**:

```typescript
// Google Voice Configuration
const voice = new GoogleVoice({
  speechModel: {
    name: "en-US-Studio-O", // Example model name
    apiKey: process.env.GOOGLE_API_KEY,
    languageCode: "en-US", // Language code
    gender: "FEMALE", // Voice gender
    speakingRate: 1.0, // Speaking rate
  },
  listeningModel: {
    name: "en-US", // Example model name
    sampleRateHertz: 16000, // Sample rate
  },
});
```

Visit the [Google Voice Reference](https://mastra.ai/reference/voice/google) for more information on the Google voice provider.

**Cloudflare**:

```typescript
// Cloudflare Voice Configuration
const voice = new CloudflareVoice({
  speechModel: {
    name: "cloudflare-voice", // Example model name
    accountId: process.env.CLOUDFLARE_ACCOUNT_ID,
    apiToken: process.env.CLOUDFLARE_API_TOKEN,
    language: "en-US", // Language code
    format: "mp3", // Audio format
  },
  // Cloudflare may not have a separate listening model
});
```

Visit the [Cloudflare Voice Reference](https://mastra.ai/reference/voice/cloudflare) for more information on the Cloudflare voice provider.

**Deepgram**:

```typescript
// Deepgram Voice Configuration
const voice = new DeepgramVoice({
  speechModel: {
    name: "nova-2", // Example model name
    speaker: "aura-english-us", // Example speaker name
    apiKey: process.env.DEEPGRAM_API_KEY,
    language: "en-US", // Language code
    tone: "formal", // Tone setting
  },
  listeningModel: {
    name: "nova-2", // Example model name
    format: "flac", // Audio format
  },
});
```

Visit the [Deepgram Voice Reference](https://mastra.ai/reference/voice/deepgram) for more information on the Deepgram voice provider.
**Speechify**:

```typescript
// Speechify Voice Configuration
const voice = new SpeechifyVoice({
  speechModel: {
    name: "speechify-voice", // Example model name
    speaker: "matthew", // Example speaker name
    apiKey: process.env.SPEECHIFY_API_KEY,
    language: "en-US", // Language code
    speed: 1.0, // Speech speed
  },
  // Speechify may not have a separate listening model
});
```

Visit the [Speechify Voice Reference](https://mastra.ai/reference/voice/speechify) for more information on the Speechify voice provider.

**Sarvam**:

```typescript
// Sarvam Voice Configuration
const voice = new SarvamVoice({
  speechModel: {
    name: "sarvam-voice", // Example model name
    apiKey: process.env.SARVAM_API_KEY,
    language: "en-IN", // Language code
    style: "conversational", // Style setting
  },
  // Sarvam may not have a separate listening model
});
```

Visit the [Sarvam Voice Reference](https://mastra.ai/reference/voice/sarvam) for more information on the Sarvam voice provider.

**Murf**:

```typescript
// Murf Voice Configuration
const voice = new MurfVoice({
  speechModel: {
    name: "murf-voice", // Example model name
    apiKey: process.env.MURF_API_KEY,
    language: "en-US", // Language code
    emotion: "happy", // Emotion setting
  },
  // Murf may not have a separate listening model
});
```

Visit the [Murf Voice Reference](https://mastra.ai/reference/voice/murf) for more information on the Murf voice provider.

**OpenAI Realtime**:

```typescript
// OpenAI Realtime Voice Configuration
const voice = new OpenAIRealtimeVoice({
  speechModel: {
    name: "gpt-4o-realtime-preview", // Example model name
    apiKey: process.env.OPENAI_API_KEY,
    language: "en-US", // Language code
  },
  listeningModel: {
    name: "whisper-1", // Example model name
    apiKey: process.env.OPENAI_API_KEY,
    format: "ogg", // Audio format
  },
  speaker: "alloy", // Example speaker name
});
```

For more information on the OpenAI Realtime voice provider, refer to the [OpenAI Realtime Voice Reference](https://mastra.ai/reference/voice/openai-realtime).

**Google Gemini Live**:

```typescript
// Google Gemini Live Voice Configuration
const voice = new GeminiLiveVoice({
  speechModel: {
    name: "gemini-2.0-flash-exp", // Example model name
    apiKey: process.env.GOOGLE_API_KEY,
  },
  speaker: "Puck", // Example speaker name
  // Google Gemini Live is a realtime bidirectional API without separate speech and listening models
});
```

Visit the [Google Gemini Live Reference](https://mastra.ai/reference/voice/google-gemini-live) for more information on the Google Gemini Live voice provider.

**AI SDK**:

```typescript
// AI SDK Voice Configuration
import { Agent } from "@mastra/core/agent";
import { CompositeVoice } from "@mastra/core/voice";
import { openai } from "@ai-sdk/openai";
import { elevenlabs } from "@ai-sdk/elevenlabs";

// Use AI SDK models directly - no need to install separate packages
const voice = new CompositeVoice({
  input: openai.transcription('whisper-1'), // AI SDK transcription
  output: elevenlabs.speech('eleven_turbo_v2'), // AI SDK speech
});

// Works seamlessly with your agent
const voiceAgent = new Agent({
  id: "aisdk-voice-agent",
  name: "AI SDK Voice Agent",
  instructions: "You are a helpful assistant with voice capabilities.",
  model: "openai/gpt-5.1",
  voice,
});
```

### Using Multiple Voice Providers

This example demonstrates how to create and use two different voice providers in Mastra: OpenAI for speech-to-text (STT) and PlayAI for text-to-speech (TTS).

Start by creating instances of the voice providers with any necessary configuration.
```typescript
import { OpenAIVoice } from "@mastra/voice-openai";
import { PlayAIVoice } from "@mastra/voice-playai";
import { CompositeVoice } from "@mastra/core/voice";
import { playAudio, getMicrophoneStream } from "@mastra/node-audio";

// Initialize OpenAI voice for STT
const input = new OpenAIVoice({
  listeningModel: {
    name: "whisper-1",
    apiKey: process.env.OPENAI_API_KEY,
  },
});

// Initialize PlayAI voice for TTS
const output = new PlayAIVoice({
  speechModel: {
    name: "playai-voice",
    apiKey: process.env.PLAYAI_API_KEY,
  },
});

// Combine the providers using CompositeVoice
const voice = new CompositeVoice({
  input,
  output,
});

// Implement voice interactions using the combined voice provider
const audioStream = getMicrophoneStream(); // Assume this function gets audio input
const transcript = await voice.listen(audioStream);

// Log the transcribed text
console.log("Transcribed text:", transcript);

// Convert text to speech
const responseAudio = await voice.speak(`You said: ${transcript}`, {
  speaker: "default", // Optional: specify a speaker
  responseFormat: "wav", // Optional: specify a response format
});

// Play the audio response
playAudio(responseAudio);
```

### Using AI SDK Model Providers

You can also use AI SDK models directly with `CompositeVoice`:

```typescript
import { CompositeVoice } from "@mastra/core/voice";
import { openai } from "@ai-sdk/openai";
import { elevenlabs } from "@ai-sdk/elevenlabs";
import { playAudio, getMicrophoneStream } from "@mastra/node-audio";

// Use AI SDK models directly - no provider setup needed
const voice = new CompositeVoice({
  input: openai.transcription('whisper-1'), // AI SDK transcription
  output: elevenlabs.speech('eleven_turbo_v2'), // AI SDK speech
});

// Works the same way as Mastra providers
const audioStream = getMicrophoneStream();
const transcript = await voice.listen(audioStream);
console.log("Transcribed text:", transcript);

// Convert text to speech
const responseAudio = await voice.speak(`You said: ${transcript}`, {
  speaker: "Rachel", // ElevenLabs voice
});

playAudio(responseAudio);
```

You can also mix AI SDK models with Mastra providers:

```typescript
import { CompositeVoice } from "@mastra/core/voice";
import { PlayAIVoice } from "@mastra/voice-playai";
import { groq } from "@ai-sdk/groq";

const voice = new CompositeVoice({
  input: groq.transcription('whisper-large-v3'), // AI SDK for STT
  output: new PlayAIVoice(), // Mastra provider for TTS
});
```

For more information on CompositeVoice, refer to the [CompositeVoice Reference](https://mastra.ai/reference/voice/composite-voice).

## More Resources

- [CompositeVoice](https://mastra.ai/reference/voice/composite-voice)
- [MastraVoice](https://mastra.ai/reference/voice/mastra-voice)
- [OpenAI Voice](https://mastra.ai/reference/voice/openai)
- [OpenAI Realtime Voice](https://mastra.ai/reference/voice/openai-realtime)
- [Azure Voice](https://mastra.ai/reference/voice/azure)
- [Google Voice](https://mastra.ai/reference/voice/google)
- [Google Gemini Live Voice](https://mastra.ai/reference/voice/google-gemini-live)
- [Deepgram Voice](https://mastra.ai/reference/voice/deepgram)
- [PlayAI Voice](https://mastra.ai/reference/voice/playai)
- [Voice Examples](https://github.com/mastra-ai/voice-examples)