
CompositeVoice

The CompositeVoice class allows you to combine different voice providers for text-to-speech and speech-to-text operations. This is particularly useful when you want to use the best provider for each operation - for example, using OpenAI for speech-to-text and PlayAI for text-to-speech.

CompositeVoice supports both Mastra voice providers and AI SDK model providers.

Constructor Parameters

config: object
Configuration object for the composite voice service.

config.input?: MastraVoice | TranscriptionModel
Voice provider or AI SDK transcription model to use for speech-to-text operations. AI SDK models are automatically wrapped.

config.output?: MastraVoice | SpeechModel
Voice provider or AI SDK speech model to use for text-to-speech operations. AI SDK models are automatically wrapped.

config.realtime?: MastraVoice
Voice provider to use for real-time speech-to-speech operations.

Methods

speak()

Converts text to speech using the configured speaking provider.

input: string | NodeJS.ReadableStream
Text to convert to speech.

options?: object
Provider-specific options passed to the speaking provider.

Notes:

  • If no speaking provider is configured, this method will throw an error
  • Options are passed through to the configured speaking provider
  • Returns a stream of audio data
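
Since speak() returns a stream of audio data, a common next step is buffering it, e.g. to write the audio to a file. The helper below is a sketch and not part of the Mastra API; it assumes the returned stream is a standard NodeJS.ReadableStream of Buffer or string chunks.

```typescript
import { Readable } from "node:stream";

// Collect a NodeJS.ReadableStream of audio chunks into a single Buffer,
// e.g. to pass the result of voice.speak() to fs.writeFile().
async function streamToBuffer(stream: NodeJS.ReadableStream): Promise<Buffer> {
  const chunks: Buffer[] = [];
  for await (const chunk of stream) {
    // Chunks may arrive as Buffers or strings depending on the provider.
    chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk));
  }
  return Buffer.concat(chunks);
}
```

Usage would look like `const audio = await streamToBuffer(await voice.speak("Hello"))`.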

listen()

Converts speech to text using the configured listening provider.

audioStream: NodeJS.ReadableStream
Audio stream to convert to text.

options?: object
Provider-specific options passed to the listening provider.

Notes:

  • If no listening provider is configured, this method will throw an error
  • Options are passed through to the configured listening provider
  • Returns either a string or a stream of transcribed text, depending on the provider
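
Because the return type depends on the listening provider, callers often want to normalize both cases to a plain string. The helper below is a sketch, not part of the Mastra API; it assumes a stream result yields Buffer or string chunks of text.

```typescript
import { Readable } from "node:stream";

// Normalize the result of voice.listen(), which may be either a plain
// string or a stream of transcribed text, into a single string.
async function toTranscript(
  result: string | NodeJS.ReadableStream,
): Promise<string> {
  if (typeof result === "string") return result;
  let text = "";
  for await (const chunk of result) {
    text += chunk.toString();
  }
  return text;
}
```

With this in place, `const text = await toTranscript(await voice.listen(audioStream))` works regardless of which provider is configured.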

getSpeakers()

Returns a list of available voices from the speaking provider, where each voice object contains:

voiceId: string
Unique identifier for the voice.

key?: value
Additional voice properties that vary by provider (e.g., name, language).

Notes:

  • Returns voices from the speaking provider only
  • If no speaking provider is configured, returns an empty array
  • Each voice object will have at least a voiceId property
  • Additional voice properties depend on the speaking provider
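
One way to work with this shape is a small selection helper. The sketch below is illustrative only: the `language` field is an assumed provider-specific property (real keys vary by provider), and only `voiceId` is guaranteed.

```typescript
// Only voiceId is guaranteed; everything else is provider-specific.
type Speaker = { voiceId: string; [key: string]: unknown };

// Pick the first voice matching a provider-specific "language" property
// (a hypothetical key for illustration), falling back to the first voice.
// Returns undefined when the list is empty, e.g. when no speaking
// provider is configured.
function pickVoice(speakers: Speaker[], language?: string): string | undefined {
  if (speakers.length === 0) return undefined;
  const match = language
    ? speakers.find((s) => s.language === language)
    : undefined;
  return (match ?? speakers[0]).voiceId;
}
```

The chosen id could then be passed through provider-specific speak() options, if the configured provider supports selecting a voice that way.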

Usage Examples

Using Mastra Voice Providers

import { CompositeVoice } from "@mastra/core/voice";
import { OpenAIVoice } from "@mastra/voice-openai";
import { PlayAIVoice } from "@mastra/voice-playai";

// Create voice providers
const openai = new OpenAIVoice();
const playai = new PlayAIVoice();

// Use OpenAI for listening (speech-to-text) and PlayAI for speaking (text-to-speech)
const voice = new CompositeVoice({
  input: openai,
  output: playai,
});

// Convert speech to text using OpenAI
const text = await voice.listen(audioStream);

// Convert text to speech using PlayAI
const audio = await voice.speak("Hello, world!");

Using AI SDK Model Providers

You can pass AI SDK transcription and speech models directly to CompositeVoice:

import { CompositeVoice } from "@mastra/core/voice";
import { openai } from "@ai-sdk/openai";
import { elevenlabs } from "@ai-sdk/elevenlabs";

// Use AI SDK models directly - they will be auto-wrapped
const voice = new CompositeVoice({
  input: openai.transcription('whisper-1'), // AI SDK transcription
  output: elevenlabs.speech('eleven_turbo_v2'), // AI SDK speech
});

// Works the same way as with Mastra providers
const text = await voice.listen(audioStream);
const audio = await voice.speak("Hello from AI SDK!");

Mix and Match

You can combine Mastra providers with AI SDK models:

import { CompositeVoice } from "@mastra/core/voice";
import { PlayAIVoice } from "@mastra/voice-playai";
import { groq } from "@ai-sdk/groq";

const voice = new CompositeVoice({
  input: groq.transcription('whisper-large-v3'), // AI SDK for STT
  output: new PlayAIVoice(), // Mastra for TTS
});