MastraVoice

The MastraVoice class is an abstract base class that defines the core interface for voice services in Mastra. All voice provider implementations (such as OpenAI, Deepgram, PlayAI, and Speechify) extend this class and supply their provider-specific functionality.

Usage Example

import { MastraVoice } from "@mastra/core/voice";

// Local alias for a model configuration, matching the BuiltInModelConfig shape described below
type ModelConfig = { name: string; apiKey?: string };

// Create a voice provider implementation
class MyVoiceProvider extends MastraVoice {
  constructor(config: { speechModel?: ModelConfig; listeningModel?: ModelConfig; speaker?: string }) {
    super({
      speechModel: config.speechModel,
      listeningModel: config.listeningModel,
      speaker: config.speaker
    });
  }

  // Implement required abstract methods
  async speak(input: string | NodeJS.ReadableStream, options?: { speaker?: string }): Promise<NodeJS.ReadableStream> {
    // Implement text-to-speech conversion here and return the audio stream
    throw new Error("speak() is not implemented");
  }

  async listen(audioStream: NodeJS.ReadableStream, options?: any): Promise<string | NodeJS.ReadableStream> {
    // Implement speech-to-text conversion here and return the transcription
    throw new Error("listen() is not implemented");
  }

  async getSpeakers(): Promise<Array<{ voiceId: string }>> {
    // Return the list of available voices; an empty list is a valid placeholder
    return [];
  }
}

Constructor Parameters

config:

object
Configuration object for the voice service

config.speechModel?:

BuiltInModelConfig
Configuration for the text-to-speech model

config.listeningModel?:

BuiltInModelConfig
Configuration for the speech-to-text model

config.speaker?:

string
Default speaker/voice ID to use

BuiltInModelConfig

name:

string
Name of the model to use

apiKey?:

string
API key for the model service
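
As a rough illustration of the shapes described above, a configuration object might be assembled as follows. This is only a sketch: the `VoiceConfig` type name, the `buildVoiceConfig` helper, the model names, and the speaker ID are placeholders invented for the example, not values defined by Mastra or by any provider.

```typescript
// Local stand-ins mirroring the documented shapes (type names are assumptions for this sketch).
interface BuiltInModelConfig {
  name: string; // name of the model to use
  apiKey?: string; // API key for the model service
}

interface VoiceConfig {
  speechModel?: BuiltInModelConfig;
  listeningModel?: BuiltInModelConfig;
  speaker?: string;
}

// Hypothetical helper: builds a config that shares one API key between both models.
function buildVoiceConfig(apiKey: string): VoiceConfig {
  return {
    speechModel: { name: "example-tts-model", apiKey },
    listeningModel: { name: "example-stt-model", apiKey },
    speaker: "example-voice",
  };
}

const config = buildVoiceConfig("my-api-key");
console.log(config.speechModel?.name, config.speaker);
```

All three top-level fields are optional, so a text-to-speech-only setup could presumably omit `listeningModel` altogether.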

Abstract Methods

These methods must be implemented by any class extending MastraVoice.

speak()

Converts text to speech using the configured speech model.

abstract speak(
  input: string | NodeJS.ReadableStream,
  options?: {
    speaker?: string;
    [key: string]: any;
  }
): Promise<NodeJS.ReadableStream>

Purpose

  • Takes text input and converts it to speech using the provider’s text-to-speech service
  • Supports both string and stream input for flexibility
  • Allows overriding the default speaker/voice through options
  • Returns a stream of audio data that can be played or saved
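
The behaviour listed above can be sketched without a real provider. The example below is a minimal stand-in, assuming nothing about Mastra's internals: `EchoVoice` and `streamToString` are hypothetical names, and the "audio" is just the input text tagged with the chosen voice. It shows one way to accept both input forms and to let a per-call speaker override the default.

```typescript
import { Readable } from "node:stream";

// Hypothetical helper: drain a readable stream into a single UTF-8 string.
async function streamToString(stream: NodeJS.ReadableStream): Promise<string> {
  const chunks: Buffer[] = [];
  for await (const chunk of stream) {
    chunks.push(typeof chunk === "string" ? Buffer.from(chunk, "utf8") : chunk);
  }
  return Buffer.concat(chunks).toString("utf8");
}

// Stand-in provider (not a real Mastra class): "synthesizes" audio by echoing text bytes.
class EchoVoice {
  private readonly speaker: string;

  constructor(speaker: string = "default-voice") {
    this.speaker = speaker;
  }

  async speak(
    input: string | NodeJS.ReadableStream,
    options?: { speaker?: string },
  ): Promise<NodeJS.ReadableStream> {
    // Accept both input forms by normalizing to a string first.
    const text = typeof input === "string" ? input : await streamToString(input);
    // A per-call speaker takes precedence over the default.
    const voice = options?.speaker ?? this.speaker;
    // A real provider would call its text-to-speech API here.
    return Readable.from([Buffer.from(`[${voice}] ${text}`, "utf8")]);
  }
}

// Usage: stream input plus a speaker override.
async function demo(): Promise<string> {
  const voice = new EchoVoice("narrator");
  const audio = await voice.speak(Readable.from(["Hello, ", "world"]), { speaker: "guest" });
  return streamToString(audio);
}

demo().then((text) => console.log(text));
```

Buffering a stream input into a string is the simplest approach, though a provider with a streaming API might forward chunks as they arrive instead.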

listen()

Converts speech to text using the configured listening model.

abstract listen(
  audioStream: NodeJS.ReadableStream,
  options?: {
    [key: string]: any;
  }
): Promise<string | NodeJS.ReadableStream>

Purpose

  • Takes an audio stream and converts it to text using the provider’s speech-to-text service
  • Supports provider-specific options for transcription configuration
  • Can return either a complete text transcription or a stream of transcribed text
  • Not all providers support speech-to-text; PlayAI and Speechify, for example, do not
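
The points above can be illustrated with two stand-in classes. Neither is a real Mastra provider: `ByteCountingVoice` fakes a transcription by counting the audio bytes it receives, `language` is a hypothetical provider-specific option, and `SpeakOnlyVoice` shows one plausible way for a text-to-speech-only provider to satisfy the interface by rejecting the call.

```typescript
import { Readable } from "node:stream";

// Stand-in speech-to-text provider: reports how much audio it received.
class ByteCountingVoice {
  async listen(
    audioStream: NodeJS.ReadableStream,
    options?: { [key: string]: any },
  ): Promise<string | NodeJS.ReadableStream> {
    let bytes = 0;
    for await (const chunk of audioStream) {
      bytes += typeof chunk === "string" ? Buffer.byteLength(chunk) : chunk.length;
    }
    // `language` is an invented option; real providers define their own.
    const language = typeof options?.language === "string" ? options.language : "en";
    return `received ${bytes} bytes of ${language} audio`;
  }
}

// Stand-in text-to-speech-only provider: listen() exists but always rejects.
class SpeakOnlyVoice {
  async listen(_audioStream: NodeJS.ReadableStream): Promise<string> {
    throw new Error("This provider does not support speech-to-text");
  }
}

// Usage: one successful call and one rejected call.
async function demo(): Promise<string[]> {
  const audio = Readable.from([Buffer.from([1, 2, 3]), Buffer.from([4, 5])]);
  const text = await new ByteCountingVoice().listen(audio, { language: "fr" });
  const failure = await new SpeakOnlyVoice()
    .listen(Readable.from([]))
    .catch((err: Error) => err.message);
  return [String(text), failure];
}

demo().then((results) => console.log(results));
```

Because the return type is a union, callers that need plain text should check whether they received a string or a stream before using the result.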

getSpeakers()

Returns a list of available voices supported by the provider.

abstract getSpeakers(): Promise<Array<{ voiceId: string; [key: string]: any }>>

Purpose

  • Retrieves the list of available voices/speakers from the provider
  • Each voice must have at least a voiceId property
  • Providers can include additional metadata about each voice
  • Used to discover available voices for text-to-speech conversion
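
A short sketch of voice discovery, under the assumption of a hard-coded catalog: `CatalogVoice`, `pickSpeaker`, and the voice entries are invented for illustration, and the `language` and `gender` fields stand in for whatever extra metadata a provider chooses to return.

```typescript
// Each entry carries at least a voiceId; any other fields are provider-specific.
type Speaker = { voiceId: string; [key: string]: any };

// Stand-in provider (not a real Mastra class) with an invented voice catalog.
class CatalogVoice {
  async getSpeakers(): Promise<Array<Speaker>> {
    return [
      { voiceId: "voice-a", language: "en", gender: "female" },
      { voiceId: "voice-b", language: "de" },
    ];
  }
}

// Hypothetical helper: pick the first voice for a language, else use the fallback.
async function pickSpeaker(
  provider: CatalogVoice,
  language: string,
  fallback: string,
): Promise<string> {
  const speakers = await provider.getSpeakers();
  const match = speakers.find((speaker) => speaker.language === language);
  return match?.voiceId ?? fallback;
}

pickSpeaker(new CatalogVoice(), "de", "voice-a").then((id) => console.log(id));
```

The returned voiceId is the kind of value that could then be passed as the `speaker` option when calling speak().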

Protected Properties

speechModel?:

BuiltInModelConfig | undefined
Configuration for the text-to-speech model

listeningModel?:

BuiltInModelConfig | undefined
Configuration for the speech-to-text model

speaker?:

string | undefined
Default speaker/voice ID
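
Because these properties are protected, subclasses can read them while outside code cannot. The sketch below demonstrates this with a local stand-in: `VoiceBase` only mirrors the documented property names and is not the real MastraVoice, and `DescribingVoice` and its `describe` method are hypothetical.

```typescript
interface BuiltInModelConfig {
  name: string;
  apiKey?: string;
}

interface VoiceConfig {
  speechModel?: BuiltInModelConfig;
  listeningModel?: BuiltInModelConfig;
  speaker?: string;
}

// Local stand-in for the documented base class shape; not the real MastraVoice.
abstract class VoiceBase {
  protected speechModel?: BuiltInModelConfig;
  protected listeningModel?: BuiltInModelConfig;
  protected speaker?: string;

  constructor(config: VoiceConfig = {}) {
    this.speechModel = config.speechModel;
    this.listeningModel = config.listeningModel;
    this.speaker = config.speaker;
  }
}

class DescribingVoice extends VoiceBase {
  // Subclass code may read the protected fields; each can be undefined.
  describe(): string {
    const model = this.speechModel?.name ?? "no speech model";
    const voice = this.speaker ?? "no default speaker";
    return `${model} / ${voice}`;
  }
}

const described = new DescribingVoice({ speechModel: { name: "example-tts-model" }, speaker: "narrator" });
console.log(described.describe());
```

Since every property may be undefined, subclass code should handle the missing case explicitly, as the `??` fallbacks do here.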

Telemetry Support

MastraVoice includes built-in telemetry support through the traced method, which wraps method calls with performance tracking and error monitoring.
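
To make the idea of wrapping calls concrete, here is a generic sketch of a timing and error-recording wrapper. It is not Mastra's `traced` method and makes no claim about its signature or behaviour: `withTiming`, `CallRecord`, and the `records` array are all invented for this illustration.

```typescript
// One record per wrapped call; the shape is invented for this sketch.
interface CallRecord {
  name: string;
  durationMs: number;
  error?: string;
}

const records: CallRecord[] = [];

// Hypothetical wrapper, NOT Mastra's `traced`: times an async function and records failures.
function withTiming<Args extends unknown[], Result>(
  name: string,
  fn: (...args: Args) => Promise<Result>,
): (...args: Args) => Promise<Result> {
  return async (...args: Args): Promise<Result> => {
    const start = Date.now();
    try {
      const result = await fn(...args);
      records.push({ name, durationMs: Date.now() - start });
      return result;
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      records.push({ name, durationMs: Date.now() - start, error: message });
      throw err; // rethrow so callers still see the failure
    }
  };
}

// Two wrapped functions: one that succeeds and one that always fails.
const okCall = withTiming("voice.speak", async (text: string) => text.toUpperCase());
const failingCall = withTiming("voice.listen", async (): Promise<string> => {
  throw new Error("unsupported");
});
```

Rethrowing after recording keeps the wrapper transparent: callers observe the same result or error they would have seen without it.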

Notes

  • MastraVoice is an abstract class and cannot be instantiated directly
  • Subclasses must provide concrete implementations of all abstract methods
  • The class provides a consistent interface across different voice service providers
  • Configuration and authentication details are provider-specific
  • Telemetry is automatically handled for all method calls