
ElevenLabs

The ElevenLabs voice implementation in Mastra provides high-quality text-to-speech (TTS) and speech-to-text (STT) capabilities using the ElevenLabs API.

Usage Example

import { ElevenLabsVoice } from "@mastra/voice-elevenlabs";

// Initialize with default configuration (uses ELEVENLABS_API_KEY environment variable)
const voice = new ElevenLabsVoice();

// Initialize with custom configuration
// (named customVoice so it does not redeclare the `voice` constant above)
const customVoice = new ElevenLabsVoice({
  speechModel: {
    name: "eleven_multilingual_v2",
    apiKey: "your-api-key",
  },
  speaker: "custom-speaker-id",
});

// Text-to-Speech
const audioStream = await voice.speak("Hello, world!");

// Get available speakers
const speakers = await voice.getSpeakers();

Constructor Parameters

speechModel?: ElevenLabsVoiceConfig = { name: 'eleven_multilingual_v2' }
Configuration for text-to-speech functionality.

speaker?: string = '9BWtsMINqrJLrRacOk9x' (Aria voice)
ID of the speaker to use for text-to-speech.

ElevenLabsVoiceConfig

name?: ElevenLabsModel = 'eleven_multilingual_v2'
The ElevenLabs model to use.

apiKey?: string
ElevenLabs API key. Falls back to the ELEVENLABS_API_KEY environment variable.
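
The fallback order described for apiKey can be pictured with a small sketch. The helper name resolveApiKey, the SpeechModelConfig type, and the env parameter are hypothetical and exist only for illustration; this is not the library's internal code, and throwing when no key is found is an assumption about what a caller would want.

```typescript
// Hypothetical illustration of the documented fallback: an explicit apiKey
// wins, otherwise the ELEVENLABS_API_KEY environment variable is used.
interface SpeechModelConfig {
  name?: string;
  apiKey?: string;
}

function resolveApiKey(
  config: SpeechModelConfig | undefined,
  env: Record<string, string | undefined>,
): string {
  const key = config?.apiKey ?? env["ELEVENLABS_API_KEY"];
  if (!key) {
    // Assumption: failing early is preferable to sending an unauthenticated request.
    throw new Error(
      "No ElevenLabs API key: set ELEVENLABS_API_KEY or pass speechModel.apiKey",
    );
  }
  return key;
}
```

In a Node.js application the second argument would typically be process.env.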

Methods

speak()

Converts text to speech using the configured speech model and voice.

input: string | NodeJS.ReadableStream
Text to convert to speech. If a stream is provided, its contents are read into text before synthesis.

options?: object
Additional options for speech synthesis.

options.speaker?: string
Overrides the default speaker ID for this request.

Returns: Promise<NodeJS.ReadableStream>
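
Because speak() resolves to a readable stream, callers often need to gather the chunks before saving or sending the audio. The following is a minimal sketch, assuming the stream yields binary chunks when iterated with for await (Node.js readable streams support async iteration). collectAudio is a hypothetical helper, not part of @mastra/voice-elevenlabs.

```typescript
// Hypothetical helper: gathers every chunk of an audio stream into one Uint8Array.
async function collectAudio(
  stream: AsyncIterable<Uint8Array | string>,
): Promise<Uint8Array> {
  const chunks: Uint8Array[] = [];
  let total = 0;
  for await (const chunk of stream) {
    if (typeof chunk === "string") {
      // Assumption: audio arrives as bytes; a string chunk suggests the stream
      // was given a text encoding, which would corrupt the audio.
      throw new TypeError("Expected binary audio chunks, received a string");
    }
    chunks.push(chunk);
    total += chunk.length;
  }
  // Copy each chunk into a single contiguous buffer.
  const audio = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    audio.set(chunk, offset);
    offset += chunk.length;
  }
  return audio;
}
```

With a configured voice, this might be called as `await collectAudio(await voice.speak("Hello, world!", { speaker: "custom-speaker-id" }))`, after which the bytes can be written to a file or sent over the network.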

getSpeakers()

Returns an array of available voice options, where each entry contains:

voiceId: string
Unique identifier for the voice.

name: string
Display name of the voice.

language: string
Language code for the voice.

gender: string
Gender of the voice.
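
These fields make it straightforward to choose a voice programmatically. The following is a short sketch, assuming the array shape described in this section. The SpeakerInfo type name and the pickSpeaker helper are hypothetical, and the exact strings returned for language and gender (for example 'en' or 'female') are assumptions to check against real output.

```typescript
// Shape of one entry, modeled on the fields documented for getSpeakers().
interface SpeakerInfo {
  voiceId: string;
  name: string;
  language: string;
  gender: string;
}

// Hypothetical helper: returns the first speaker matching every given
// criterion (case-insensitive), or undefined when nothing matches.
function pickSpeaker(
  speakers: SpeakerInfo[],
  criteria: { language?: string; gender?: string },
): SpeakerInfo | undefined {
  const matches = (actual: string | undefined, wanted?: string): boolean =>
    wanted === undefined ||
    (actual ?? "").toLowerCase() === wanted.toLowerCase();

  return speakers.find(
    (s) =>
      matches(s.language, criteria.language) &&
      matches(s.gender, criteria.gender),
  );
}
```

The voiceId of the returned entry can then be passed as options.speaker to speak().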

listen()

Converts audio input to text using ElevenLabs Speech-to-Text API.

input: NodeJS.ReadableStream
A readable stream containing the audio data to transcribe.

options?: object
Configuration options for the transcription.

The options object supports the following properties:

language_code?: string
ISO language code (e.g., 'en', 'fr', 'es').

tag_audio_events?: boolean
Whether to tag audio events like [MUSIC], [LAUGHTER], etc.

num_speakers?: number
Number of speakers to detect in the audio.

filetype?: string
Audio file format (e.g., 'mp3', 'wav', 'ogg').

timeoutInSeconds?: number
Request timeout in seconds.

maxRetries?: number
Maximum number of retry attempts.

abortSignal?: AbortSignal
Signal to abort the request.

Returns: Promise<string> - A Promise that resolves to the transcribed text
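
The option names above can be gathered into a typed object before calling listen(). The following is a sketch under stated assumptions: the ListenOptions interface models only a subset of the documented properties (abortSignal is left out), and inferFiletype and buildListenOptions are hypothetical helpers that guess filetype from a file name's extension.

```typescript
// Subset of the documented listen() options; the interface name is hypothetical.
interface ListenOptions {
  language_code?: string;
  tag_audio_events?: boolean;
  num_speakers?: number;
  filetype?: string;
  timeoutInSeconds?: number;
  maxRetries?: number;
}

// Hypothetical helper: derives a filetype such as "mp3" from a file name.
function inferFiletype(fileName: string): string | undefined {
  // Look only at the last path segment so dots in directory names are ignored.
  const base = fileName.split(/[\\/]/).pop() ?? "";
  const dot = base.lastIndexOf(".");
  if (dot <= 0 || dot === base.length - 1) return undefined;
  return base.slice(dot + 1).toLowerCase();
}

// Hypothetical helper: an explicit filetype in overrides takes precedence
// over the one inferred from the file name.
function buildListenOptions(
  fileName: string,
  overrides: ListenOptions = {},
): ListenOptions {
  const filetype = overrides.filetype ?? inferFiletype(fileName);
  return { ...overrides, ...(filetype ? { filetype } : {}) };
}
```

The result might then be passed as the second argument, for example `await voice.listen(audioStream, buildListenOptions("meeting.mp3", { language_code: "en" }))`.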

Important Notes

  1. An ElevenLabs API key is required. Set it via the ELEVENLABS_API_KEY environment variable or pass it in the constructor.
  2. The default speaker is set to Aria (ID: '9BWtsMINqrJLrRacOk9x').
  3. Speech-to-text is available through the listen() method, which uses the ElevenLabs Speech-to-Text API.
  4. Available speakers can be retrieved using the getSpeakers() method, which returns detailed information about each voice including language and gender.