ElevenLabs

The ElevenLabs voice implementation in Mastra provides high-quality text-to-speech (TTS) and speech-to-text (STT) capabilities using the ElevenLabs API.

Usage example

import { ElevenLabsVoice } from '@mastra/voice-elevenlabs'

// Initialize with default configuration (uses ELEVENLABS_API_KEY environment variable)
const voice = new ElevenLabsVoice()

// Alternatively, initialize with custom configuration
const customVoice = new ElevenLabsVoice({
  speechModel: {
    name: 'eleven_multilingual_v2',
    apiKey: 'your-api-key',
  },
  speaker: 'custom-speaker-id',
})

// Text-to-Speech
const audioStream = await voice.speak('Hello, world!')

// Get available speakers
const speakers = await voice.getSpeakers()

Constructor parameters

speechModel?: ElevenLabsVoiceConfig = { name: 'eleven_multilingual_v2' }
Configuration for text-to-speech functionality. ElevenLabsVoiceConfig has the following properties:

  name?: ElevenLabsModel
  The ElevenLabs model to use.

  apiKey?: string
  ElevenLabs API key. Falls back to the ELEVENLABS_API_KEY environment variable.

speaker?: string = '9BWtsMINqrJLrRacOk9x' (Aria voice)
ID of the speaker to use for text-to-speech.
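
The apiKey fallback described above can be pictured as a small resolution step. The sketch below is only an illustration: `resolveApiKey` is a hypothetical helper that is not part of `@mastra/voice-elevenlabs`, the error message is made up, and it assumes an explicitly passed key takes precedence over the environment variable.

```typescript
// Hypothetical helper illustrating the documented fallback.
// The library's internal logic may differ.
function resolveApiKey(
  explicit?: string,
  env: Record<string, string | undefined> = process.env,
): string {
  // Prefer the key passed in the constructor, then the environment variable.
  const key = explicit ?? env.ELEVENLABS_API_KEY
  if (!key) {
    throw new Error('No ElevenLabs API key: set ELEVENLABS_API_KEY or pass apiKey')
  }
  return key
}
```

Failing early with a clear message, as sketched here, tends to be easier to debug than a rejected request later on.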

Methods

`speak()`

Converts text to speech using the configured speech model and voice.

input: string | NodeJS.ReadableStream
Text to convert to speech. If a stream is provided, it will be converted to text first.

options?: object
Additional options for speech synthesis.

  speaker?: string
  Override the default speaker ID for this request.

Returns: Promise<NodeJS.ReadableStream>
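
Because speak() resolves to a stream rather than a finished file, callers usually need to drain it before saving or playing the audio. The following is a minimal sketch of one way to do that. `collectAudio` is a hypothetical helper (not part of the package), and the stream in the demo is a stand-in, so no API key is needed to run it.

```typescript
import { Readable } from 'node:stream'

// Hypothetical helper: gathers every chunk of a readable stream into one Buffer.
function collectAudio(stream: NodeJS.ReadableStream): Promise<Buffer> {
  return new Promise((resolve, reject) => {
    const chunks: Buffer[] = []
    stream.on('data', (chunk: Buffer | string) => {
      // Chunks may be Buffers or strings depending on how the stream was created.
      chunks.push(typeof chunk === 'string' ? Buffer.from(chunk) : chunk)
    })
    stream.on('end', () => resolve(Buffer.concat(chunks)))
    stream.on('error', reject)
  })
}

// Stand-in for the stream returned by voice.speak(); real audio would come from ElevenLabs.
const fakeAudio = Readable.from([Buffer.from([1, 2]), Buffer.from([3])])
collectAudio(fakeAudio).then((audio) => console.log(audio.length)) // prints 3
```

With a real instance, the stream returned by `await voice.speak('Hello, world!')` would be passed in place of `fakeAudio`, and the resulting Buffer could then be written to disk with `fs.promises.writeFile`.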

`getSpeakers()`

Returns an array of available voice options, where each entry contains:

voiceId: string
Unique identifier for the voice.

name: string
Display name of the voice.

language: string
Language code for the voice.

gender: string
Gender of the voice.
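
The fields above make it straightforward to narrow the list before choosing a speaker ID. The sketch below assumes the entry shape documented here. `findSpeakers` is a hypothetical helper, and the sample entries are invented for illustration; real values come from `voice.getSpeakers()`.

```typescript
// Shape of each entry, mirroring the fields documented above.
interface Speaker {
  voiceId: string
  name: string
  language: string
  gender: string
}

// Hypothetical helper: keeps only the speakers matching every provided criterion.
function findSpeakers(
  speakers: Speaker[],
  criteria: { language?: string; gender?: string },
): Speaker[] {
  return speakers.filter(
    (s) =>
      (criteria.language === undefined || s.language === criteria.language) &&
      (criteria.gender === undefined || s.gender === criteria.gender),
  )
}

// Invented sample data, not real ElevenLabs voices.
const sampleSpeakers: Speaker[] = [
  { voiceId: 'voice-1', name: 'Example One', language: 'en', gender: 'female' },
  { voiceId: 'voice-2', name: 'Example Two', language: 'es', gender: 'male' },
  { voiceId: 'voice-3', name: 'Example Three', language: 'en', gender: 'male' },
]

const english = findSpeakers(sampleSpeakers, { language: 'en' })
console.log(english.map((s) => s.voiceId)) // [ 'voice-1', 'voice-3' ]
```

A matching `voiceId` could then be passed as the `speaker` option to speak() or to the constructor.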

`listen()`

Converts audio input to text using the ElevenLabs Speech-to-Text API.

input: NodeJS.ReadableStream
A readable stream containing the audio data to transcribe.

options?: object
Configuration options for the transcription.

The options object supports the following properties:

language_code?: string
ISO language code (e.g., 'en', 'fr', 'es').

tag_audio_events?: boolean
Whether to tag audio events like [MUSIC], [LAUGHTER], etc.

num_speakers?: number
Number of speakers to detect in the audio.

filetype?: string
Audio file format (e.g., 'mp3', 'wav', 'ogg').

timeoutInSeconds?: number
Request timeout in seconds.

maxRetries?: number
Maximum number of retry attempts.

abortSignal?: AbortSignal
Signal to abort the request.

Returns: Promise<string> - A Promise that resolves to the transcribed text
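
To show how these options fit together in one call, here is a minimal sketch. The `ListenOptions` and `SpeechToText` interfaces are assumptions written from the parameter list above, not types exported by the package, and `transcribeEnglishMp3` is a hypothetical wrapper. The specific values (60 seconds, 2 retries) are arbitrary.

```typescript
// Assumed option shape, copied from the parameters documented above.
interface ListenOptions {
  language_code?: string
  tag_audio_events?: boolean
  num_speakers?: number
  filetype?: string
  timeoutInSeconds?: number
  maxRetries?: number
  abortSignal?: AbortSignal
}

// Minimal structural type for anything exposing the documented listen() signature.
interface SpeechToText {
  listen(input: NodeJS.ReadableStream, options?: ListenOptions): Promise<string>
}

// Hypothetical wrapper: transcribes an English MP3 stream, optionally cancellable.
async function transcribeEnglishMp3(
  voice: SpeechToText,
  audio: NodeJS.ReadableStream,
  signal?: AbortSignal,
): Promise<string> {
  return voice.listen(audio, {
    language_code: 'en',
    filetype: 'mp3',
    tag_audio_events: false,
    timeoutInSeconds: 60, // arbitrary example value
    maxRetries: 2, // arbitrary example value
    abortSignal: signal,
  })
}
```

In practice, `audio` might come from `fs.createReadStream('recording.mp3')` (the file name is illustrative), and calling `abort()` on the `AbortController` that owns `signal` would be expected to cancel the in-flight request.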

Important notes

An ElevenLabs API key is required. Set it via the ELEVENLABS_API_KEY environment variable or pass it in the constructor.
The default speaker is set to Aria (ID: '9BWtsMINqrJLrRacOk9x').
Speech-to-text is available through the listen() method, which uses the ElevenLabs Speech-to-Text API.
Available speakers can be retrieved using the getSpeakers() method, which returns detailed information about each voice including language and gender.
