Cloudflare

The CloudflareVoice class in Mastra provides text-to-speech capabilities using Cloudflare Workers AI. This provider specializes in efficient, low-latency speech synthesis suitable for edge computing environments.

Usage Example

    import { CloudflareVoice } from '@mastra/voice-cloudflare';

    // Initialize with configuration
    const voice = new CloudflareVoice({
      speechModel: {
        name: '@cf/meta/m2m100-1.2b',
        apiKey: 'your-cloudflare-api-token',
        accountId: 'your-cloudflare-account-id'
      },
      speaker: 'en-US-1' // Default voice
    });

    // Convert text to speech
    const audioStream = await voice.speak('Hello, how can I help you?', {
      speaker: 'en-US-2', // Override default voice
    });

    // Get available voices
    const speakers = await voice.getSpeakers();
    console.log(speakers);

Configuration

Constructor Options

speechModel?:

CloudflareSpeechConfig
Configuration for text-to-speech synthesis.

speaker?:

string
= 'en-US-1'
Default voice ID for speech synthesis.

CloudflareSpeechConfig

name?:

string
= '@cf/meta/m2m100-1.2b'
Model name to use for TTS.

apiKey?:

string
Cloudflare API token with Workers AI access. Falls back to CLOUDFLARE_API_TOKEN environment variable.

accountId?:

string
Cloudflare account ID. Falls back to CLOUDFLARE_ACCOUNT_ID environment variable.
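
The fallback order described above (explicit option first, then environment variable) can be sketched as follows. This is an illustration only, not the package's actual implementation: `resolveSpeechConfig`, `SpeechConfigInput`, and `ResolvedSpeechConfig` are hypothetical names, and the thrown errors are an assumption about how missing credentials might be handled.

```typescript
// Hypothetical helper, NOT part of @mastra/voice-cloudflare. It mirrors the
// documented fallback: constructor option first, then environment variable.
interface SpeechConfigInput {
  name?: string;
  apiKey?: string;
  accountId?: string;
}

interface ResolvedSpeechConfig {
  name: string;
  apiKey: string;
  accountId: string;
}

type Env = Record<string, string | undefined>;

function resolveSpeechConfig(input: SpeechConfigInput, env: Env): ResolvedSpeechConfig {
  // Prefer the explicit option; otherwise read the documented variable.
  const apiKey = input.apiKey ?? env['CLOUDFLARE_API_TOKEN'];
  const accountId = input.accountId ?? env['CLOUDFLARE_ACCOUNT_ID'];

  // Assumption: fail early when neither source provides a value.
  if (!apiKey) {
    throw new Error('Missing API token: pass apiKey or set CLOUDFLARE_API_TOKEN');
  }
  if (!accountId) {
    throw new Error('Missing account ID: pass accountId or set CLOUDFLARE_ACCOUNT_ID');
  }

  // The model name falls back to the documented default.
  return { name: input.name ?? '@cf/meta/m2m100-1.2b', apiKey, accountId };
}

// Example: the token comes from the option, the account ID from the environment.
const resolved = resolveSpeechConfig(
  { apiKey: 'token-from-options' },
  { CLOUDFLARE_ACCOUNT_ID: 'account-from-env' },
);
console.log(resolved.name, resolved.accountId);
```

In a Node.js process, `process.env` could be passed as the second argument.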

Methods

speak()

Converts text to speech using Cloudflare’s text-to-speech service.

input:

string | NodeJS.ReadableStream
Text or text stream to convert to speech.

options.speaker?:

string
= Constructor's speaker value
Voice ID to use for speech synthesis.

options.format?:

string
= 'mp3'
Output audio format.

Returns: Promise<NodeJS.ReadableStream>
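
Because speak() resolves to a Node.js readable stream, callers that need the whole audio clip in memory have to drain it first. A minimal sketch of that step, using only Node.js built-ins: `streamToBuffer` is a hypothetical helper name, and the `Readable.from(...)` stream below is a stand-in for the stream speak() would return, so the snippet runs without Cloudflare credentials.

```typescript
import { Readable } from 'node:stream';

// Hypothetical helper: drains a readable stream into a single Buffer.
async function streamToBuffer(stream: NodeJS.ReadableStream): Promise<Buffer> {
  const chunks: Buffer[] = [];
  for await (const chunk of stream) {
    // Chunks may arrive as strings or Buffers; normalize to Buffer.
    chunks.push(typeof chunk === 'string' ? Buffer.from(chunk, 'utf8') : chunk);
  }
  return Buffer.concat(chunks);
}

// Stand-in for `await voice.speak(...)`: two arbitrary chunks of bytes.
const fakeAudio = Readable.from([Buffer.from([0x49, 0x44]), Buffer.from([0x33])]);

streamToBuffer(fakeAudio).then((audio) => {
  console.log(`collected ${audio.length} bytes`);
});
```

For large outputs, piping the stream straight to its destination (a file or an HTTP response) avoids holding the whole clip in memory.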

getSpeakers()

Returns an array of available voice options, where each entry contains:

voiceId:

string
Unique identifier for the voice (e.g., 'en-US-1')

language:

string
Language code of the voice (e.g., 'en-US')
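
Given entries of that shape, choosing a voice for a particular language is a simple lookup. The sketch below assumes only the two documented fields: `SpeakerInfo` and `pickVoice` are hypothetical names, and the sample list is made-up data rather than the provider's real voice catalog.

```typescript
// Shape of one entry as documented above (the type name is hypothetical).
interface SpeakerInfo {
  voiceId: string;
  language: string;
}

// Hypothetical helper: returns the first voice for a language code,
// or the given fallback voice ID when no entry matches.
function pickVoice(speakers: SpeakerInfo[], language: string, fallback: string): string {
  const wanted = language.toLowerCase();
  const match = speakers.find((s) => s.language.toLowerCase() === wanted);
  return match ? match.voiceId : fallback;
}

// Made-up sample data standing in for `await voice.getSpeakers()`.
const sampleSpeakers: SpeakerInfo[] = [
  { voiceId: 'en-US-1', language: 'en-US' },
  { voiceId: 'en-US-2', language: 'en-US' },
];

console.log(pickVoice(sampleSpeakers, 'en-us', 'en-US-1'));
```

The returned ID can then be passed as options.speaker when calling speak().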

Notes

  • The API token and account ID can be provided via constructor options or via the CLOUDFLARE_API_TOKEN and CLOUDFLARE_ACCOUNT_ID environment variables
  • Cloudflare Workers AI is optimized for edge computing with low latency
  • This provider only supports text-to-speech (TTS) functionality, not speech-to-text (STT)
  • The service integrates well with other Cloudflare Workers products
  • For production use, ensure your Cloudflare account has the appropriate Workers AI subscription
  • Fewer voice options are available than with some other providers, but performance at the edge is excellent

If you need speech-to-text capabilities in addition to text-to-speech, consider using one of these providers:

  • OpenAI - Provides both TTS and STT
  • Google - Provides both TTS and STT
  • Azure - Provides both TTS and STT