Interactive Story Generator

The following code snippets show an example implementation of Text-to-Speech (TTS) in an interactive story generator built with Next.js, with Mastra running as a separate backend. The example uses the Mastra client-js SDK to connect from the Next.js frontend to your Mastra backend. For more details on integrating Mastra with Next.js, please refer to our Integrate with Next.js documentation.

Creating an Agent with TTS Capabilities

The following example shows how to set up a story generator agent with TTS capabilities on the backend:

src/mastra/agents/index.ts
import { openai } from '@ai-sdk/openai';
import { Agent } from '@mastra/core/agent';
import { OpenAIVoice } from '@mastra/voice-openai';
import { Memory } from '@mastra/memory';
 
const instructions = `
    You are an Interactive Storyteller Agent. Your job is to create engaging
    short stories with user choices that influence the narrative. // omitted for brevity
`;
 
export const storyTellerAgent = new Agent({
  name: 'Story Teller Agent',
  instructions: instructions,
  model: openai('gpt-4o'),
  voice: new OpenAIVoice(),
  // Memory backs the threadId/resourceId that the frontend passes with each request
  memory: new Memory(),
});
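
Before wiring up the frontend, you can optionally sanity-check the agent's TTS directly on the backend. The sketch below is illustrative: it assumes voice.speak() resolves to a Node readable stream on the server, and the script path and output filename are made up for this example.

src/mastra/scripts/test-voice.ts (hypothetical)
import { createWriteStream } from 'node:fs';
import { storyTellerAgent } from '../agents';
 
// Ask the agent's voice provider to synthesize a short audio clip
const audioStream = await storyTellerAgent.voice.speak('Once upon a time...');
 
if (audioStream) {
  // Write the synthesized audio to disk for a quick listen
  audioStream.pipe(createWriteStream('story-sample.mp3'));
}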

Registering the Agent with Mastra

This snippet demonstrates how to register the agent with your Mastra instance:

src/mastra/index.ts
import { createLogger } from '@mastra/core/logger';
import { Mastra } from '@mastra/core/mastra';
import { storyTellerAgent } from './agents';
 
export const mastra = new Mastra({
  agents: { storyTellerAgent },
  logger: createLogger({
    name: 'Mastra',
    level: 'info',
  }),
});
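
If you scaffolded your project with the Mastra CLI, you can typically start this backend with the mastra dev command; by default the dev server listens on http://localhost:4111, which is the URL the frontend client below connects to.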

Connecting to Mastra from the Frontend

Here we use the Mastra Client SDK to interact with our Mastra server. For more information about the Mastra Client SDK, check out the documentation.

src/app/page.tsx
import { MastraClient } from '@mastra/client-js';
 
export const mastraClient = new MastraClient({
  baseUrl: 'http://localhost:4111', // Replace with your Mastra backend URL
});
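
In a deployed app you will usually want to avoid hard-coding the backend URL. One common variation reads it from a public environment variable instead; the variable name NEXT_PUBLIC_MASTRA_API_URL below is illustrative, not a Mastra convention.

import { MastraClient } from '@mastra/client-js';
 
export const mastraClient = new MastraClient({
  // Fall back to the local dev server when no environment variable is set
  baseUrl: process.env.NEXT_PUBLIC_MASTRA_API_URL ?? 'http://localhost:4111',
});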

Generating Story Content and Converting to Speech

This example demonstrates how to get a reference to a Mastra agent, generate story content based on user input, and then convert that content to speech:

/app/components/StoryManager.tsx
// Note: `FormData` here is the app's own form state type, not the DOM FormData API
const handleInitialSubmit = async (formData: FormData) => {
  setIsLoading(true);
  try {
    const agent = mastraClient.getAgent('storyTellerAgent');
    const message = `Current phase: BEGINNING. Story genre: ${formData.genre}, Protagonist name: ${formData.protagonistDetails.name}, Protagonist age: ${formData.protagonistDetails.age}, Protagonist gender: ${formData.protagonistDetails.gender}, Protagonist occupation: ${formData.protagonistDetails.occupation}, Story Setting: ${formData.setting}`;
    const storyResponse = await agent.generate({
      messages: [{ role: 'user', content: message }],
      threadId: storyState.threadId,
      resourceId: storyState.resourceId,
    });
 
    const storyText = storyResponse.text;
 
    const audioResponse = await agent.voice.speak(storyText);
 
    if (!audioResponse.body) {
      throw new Error('No audio stream received');
    }
 
    // readStream (a small helper, sketched below) collects the audio stream into a Blob
    const audio = await readStream(audioResponse.body);
 
    setStoryState(prev => ({
      phase: 'beginning',
      threadId: prev.threadId,
      resourceId: prev.resourceId,
      content: {
        ...prev.content,
        beginning: storyText,
      },
    }));
 
    setAudioBlob(audio);
    return audio;
  } catch (error) {
    console.error('Error generating story beginning:', error);
  } finally {
    setIsLoading(false);
  }
};
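
The readStream helper used above is not shown in this snippet. Below is a minimal sketch of one possible implementation, assuming it simply drains the response's ReadableStream into a Blob the browser can play; the actual helper in the repository may differ.

const readStream = async (stream: ReadableStream<Uint8Array>): Promise<Blob> => {
  const reader = stream.getReader();
  const chunks: Uint8Array[] = [];
 
  // Drain the stream chunk by chunk
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    chunks.push(value);
  }
 
  // Wrap the collected bytes in a Blob with an audio MIME type (OpenAI TTS returns MP3 by default)
  return new Blob(chunks, { type: 'audio/mpeg' });
};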

Playing the Audio

This snippet demonstrates how to handle text-to-speech audio playback by monitoring for new audio data. When audio is received, the code creates a browser-playable URL from the audio blob, assigns it to an audio element, and attempts to play it automatically:

/app/components/StoryManager.tsx
useEffect(() => {
  if (!audioRef.current || !audioData) return;
 
  // Store a reference to the HTML audio element
  const currentAudio = audioRef.current;
 
  // Convert the Blob/File audio data from Mastra into a URL the browser can play
  const url = URL.createObjectURL(audioData);
 
  const playAudio = async () => {
    try {
      currentAudio.src = url;
      currentAudio.load(); // load() returns void, so there is nothing to await
      await currentAudio.play();
      setIsPlaying(true);
    } catch (error) {
      console.error('Auto-play failed:', error);
    }
  };
 
  playAudio();
 
  return () => {
    if (currentAudio) {
      currentAudio.pause();
      currentAudio.src = '';
      URL.revokeObjectURL(url);
    }
  };
}, [audioData]);
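
The playback effect assumes an audioRef bound to an <audio> element, along with the audioData / setAudioBlob state and the setIsPlaying setter used in the earlier snippets. The component skeleton below is an illustrative sketch of that wiring; the real component renders the full story UI around it.

'use client';
 
import { useEffect, useRef, useState } from 'react';
 
export function StoryManager() {
  const audioRef = useRef<HTMLAudioElement>(null);
  const [audioData, setAudioBlob] = useState<Blob | null>(null);
  const [isPlaying, setIsPlaying] = useState(false);
 
  // ...handleInitialSubmit and the playback effect shown above go here...
 
  return (
    // The playback effect assigns the generated object URL to this element's src
    <audio ref={audioRef} onEnded={() => setIsPlaying(false)} />
  );
}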

You can view the complete implementation of the Interactive Story Generator on our GitHub repository.