ExamplesMemoryStreaming Working Memory (quickstart)

Streaming Working Memory

This example demonstrates how to create an agent that maintains a working memory for relevant conversational details like the users name, location, or preferences.

Setup

First, set up the memory system with working memory enabled. Memory uses LibSQL storage by default, but you can use any other storage provider if needed:

Text Stream Mode (Default)

import { Memory } from "@mastra/memory";
 
const memory = new Memory({
  options: {
    workingMemory: {
      enabled: true,
      use: 'text-stream', // this is the default mode
    },
  },
});

Tool Call Mode

Alternatively, you can use tool calls for working memory updates. This mode is required when using toDataStream() as text-stream mode is not compatible with data streaming:

const toolCallMemory = new Memory({
  options: {
    workingMemory: {
      enabled: true,
      use: 'tool-call', // Required for toDataStream() compatibility
    },
  },
});

Add the memory instance to an agent:

import { openai } from "@ai-sdk/openai";
 
const agent = new Agent({
  name: "Memory agent",
  instructions: "You are a helpful AI assistant.",
  model: openai("gpt-4o-mini"),
  memory, // or toolCallMemory
});

Usage Example

Now that working memory is set up you can interact with the agent and it will remember key details about interactions.

Text Stream Mode

In text stream mode, the agent includes working memory updates directly in its responses:

import { randomUUID } from "crypto";
import { maskStreamTags } from "@mastra/core/utils";
 
const threadId = randomUUID();
const resourceId = "SOME_USER_ID";
 
const response = await agent.stream("Hello, my name is Jane", {
  threadId,
  resourceId,
});
 
// Process response stream, hiding working memory tags
for await (const chunk of maskStreamTags(response.textStream, "working_memory")) {
  process.stdout.write(chunk);
}

Tool Call Mode

In tool call mode, the agent uses a dedicated tool to update working memory:

const toolCallResponse = await toolCallAgent.stream("Hello, my name is Jane", {
  threadId,
  resourceId,
});
 
// No need to mask working memory tags since updates happen through tool calls
for await (const chunk of toolCallResponse.textStream) {
  process.stdout.write(chunk);
}

Handling response data

In text stream mode, the response stream will contain xml-like <working_memory>$data</working_memory> tagged data. Mastra picks up these tags and automatically updates working memory with the data returned by the LLM.

To prevent showing this data to users you can use the maskStreamTags util as shown above.

In tool call mode, working memory updates happen through tool calls, so there’s no need to mask any tags.

Summary

This example demonstrates:

  1. Setting up memory with working memory enabled in either text-stream or tool-call mode
  2. Using maskStreamTags to hide memory updates in text-stream mode
  3. The agent maintaining relevant user info between interactions in both modes
  4. Different approaches to handling working memory updates

Advanced use cases

For examples on controlling which information is relevant for working memory, or showing loading states while working memory is being saved, see our advanced working memory example.

To learn more about agent memory, including other memory types and storage options, check out the Memory documentation page.