Streaming Working Memory
This example demonstrates how to create an agent that maintains a working memory for relevant conversational details like the users name, location, or preferences.
Setup
First, set up the memory system with working memory enabled. Memory uses LibSQL storage by default, but you can use any other storage provider if needed:
Text Stream Mode (Default)
import { Memory } from "@mastra/memory";
const memory = new Memory({
options: {
workingMemory: {
enabled: true,
use: 'text-stream', // this is the default mode
},
},
});
Tool Call Mode
Alternatively, you can use tool calls for working memory updates. This mode is required when using toDataStream()
as text-stream mode is not compatible with data streaming:
const toolCallMemory = new Memory({
options: {
workingMemory: {
enabled: true,
use: 'tool-call', // Required for toDataStream() compatibility
},
},
});
Add the memory instance to an agent:
import { openai } from "@ai-sdk/openai";
const agent = new Agent({
name: "Memory agent",
instructions: "You are a helpful AI assistant.",
model: openai("gpt-4o-mini"),
memory, // or toolCallMemory
});
Usage Example
Now that working memory is set up you can interact with the agent and it will remember key details about interactions.
Text Stream Mode
In text stream mode, the agent includes working memory updates directly in its responses:
import { randomUUID } from "crypto";
import { maskStreamTags } from "@mastra/core/utils";
const threadId = randomUUID();
const resourceId = "SOME_USER_ID";
const response = await agent.stream("Hello, my name is Jane", {
threadId,
resourceId,
});
// Process response stream, hiding working memory tags
for await (const chunk of maskStreamTags(response.textStream, "working_memory")) {
process.stdout.write(chunk);
}
Tool Call Mode
In tool call mode, the agent uses a dedicated tool to update working memory:
const toolCallResponse = await toolCallAgent.stream("Hello, my name is Jane", {
threadId,
resourceId,
});
// No need to mask working memory tags since updates happen through tool calls
for await (const chunk of toolCallResponse.textStream) {
process.stdout.write(chunk);
}
Handling response data
In text stream mode, the response stream will contain xml-like <working_memory>$data</working_memory>
tagged data.
Mastra picks up these tags and automatically updates working memory with the data returned by the LLM.
To prevent showing this data to users you can use the maskStreamTags
util as shown above.
In tool call mode, working memory updates happen through tool calls, so there’s no need to mask any tags.
Summary
This example demonstrates:
- Setting up memory with working memory enabled in either text-stream or tool-call mode
- Using
maskStreamTags
to hide memory updates in text-stream mode - The agent maintaining relevant user info between interactions in both modes
- Different approaches to handling working memory updates
Advanced use cases
For examples on controlling which information is relevant for working memory, or showing loading states while working memory is being saved, see our advanced working memory example.
To learn more about agent memory, including other memory types and storage options, check out the Memory documentation page.