Memory overview
Memory in Mastra helps agents manage context across conversations by condensing relevant information into the language model’s context window.
Mastra supports three complementary memory systems: working memory, conversation history, and semantic recall. Together, they allow agents to track preferences, maintain conversational flow, and retrieve relevant historical messages.
To persist and recall information between conversations, memory requires a storage adapter, such as the LibSQL adapter used in the examples below.
Types of memory
All memory types are thread-scoped by default, meaning they apply only to a single conversation. Resource-scoped configuration allows working memory and semantic recall to persist across all threads that use the same user or entity.
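As a sketch of resource-scoped configuration (assuming the `scope` option on `workingMemory` and `semanticRecall`), scoping can be set per memory type:

```typescript
import { Memory } from "@mastra/memory";

// Sketch: scope "resource" persists memory across every thread owned by
// the same user or entity, instead of a single conversation thread.
export const memory = new Memory({
  options: {
    workingMemory: {
      enabled: true,
      scope: "resource" // default scope is "thread"
    },
    semanticRecall: {
      scope: "resource"
    }
  }
});
```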
Working memory
Stores persistent user-specific details such as names, preferences, goals, and other structured data. Uses Markdown templates or Zod schemas to define structure.
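For example, a Markdown template can define which fields the agent should track; the template below is illustrative, not prescribed:

```typescript
import { Memory } from "@mastra/memory";

// Illustrative template: the agent fills in and updates these fields
// as it learns details about the user during conversations.
export const memory = new Memory({
  options: {
    workingMemory: {
      enabled: true,
      template: `# User Profile
- Name:
- Preferred language:
- Current goal:
`
    }
  }
});
```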
Conversation history
Captures recent messages from the current conversation, providing short-term continuity and maintaining dialogue flow.
Semantic recall
Retrieves older messages from past conversations based on semantic relevance. Matches are retrieved using vector search and can include surrounding context for better comprehension.
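A minimal configuration sketch, assuming the `topK` (number of similar messages to fetch) and `messageRange` (surrounding context per match) options on `semanticRecall`:

```typescript
import { Memory } from "@mastra/memory";

// Retrieve the 3 most semantically similar past messages, plus
// 2 surrounding messages of context around each match.
export const memory = new Memory({
  options: {
    semanticRecall: {
      topK: 3,
      messageRange: 2
    }
  }
});
```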
How memory works together
Mastra combines all memory types into a single context window. If the total exceeds the model’s token limit, use memory processors to trim or filter messages before sending them to the model.
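As a sketch of a memory processor, assuming the `TokenLimiter` processor exported from `@mastra/memory/processors`:

```typescript
import { Memory } from "@mastra/memory";
import { TokenLimiter } from "@mastra/memory/processors";

// Trim recalled messages so the combined context stays within a
// token budget before it is sent to the model.
export const memory = new Memory({
  processors: [new TokenLimiter(127000)]
});
```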
Getting started
To use memory, install the required dependencies:
npm install @mastra/core @mastra/memory @mastra/libsql
Shared storage
To share memory across agents, add a storage adapter to the main Mastra instance. Any agent with memory enabled will use this shared storage to store and recall interactions.
import { Mastra } from "@mastra/core/mastra";
import { LibSQLStore } from "@mastra/libsql";

export const mastra = new Mastra({
  // ...
  storage: new LibSQLStore({
    url: ":memory:"
  })
});
Adding working memory to agents
Enable working memory by passing a Memory instance to the agent's memory parameter and setting workingMemory.enabled to true:
import { Memory } from "@mastra/memory";
import { Agent } from "@mastra/core/agent";

export const testAgent = new Agent({
  // ...
  memory: new Memory({
    options: {
      workingMemory: {
        enabled: true
      }
    }
  })
});
Dedicated storage
Agents can be configured with their own dedicated storage, keeping tasks, conversations, and recalled information separate across agents.
Adding storage to agents
To assign dedicated storage to an agent, install and import the required dependency and pass a storage instance to the Memory constructor:
import { Memory } from "@mastra/memory";
import { Agent } from "@mastra/core/agent";
import { LibSQLStore } from "@mastra/libsql";

export const testAgent = new Agent({
  // ...
  memory: new Memory({
    // ...
    storage: new LibSQLStore({
      url: "file:agent-memory.db"
    })
  })
});
Memory threads
Mastra organizes memory into threads, which are records that group related interactions, using two identifiers:

- thread: A globally unique ID representing the conversation (e.g., support_123). Must be unique across all resources.
- resource: The user or entity that owns the thread (e.g., user_123, org_456).

The resource identifier is especially important for resource-scoped memory, which allows memory to persist across all threads associated with the same user or entity.
const stream = await agent.stream("message for agent", {
  memory: {
    thread: "user-123",
    resource: "test-123"
  }
});
Even with memory configured, agents won't store or recall information unless both thread and resource are provided.
Mastra Playground sets thread and resource IDs automatically. In your own application, you must provide them manually as part of each .generate() or .stream() call.
Thread title generation
Mastra can automatically generate descriptive thread titles based on the user's first message. Enable this by setting generateTitle to true. This improves organization and makes it easier to display conversations in your UI.
export const testAgent = new Agent({
  // ...
  memory: new Memory({
    options: {
      threads: {
        generateTitle: true
      }
    }
  })
});
Title generation runs asynchronously after the agent responds and does not affect response time. See the full configuration reference for details and examples.
Optimizing title generation
Titles are generated using your agent's model by default. To optimize cost or behavior, provide a smaller model and custom instructions. This keeps title generation separate from the main conversation logic.
import { openai } from "@ai-sdk/openai";

export const testAgent = new Agent({
  // ...
  memory: new Memory({
    options: {
      threads: {
        generateTitle: {
          model: openai("gpt-4.1-nano"),
          instructions: "Generate a concise title based on the user's first message"
        }
      }
    }
  })
});
Dynamic model selection and instructions
You can configure thread title generation dynamically by passing functions to model and instructions. These functions receive the runtimeContext object, allowing you to adapt title generation based on user-specific values.
import { openai } from "@ai-sdk/openai";

export const testAgent = new Agent({
  // ...
  memory: new Memory({
    options: {
      threads: {
        generateTitle: {
          model: ({ runtimeContext }) => {
            const userTier = runtimeContext.get("userTier");
            return userTier === "premium" ? openai("gpt-4.1") : openai("gpt-4.1-nano");
          },
          instructions: ({ runtimeContext }) => {
            const language = runtimeContext.get("userLanguage") || "English";
            return `Generate a concise, engaging title in ${language} based on the user's first message.`;
          }
        }
      }
    }
  })
});
Increasing conversation history
By default, each request includes the last 10 messages from the current memory thread, giving the agent short-term conversational context. This limit can be increased using the lastMessages parameter.
export const testAgent = new Agent({
  // ...
  memory: new Memory({
    options: {
      lastMessages: 100
    }
  })
});
Viewing retrieved messages
If tracing is enabled in your Mastra deployment and memory is configured with lastMessages, semanticRecall, or both, the agent's trace output will show all messages retrieved for context, including both recent conversation history and messages recalled via semantic recall.
This is helpful for debugging, understanding agent decisions, and verifying that the agent is retrieving the right information for each request.
For more details on enabling and configuring tracing, see Tracing.
Local development with LibSQL
For local development with LibSQLStore, you can inspect stored memory using the SQLite Viewer extension in VS Code.
Next Steps
Now that you understand the core concepts, continue to semantic recall to learn how to add RAG memory to your Mastra agents.
Alternatively, you can visit the configuration reference for available options.