Memory Class Reference
The Memory class provides a robust system for managing conversation history and thread-based message storage in Mastra. It enables persistent storage of conversations, semantic search capabilities, and efficient message retrieval. You must configure a storage provider for conversation history, and if you enable semantic recall you will also need to provide a vector store and embedder.
Basic Usage
import { Memory } from "@mastra/memory";
import { Agent } from "@mastra/core/agent";

const agent = new Agent({
  memory: new Memory(),
  ...otherOptions,
});
Custom Configuration
import { Memory } from "@mastra/memory";
import { LibSQLStore, LibSQLVector } from "@mastra/libsql";
import { openai } from "@ai-sdk/openai";
import { Agent } from "@mastra/core/agent";

const memory = new Memory({
  // Optional storage configuration - libsql will be used by default
  storage: new LibSQLStore({
    url: "file:./memory.db",
  }),
  // Optional vector database for semantic search
  vector: new LibSQLVector({
    url: "file:./vector.db",
  }),
  // Embedding model - required when semanticRecall is enabled
  embedder: openai.embedding("text-embedding-3-small"),
  // Memory configuration options
  options: {
    // Number of recent messages to include
    lastMessages: 20,
    // Semantic search configuration
    semanticRecall: {
      topK: 3, // Number of similar messages to retrieve
      messageRange: {
        // Messages to include around each result
        before: 2,
        after: 1,
      },
    },
    // Working memory configuration
    workingMemory: {
      enabled: true,
      template: `
# User
- First Name:
- Last Name:
`,
    },
    // Thread configuration
    threads: {
      generateTitle: true, // Enable title generation using agent's model
      // Or use a different model for title generation
      // generateTitle: {
      //   model: openai("gpt-4.1-nano"), // Use cheaper model for titles
      // },
    },
  },
});

const agent = new Agent({
  memory,
  ...otherOptions,
});
Working Memory
The working memory feature allows agents to maintain persistent information across conversations. When enabled, the Memory class automatically manages working memory updates using a dedicated tool call.
Example configuration:
const memory = new Memory({
  options: {
    workingMemory: {
      enabled: true,
      template: "# User\n- **First Name**:\n- **Last Name**:",
    },
  },
});
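Working memory is stored alongside the conversation thread, so the agent needs to know which user and thread a message belongs to when it is called. The sketch below is illustrative rather than canonical: the exact option names (threadId/resourceId here; newer releases accept a memory object with thread and resource fields) depend on your Mastra version, and "user-alice" and "thread-1" are placeholder identifiers.

// Hypothetical usage sketch - option names vary by Mastra version
// (some releases use threadId/resourceId, newer ones a memory object).
const response = await agent.generate("Hi, my name is Alice.", {
  resourceId: "user-alice", // placeholder user identifier
  threadId: "thread-1",     // placeholder conversation thread
});
// On later calls with the same identifiers, the agent can recall
// details it recorded in working memory (e.g. the user's first name).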
If no template is provided, the Memory class uses a default template that includes fields for user details, preferences, goals, and other contextual information in Markdown format. See the Working Memory guide for detailed usage examples and best practices.
Thread Title Generation
The generateTitle feature automatically creates meaningful titles for conversation threads based on the user’s first message. This helps organize and identify conversations in your application.
Basic Usage
const memory = new Memory({
  options: {
    threads: {
      generateTitle: true, // Use the agent's model for title generation
    },
  },
});
Cost Optimization with Custom Models
You can specify a different (typically cheaper) model for title generation while using a high-quality model for the main conversation:
import { openai } from "@ai-sdk/openai";
import { Memory } from "@mastra/memory";
import { Agent } from "@mastra/core/agent";

const memory = new Memory({
  options: {
    threads: {
      generateTitle: {
        model: openai("gpt-4.1-nano"), // Cheaper model for titles
      },
    },
  },
});

const agent = new Agent({
  model: openai("gpt-4o"), // High-quality model for main conversation
  memory,
});
Dynamic Model Selection
You can also use a function to dynamically determine the model based on runtime context:
import { openai } from "@ai-sdk/openai";
import { Memory } from "@mastra/memory";
import { RuntimeContext } from "@mastra/core/runtime-context";

const memory = new Memory({
  options: {
    threads: {
      generateTitle: {
        model: (ctx: RuntimeContext) => {
          // Use different models based on context
          const userTier = ctx.get("userTier");
          return userTier === "premium"
            ? openai("gpt-4.1")
            : openai("gpt-4.1-nano");
        },
      },
    },
  },
});
embedder
An embedding model is required if semanticRecall is enabled.
One option is to use @mastra/fastembed, which provides an on-device/local embedding model using FastEmbed. This model runs locally and does not require API keys or network requests.
To use it, first install the package:
npm install @mastra/fastembed
Then, configure it in your Memory instance:
import { Memory } from "@mastra/memory";
import { fastembed } from "@mastra/fastembed";
import { Agent } from "@mastra/core/agent";

const agent = new Agent({
  memory: new Memory({
    embedder: fastembed,
    // ... other memory config
  }),
});
Note that, depending on where you deploy your project, it may fail to deploy due to FastEmbed's large internal dependencies.
Alternatively, you can use an API-based embedder like OpenAI (which doesn’t have this problem):
import { Memory } from "@mastra/memory";
import { openai } from "@ai-sdk/openai";
import { Agent } from "@mastra/core/agent";

const agent = new Agent({
  memory: new Memory({
    embedder: openai.embedding("text-embedding-3-small"),
  }),
});
Mastra supports many embedding models through the Vercel AI SDK, including options from OpenAI, Google, Mistral, and Cohere.
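For example, the sketch below assumes the @ai-sdk/cohere provider package is installed and that, like the OpenAI provider, it exposes an embedding() factory; "embed-english-v3.0" is one of Cohere's embedding model IDs.

import { Memory } from "@mastra/memory";
import { cohere } from "@ai-sdk/cohere"; // assumes @ai-sdk/cohere is installed

const memory = new Memory({
  // Any AI SDK embedding model can be passed as the embedder
  embedder: cohere.embedding("embed-english-v3.0"),
});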