Memory Class Reference
The Memory class provides a robust system for managing conversation history and thread-based message storage in Mastra. It enables persistent storage of conversations, semantic search capabilities, and efficient message retrieval. You must configure a storage provider for conversation history, and if you enable semantic recall you will also need to provide a vector store and embedder.
Basic Usage
import { Memory } from "@mastra/memory";
import { Agent } from "@mastra/core/agent";

const agent = new Agent({
  memory: new Memory(),
  ...otherOptions,
});
Custom Configuration
import { Memory } from "@mastra/memory";
import { LibSQLStore, LibSQLVector } from "@mastra/libsql";
import { openai } from "@ai-sdk/openai";
import { Agent } from "@mastra/core/agent";

const memory = new Memory({
  // Optional storage configuration - libsql will be used by default
  storage: new LibSQLStore({
    url: "file:./memory.db",
  }),
  // Optional vector database for semantic search
  vector: new LibSQLVector({
    url: "file:./vector.db",
  }),
  // Embedding model - required when semanticRecall is enabled
  embedder: openai.embedding("text-embedding-3-small"),
  // Memory configuration options
  options: {
    // Number of recent messages to include
    lastMessages: 20,
    // Semantic search configuration
    semanticRecall: {
      topK: 3, // Number of similar messages to retrieve
      messageRange: {
        // Messages to include around each result
        before: 2,
        after: 1,
      },
    },
    // Working memory configuration
    workingMemory: {
      enabled: true,
      template: `
# User
- First Name:
- Last Name:
`,
    },
    // Thread configuration
    threads: {
      generateTitle: true, // Enable title generation using agent's model
      // Or use a different model for title generation
      // generateTitle: {
      //   model: openai("gpt-4.1-nano"), // Use cheaper model for titles
      // },
    },
  },
});

const agent = new Agent({
  memory,
  ...otherOptions,
});
Working Memory
The working memory feature allows agents to maintain persistent information across conversations. When enabled, the Memory class automatically manages working memory updates using a dedicated tool call.
Example configuration:
const memory = new Memory({
  options: {
    workingMemory: {
      enabled: true,
      template: "# User\n- **First Name**:\n- **Last Name**:",
    },
  },
});
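Working memory is stored alongside the conversation thread, so the agent needs to know which user and thread a message belongs to when it is called. The sketch below is illustrative rather than canonical: the exact option names (threadId/resourceId here; newer releases accept a memory object with thread and resource fields) depend on your Mastra version, and "user-alice" and "thread-1" are placeholder identifiers.

// Hypothetical usage sketch - option names vary by Mastra version
// (some releases use threadId/resourceId, newer ones a memory object).
const response = await agent.generate("Hi, my name is Alice.", {
  resourceId: "user-alice", // placeholder user identifier
  threadId: "thread-1",     // placeholder conversation thread
});
// On later calls with the same identifiers, the agent can recall
// details it recorded in working memory (e.g. the user's first name).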
If no template is provided, the Memory class uses a default template that includes fields for user details, preferences, goals, and other contextual information in Markdown format. See the Working Memory guide for detailed usage examples and best practices.
Thread Title Generation
The generateTitle feature automatically creates meaningful titles for conversation threads based on the user’s first message. This helps organize and identify conversations in your application.
Basic Usage
const memory = new Memory({
  options: {
    threads: {
      generateTitle: true, // Use the agent's model for title generation
    },
  },
});
Cost Optimization with Custom Models
You can specify a different (typically cheaper) model for title generation while using a high-quality model for the main conversation:
import { openai } from "@ai-sdk/openai";
import { Memory } from "@mastra/memory";
import { Agent } from "@mastra/core/agent";

const memory = new Memory({
  options: {
    threads: {
      generateTitle: {
        model: openai("gpt-4.1-nano"), // Cheaper model for titles
      },
    },
  },
});

const agent = new Agent({
  model: openai("gpt-4o"), // High-quality model for main conversation
  memory,
});
Dynamic Model Selection
You can also use a function to dynamically determine the model based on runtime context:
import { openai } from "@ai-sdk/openai";
import { Memory } from "@mastra/memory";
import { RuntimeContext } from "@mastra/core/runtime-context";

const memory = new Memory({
  options: {
    threads: {
      generateTitle: {
        model: (ctx: RuntimeContext) => {
          // Use different models based on context
          const userTier = ctx.get("userTier");
          return userTier === "premium"
            ? openai("gpt-4.1")
            : openai("gpt-4.1-nano");
        },
      },
    },
  },
});
embedder
An embedding model is required if semanticRecall is enabled.
One option is to use @mastra/fastembed, which provides an on-device/local embedding model using FastEmbed. This model runs locally and does not require API keys or network requests.
To use it, first install the package:
npm install @mastra/fastembed
Then, configure it in your Memory instance:
import { Memory } from "@mastra/memory";
import { fastembed } from "@mastra/fastembed";
import { Agent } from "@mastra/core/agent";

const agent = new Agent({
  memory: new Memory({
    embedder: fastembed,
    // ... other memory config
  }),
});
Note that, depending on where you deploy your project, it may fail to deploy due to FastEmbed's large internal dependencies.
Alternatively, you can use an API-based embedder like OpenAI (which doesn’t have this problem):
import { Memory } from "@mastra/memory";
import { openai } from "@ai-sdk/openai";
import { Agent } from "@mastra/core/agent";

const agent = new Agent({
  memory: new Memory({
    embedder: openai.embedding("text-embedding-3-small"),
  }),
});
Mastra supports many embedding models through the Vercel AI SDK, including options from OpenAI, Google, Mistral, and Cohere.
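For example, the sketch below assumes the @ai-sdk/cohere provider package is installed and that, like the OpenAI provider, it exposes an embedding() factory; "embed-english-v3.0" is one of Cohere's embedding model IDs.

import { Memory } from "@mastra/memory";
import { cohere } from "@ai-sdk/cohere"; // assumes @ai-sdk/cohere is installed

const memory = new Memory({
  // Any AI SDK embedding model can be passed as the embedder
  embedder: cohere.embedding("embed-english-v3.0"),
});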