
Using Mastra's Agent Memory API

Feb 4, 2025

To work effectively, AI agents need the right context from previous interactions. But the right context is often hard to define programmatically! This is where agent memory comes in.

Implementing a memory system for agents usually involves:

  • Persisting memory in backend storage
  • Storing and retrieving conversation history
  • Finding relevant context from past interactions using semantic search

So we built a Memory API for agents in Mastra. This includes:

  • Conversation context management
  • Semantic search over past interactions
  • Storage backend abstraction
  • Thread sharing between agents

This post will walk you through implementing these capabilities in your application, from local development to production deployment.

Implementation Guide

We'll build an agent with memory capabilities using the defaults for both storage and vector search, backed by a single database. The default database (LibSQL) runs locally without Docker and can also be deployed to the cloud.

Prerequisites:

npm install @mastra/core @mastra/memory

We'll also use OpenAI's text-embedding-3-small model for semantic search later on, so make sure an OPENAI_API_KEY is set in your environment.

Configure Storage

Let's start by setting up storage for conversation history. This will handle messages, metadata, and thread management:

import { DefaultStore } from "@mastra/core/storage";
import { Memory } from "@mastra/memory";

const storage = new DefaultStore({
  config: {
    // Local LibSQL database file; use a remote URL in production
    url: "file:example.db",
  },
});

const memory = new Memory({
  storage,
});

The Memory API works with multiple storage backends: default (local SQLite), Postgres, and Upstash. For storage configuration details: https://mastra.ai/docs/agents/01-agent-memory#storage-options
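For instance, here's a minimal sketch of a Postgres backend, assuming a PostgresStore export and connection options from the @mastra/pg package (see the storage docs above for exact configuration):

import { PostgresStore } from "@mastra/pg";
import { Memory } from "@mastra/memory";

// Hypothetical Postgres-backed storage; the connection string name is an example.
const storage = new PostgresStore({
  connectionString: process.env.DATABASE_URL!,
});

const memory = new Memory({ storage });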

Configure Vector Search

Now, let's add vector search so the agent can find relevant past messages based on meaning rather than just keywords. You'll need an embedding model; we'll use OpenAI's text-embedding-3-small.

import { DefaultVectorDB } from "@mastra/core/storage";

const vector = new DefaultVectorDB({
  connectionUrl: process.env.DATABASE_URL || "file:example.db",
});

const memory = new Memory({
  storage,
  vector,
  // Required when using vector search
  embedding: {
    provider: "OPEN_AI",
    model: "text-embedding-3-small",
    maxRetries: 3,
  },
});

Mastra supports multiple vector databases, including Pinecone, pgvector, Qdrant, Chroma, Astra, and Vectorize.

Note that storage and vector databases can be used independently. For example, you could use the default LibSQL storage with Pinecone for vector search.
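As a sketch of that mix-and-match setup, assuming a PineconeVector export from the @mastra/pinecone package that accepts an API key (see the vector database docs below for exact configuration):

import { PineconeVector } from "@mastra/pinecone";

// Hypothetical Pinecone client; the environment variable name is an example.
const vector = new PineconeVector({
  apiKey: process.env.PINECONE_API_KEY!,
});

const memory = new Memory({
  storage, // default LibSQL storage from the first step
  vector,
  embedding: {
    provider: "OPEN_AI",
    model: "text-embedding-3-small",
    maxRetries: 3,
  },
});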

For vector database configuration: https://mastra.ai/docs/rag/vector-databases

Configure Memory Parameters

Here's a reasonable set of retrieval and search parameters to give an agent. It includes a window of recent messages, older messages that are semantically similar to the current one, and the context surrounding those matches. In the configuration below, each query retrieves the two most similar past messages (topK: 2), plus the two messages before and after each match (messageRange: 2), alongside the 100 most recent messages.

const memory = new Memory({
  storage,
  vector,
  options: {
    lastMessages: 100, // Number of recent messages to include
    semanticRecall: {
      topK: 2, // Number of similar messages to retrieve
      messageRange: 2, // Number of messages before/after each result
    },
  },
});

Now let's create an agent with the memory configuration:

import { Agent } from "@mastra/core";
import { memory } from "./memory";

const agent = new Agent({
  name: "memory-agent", // example name
  instructions: "You are a helpful assistant with memory of past conversations.",
  memory,
  // ...plus model and tool configuration
});

Congrats! You've built an agent with memory.
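To see the memory in action, send two messages on the same thread; the second call has the first exchange available as context (resource and thread IDs are covered in the next section):

// Both calls share a thread, so the agent can recall the earlier exchange.
await agent.stream("My favorite color is blue.", {
  resourceId: "user_123",
  threadId: "thread_123",
});

await agent.stream("What's my favorite color?", {
  resourceId: "user_123",
  threadId: "thread_123",
});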

Advanced Memory Usage

Using Resources and Threads

When making requests, you can pass identifiers that scope memory to a particular user and conversation, so that the same threads can be shared with other agents:

  • resourceId: Identifier for the user/entity making the request
  • threadId: Identifier for the conversation thread

await agent.stream("Message text", {
  resourceId: "user_123",
  threadId: "thread_123",
});
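Because threads are keyed by these IDs, another agent configured with the same memory instance can pick up the same conversation. A minimal sketch, where supportAgent is a hypothetical second agent built with the memory from earlier:

// A second agent resumes the same thread and sees its history.
await supportAgent.stream("Summarize this conversation so far.", {
  resourceId: "user_123",
  threadId: "thread_123",
});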

Overriding Memory Parameters

While an agent has a default memory configuration, you can override those parameters for an individual request:

await agent.stream("Message text", {
  memoryOptions: {
    lastMessages: 10,
    semanticRecall: {
      topK: 3,
      messageRange: 5,
    },
  },
});

Summary

The Memory API implements these core functions:

  1. Message Storage: Stores messages and metadata in SQL tables, indexed by thread and resource IDs
  2. Vector Search: Converts messages to embeddings for similarity search
  3. Context Management: Retrieves recent messages and semantically similar historical context
  4. Thread Sharing: Enables multiple agents to access shared conversation threads
