To work effectively, AI agents need the right context from previous interactions. But the right context is often hard to define programmatically! This is where agent memory comes in.
Implementing a memory system for agents usually involves:
- Persisting memory in backend storage
- Storing and retrieving conversation history
- Finding relevant context from past interactions using semantic search
So we built a Memory API for agents in Mastra. This includes:
- Conversation context management
- Semantic search over past interactions
- Storage backend abstraction
- Thread sharing between agents
This post will walk you through implementing these capabilities in your application, from local development to production deployment.
Implementation Guide
Let's build an agent with memory capabilities. Mastra's Memory API comes with sensible defaults that work out of the box, while still being configurable when needed.
Prerequisites:
npm install @mastra/core @mastra/memory
Setting Up Memory
The default configuration initializes a LibSQL database for memory storage, retrieval, and semantic search, and uses the fastembed-js library to download a local model for on-device embedding.
Because of this, you can initialize a Memory instance without any configuration:
import { Memory } from "@mastra/memory";
const memory = new Memory();
Now let's create an agent and attach memory to it:
import { Agent } from "@mastra/core";
import { Memory } from "@mastra/memory";
const memory = new Memory();
const agent = new Agent({
memory,
// Additional agent configuration (name, instructions, model, etc.)
});
That's it! You've built an agent with a memory system that includes:
- Persistent storage via LibSQL
- Built-in vector search and embedding capabilities via fastembed-js
- Automatic context management
For custom storage backends, vector databases, or embedding models, check out the Memory Configuration Guide.
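As a quick sanity check, here's a hypothetical exchange showing memory at work. The IDs are placeholders, and resourceId and threadId are covered in more detail below.

// First request: tell the agent a fact
await agent.stream("My favorite color is blue.", {
  resourceId: "user_1",
  threadId: "thread_1",
});

// A later request in the same thread: the stored history lets the
// agent recall the fact without it being restated
await agent.stream("What's my favorite color?", {
  resourceId: "user_1",
  threadId: "thread_1",
});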
Configure Memory Parameters
Here's a reasonable set of message retrieval and search parameters you might give an agent. It pulls in a window of recent messages, older messages that are semantically similar, and the context surrounding those matches.
const memory = new Memory({
options: {
lastMessages: 100, // Number of recent messages to include
semanticRecall: {
topK: 2, // Number of similar messages to retrieve
messageRange: 2, // Number of messages before/after each result
},
},
});
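With these settings, the recalled context per request is roughly lastMessages plus topK × (1 + 2 × messageRange) messages: here, 100 + 2 × 5 = 110 messages, ignoring any overlap between the recent window and the semantic matches.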
Advanced Memory Usage
Using Resources and Threads
When making a request, you can pass two identifiers that scope the agent's memory, and that let other agents attach to the same resources or threads:
- resourceId: Identifier for the user/entity making the request
- threadId: Identifier for the conversation thread
await agent.stream("Message text", {
resourceId: "user_123",
threadId: "thread_123",
});
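Because threads are keyed by these IDs rather than by a particular agent, a second agent backed by the same Memory instance can join the conversation. Here's a minimal sketch (supportAgent is a hypothetical second agent):

// A second agent sharing the same Memory (and therefore the same storage)
const supportAgent = new Agent({
  memory,
  // Additional agent configuration options
});

// It can read and extend the same conversation thread
await supportAgent.stream("Follow-up message", {
  resourceId: "user_123",
  threadId: "thread_123",
});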
Overriding Memory Parameters
While an agent has a default memory configuration, you can override those parameters for an individual request:
await agent.stream("Message text", {
memoryOptions: {
lastMessages: 10,
semanticRecall: {
topK: 3,
messageRange: 5,
},
},
});
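Per-request overrides are handy when one agent serves mixed workloads, for example shrinking lastMessages for quick, self-contained commands while widening topK and messageRange for questions that lean on older history.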
Summary
The Memory API implements these core functions:
- Message Storage: Stores messages and metadata in SQL tables, indexed by thread and resource IDs
- Vector Search: Converts messages to embeddings for similarity search
- Context Management: Retrieves recent messages and semantically similar historical context
- Thread Sharing: Enables multiple agents to access shared conversation threads