Memory Processors
This example demonstrates how to use memory processors to limit token usage, filter out tool calls, and create a simple custom processor.
Setup
First, install the memory package:
npm install @mastra/memory
# or
pnpm add @mastra/memory
# or
yarn add @mastra/memory
Basic Memory Setup with Processors
import { Memory } from "@mastra/memory";
import { TokenLimiter, ToolCallFilter } from "@mastra/memory/processors";
// Create memory with processors (token limiter last, so it measures the filtered messages)
const memory = new Memory({
  processors: [new ToolCallFilter(), new TokenLimiter(127000)],
});
Using Token Limiting
The TokenLimiter helps you stay within your model’s context window:
import { Memory } from "@mastra/memory";
import { TokenLimiter } from "@mastra/memory/processors";
// Set up memory with a token limit
const memory = new Memory({
  processors: [
    // Limit to approximately 127000 tokens (for GPT-4o)
    new TokenLimiter(127000),
  ],
});
You can also specify a different encoding if needed:
import { Memory } from "@mastra/memory";
import { TokenLimiter } from "@mastra/memory/processors";
import cl100k_base from "js-tiktoken/ranks/cl100k_base";
const memory = new Memory({
  processors: [
    new TokenLimiter({
      limit: 16000,
      encoding: cl100k_base, // Specific encoding for certain models, e.g. GPT-3.5
    }),
  ],
});
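To make the idea of a token budget concrete, here is a minimal sketch of trimming a message history so it fits a limit. It is not the TokenLimiter implementation: the ChatMessage type, the estimateTokens helper, and the trimToTokenBudget function are hypothetical names, and the four-characters-per-token estimate is a rough assumption standing in for a real tokenizer.

```typescript
// Hypothetical message shape for illustration; not the Mastra CoreMessage type.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Crude estimate: roughly 4 characters per token (an assumption, not a real tokenizer).
function estimateTokens(message: ChatMessage): number {
  return Math.ceil(message.content.length / 4);
}

// Walk from newest to oldest, keeping messages until the budget would be exceeded.
function trimToTokenBudget(messages: ChatMessage[], limit: number): ChatMessage[] {
  const kept: ChatMessage[] = [];
  let used = 0;
  for (const message of [...messages].reverse()) {
    const cost = estimateTokens(message);
    if (used + cost > limit) break;
    kept.unshift(message); // restore chronological order
    used += cost;
  }
  return kept;
}

const history: ChatMessage[] = [
  { role: "user", content: "a".repeat(40) }, // ~10 tokens
  { role: "assistant", content: "b".repeat(40) }, // ~10 tokens
  { role: "user", content: "c".repeat(40) }, // ~10 tokens
];

const trimmed = trimToTokenBudget(history, 25);
console.log(trimmed.length); // 2 (the oldest message no longer fits)
```

Dropping from the oldest end keeps the most recent turns, which are usually the ones the model needs most.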
Filtering Tool Calls
The ToolCallFilter processor removes tool calls and their results from memory:
import { Memory } from "@mastra/memory";
import { ToolCallFilter } from "@mastra/memory/processors";
// Filter out all tool calls
const memoryNoTools = new Memory({
  processors: [new ToolCallFilter()],
});
// Filter specific tool calls
const memorySelectiveFilter = new Memory({
  processors: [
    new ToolCallFilter({
      exclude: ["imageGenTool", "clipboardTool"],
    }),
  ],
});
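The two modes above can be illustrated with a small standalone sketch. It assumes that exclude names the tools whose calls and results are removed, as the comments in the example suggest. The message types and the filterToolMessages function are hypothetical simplifications, not the library's types or implementation.

```typescript
// Hypothetical, simplified message shapes; the real message types are richer.
type TextMessage = { kind: "text"; role: "user" | "assistant"; text: string };
type ToolCallMessage = { kind: "tool-call"; toolName: string; args: unknown };
type ToolResultMessage = { kind: "tool-result"; toolName: string; result: unknown };
type Message = TextMessage | ToolCallMessage | ToolResultMessage;

function filterToolMessages(messages: Message[], exclude?: string[]): Message[] {
  return messages.filter((message) => {
    if (message.kind === "text") return true;
    // No list given: drop every tool call and tool result.
    if (exclude === undefined) return false;
    // Otherwise drop only the calls and results of the named tools.
    return !exclude.includes(message.toolName);
  });
}

const conversation: Message[] = [
  { kind: "text", role: "user", text: "Draw a cat and look up cat facts" },
  { kind: "tool-call", toolName: "imageGenTool", args: { prompt: "cat" } },
  { kind: "tool-result", toolName: "imageGenTool", result: "<image data>" },
  { kind: "tool-call", toolName: "searchTool", args: { query: "cat facts" } },
  { kind: "tool-result", toolName: "searchTool", result: "3 results" },
  { kind: "text", role: "assistant", text: "Here is your cat." },
];

const withoutTools = filterToolMessages(conversation);
const withoutImageGen = filterToolMessages(conversation, ["imageGenTool"]);

console.log(withoutTools.length); // 2 (only the text messages remain)
console.log(withoutImageGen.length); // 4 (searchTool call and result are kept)
```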
Combining Multiple Processors
Processors run in the order they are defined:
import { Memory } from "@mastra/memory";
import { TokenLimiter, ToolCallFilter } from "@mastra/memory/processors";
const memory = new Memory({
  processors: [
    // First filter out tool calls
    new ToolCallFilter({ exclude: ["imageGenTool"] }),
    // Then limit tokens (always put token limiter last for accurate measuring after other filters/transforms)
    new TokenLimiter(16000),
  ],
});
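Why the order matters can be seen in a small standalone sketch. The Msg and Processor types and the runProcessors helper are hypothetical; each processor is modeled as a plain function, and a "keep the last two messages" step stands in for a token limit.

```typescript
// Hypothetical types: each processor is modeled as a function over a message list.
type Msg = { role: string; content: string };
type Processor = (messages: Msg[]) => Msg[];

// Apply processors left to right, feeding each one the previous output.
function runProcessors(messages: Msg[], processors: Processor[]): Msg[] {
  return processors.reduce((current, processor) => processor(current), messages);
}

const dropToolMessages: Processor = (messages) =>
  messages.filter((m) => m.role !== "tool");
// Stand-in for a token limit: keep only the two most recent messages.
const keepLastTwo: Processor = (messages) => messages.slice(-2);

const msgs: Msg[] = [
  { role: "user", content: "Draw a cat" },
  { role: "assistant", content: "Sure, generating it now" },
  { role: "tool", content: "<large image payload>" },
  { role: "tool", content: "<large image payload>" },
];

const filterThenLimit = runProcessors(msgs, [dropToolMessages, keepLastTwo]);
const limitThenFilter = runProcessors(msgs, [keepLastTwo, dropToolMessages]);

console.log(filterThenLimit.length); // 2 (both text messages survive)
console.log(limitThenFilter.length); // 0 (the budget was spent on tool messages that were then removed)
```

Running the limiter first spends the budget on content that a later filter discards, which is why the limiter belongs at the end of the list.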
Creating a Simple Custom Processor
You can create your own processors by extending the MemoryProcessor class:
import type { CoreMessage } from "@mastra/core";
import { MemoryProcessor } from "@mastra/core/memory";
import { Memory } from "@mastra/memory";
import { TokenLimiter } from "@mastra/memory/processors";
// Simple processor that keeps only the most recent messages
class RecentMessagesProcessor extends MemoryProcessor {
  private limit: number;
  constructor(limit: number = 10) {
    super();
    this.limit = limit;
  }
  process(messages: CoreMessage[]): CoreMessage[] {
    // Keep only the most recent messages
    return messages.slice(-this.limit);
  }
}
// Use the custom processor
const memory = new Memory({
  processors: [
    new RecentMessagesProcessor(5), // Keep only the last 5 messages
    new TokenLimiter(16000),
  ],
});
Note: this example is kept simple to show how custom processors work. You can limit messages more efficiently using new Memory({ options: { lastMessages: 5 } }). Memory processors are applied after messages are retrieved from storage, while options.lastMessages is applied before messages are fetched from storage, so fewer messages are loaded in the first place.
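A quick way to sanity-check the logic of a processor like RecentMessagesProcessor is to call its process method directly on sample data. The sketch below reproduces the same slicing logic without the Mastra base class so it can run on its own; SimpleMessage and RecentMessagesSketch are hypothetical stand-ins, not library types.

```typescript
// Hypothetical minimal message type, standing in for CoreMessage.
type SimpleMessage = { role: "user" | "assistant"; content: string };

// Same slicing logic as RecentMessagesProcessor, without the Mastra base class.
class RecentMessagesSketch {
  private limit: number;

  constructor(limit: number = 10) {
    this.limit = limit;
  }

  process(messages: SimpleMessage[]): SimpleMessage[] {
    // A negative start index counts from the end of the array.
    return messages.slice(-this.limit);
  }
}

// Build eight alternating user/assistant messages: "message 1" to "message 8".
const sample: SimpleMessage[] = Array.from(
  { length: 8 },
  (_, i): SimpleMessage => ({
    role: i % 2 === 0 ? "user" : "assistant",
    content: `message ${i + 1}`,
  }),
);

const recent = new RecentMessagesSketch(3).process(sample);
console.log(recent.map((m) => m.content)); // [ 'message 6', 'message 7', 'message 8' ]
```

If the history is shorter than the limit, slice returns every message unchanged, so no special case is needed.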
Integration with an Agent
Here’s how to use memory with processors in an agent:
import { Agent } from "@mastra/core";
import { Memory } from "@mastra/memory";
import { TokenLimiter, ToolCallFilter } from "@mastra/memory/processors";
import { openai } from "@ai-sdk/openai";
// Set up memory with processors
const memory = new Memory({
  processors: [
    new ToolCallFilter({ exclude: ["debugTool"] }),
    new TokenLimiter(16000),
  ],
});
// Create an agent with the memory
const agent = new Agent({
  name: "ProcessorAgent",
  instructions: "You are a helpful assistant with processed memory.",
  model: openai("gpt-4o-mini"),
  memory,
});
// Use the agent
const response = await agent.stream("Hi, can you remember our conversation?", {
  threadId: "unique-thread-id",
  resourceId: "user-123",
});
// Print the response as it streams in
for await (const chunk of response.textStream) {
  process.stdout.write(chunk);
}
Summary
This example demonstrates:
- Setting up memory with token limiting to prevent context window overflow
- Filtering out tool calls to reduce noise and token usage
- Creating a simple custom processor to keep only recent messages
- Combining multiple processors in the correct order
- Integrating processed memory with an agent
For more details on memory processors, check out the Memory Processors documentation.