Skip to Content
ExamplesMemoryMemory Processors

Memory Processors

This example demonstrates how to use memory processors to limit token usage, filter out tool calls, and create a simple custom processor.

Setup

First, install the memory package:

npm install @mastra/memory # or pnpm add @mastra/memory # or yarn add @mastra/memory

Basic Memory Setup with Processors

import { Memory } from "@mastra/memory"; import { TokenLimiter, ToolCallFilter } from "@mastra/memory/processors"; // Create memory with processors const memory = new Memory({ processors: [new TokenLimiter(127000), new ToolCallFilter()], });

Using Token Limiting

The TokenLimiter helps you stay within your model’s context window:

import { Memory } from "@mastra/memory"; import { TokenLimiter } from "@mastra/memory/processors"; // Set up memory with a token limit const memory = new Memory({ processors: [ // Limit to approximately 12700 tokens (for GPT-4o) new TokenLimiter(127000), ], });

You can also specify a different encoding if needed:

import { Memory } from "@mastra/memory"; import { TokenLimiter } from "@mastra/memory/processors"; import cl100k_base from "js-tiktoken/ranks/cl100k_base"; const memory = new Memory({ processors: [ new TokenLimiter({ limit: 16000, encoding: cl100k_base, // Specific encoding for certain models eg GPT-3.5 }), ], });

Filtering Tool Calls

The ToolCallFilter processor removes tool calls and their results from memory:

import { Memory } from "@mastra/memory"; import { ToolCallFilter } from "@mastra/memory/processors"; // Filter out all tool calls const memoryNoTools = new Memory({ processors: [new ToolCallFilter()], }); // Filter specific tool calls const memorySelectiveFilter = new Memory({ processors: [ new ToolCallFilter({ exclude: ["imageGenTool", "clipboardTool"], }), ], });

Combining Multiple Processors

Processors run in the order they are defined:

import { Memory } from "@mastra/memory"; import { TokenLimiter, ToolCallFilter } from "@mastra/memory/processors"; const memory = new Memory({ processors: [ // First filter out tool calls new ToolCallFilter({ exclude: ["imageGenTool"] }), // Then limit tokens (always put token limiter last for accurate measuring after other filters/transforms) new TokenLimiter(16000), ], });

Creating a Simple Custom Processor

You can create your own processors by extending the MemoryProcessor class:

import type { CoreMessage } from "@mastra/core"; import { MemoryProcessor } from "@mastra/core/memory"; import { Memory } from "@mastra/memory"; // Simple processor that keeps only the most recent messages class RecentMessagesProcessor extends MemoryProcessor { private limit: number; constructor(limit: number = 10) { super(); this.limit = limit; } process(messages: CoreMessage[]): CoreMessage[] { // Keep only the most recent messages return messages.slice(-this.limit); } } // Use the custom processor const memory = new Memory({ processors: [ new RecentMessagesProcessor(5), // Keep only the last 5 messages new TokenLimiter(16000), ], });

Note: this example is for simplicity of understanding how custom processors work - you can limit messages more efficiently using new Memory({ options: { lastMessages: 5 } }). Memory processors are applied after memories are retrieved from storage, while options.lastMessages is applied before messages are fetched from storage.

Integration with an Agent

Here’s how to use memory with processors in an agent:

import { Agent } from "@mastra/core"; import { Memory, TokenLimiter, ToolCallFilter } from "@mastra/memory"; import { openai } from "@ai-sdk/openai"; // Set up memory with processors const memory = new Memory({ processors: [ new ToolCallFilter({ exclude: ["debugTool"] }), new TokenLimiter(16000), ], }); // Create an agent with the memory const agent = new Agent({ name: "ProcessorAgent", instructions: "You are a helpful assistant with processed memory.", model: openai("gpt-4o-mini"), memory, }); // Use the agent const response = await agent.stream("Hi, can you remember our conversation?", { threadId: "unique-thread-id", resourceId: "user-123", }); for await (const chunk of response.textStream) { process.stdout.write(chunk); }

Summary

This example demonstrates:

  1. Setting up memory with token limiting to prevent context window overflow
  2. Filtering out tool calls to reduce noise and token usage
  3. Creating a simple custom processor to keep only recent messages
  4. Combining multiple processors in the correct order
  5. Integrating processed memory with an agent

For more details on memory processors, check out the Memory Processors documentation.