
Memory Processors

Memory Processors allow you to modify the list of messages retrieved from memory before they are added to the agent’s context window and sent to the LLM. This is useful for managing context size, filtering content, and optimizing performance.

Processors operate on the messages retrieved based on your memory configuration (e.g., lastMessages, semanticRecall). They do not affect the new incoming user message.
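For example, with a configuration like the one below, processors receive the recent and semantically recalled messages before they reach the LLM. (This is a minimal sketch; the retrieval option values are illustrative and should be tuned for your application.)

```typescript
import { Memory } from "@mastra/memory";
import { TokenLimiter } from "@mastra/memory/processors";

const memory = new Memory({
  options: {
    lastMessages: 20, // retrieve the 20 most recent messages (illustrative value)
    semanticRecall: { topK: 3, messageRange: 2 }, // plus semantically similar ones
  },
  // Processors run over the retrieved messages configured above,
  // never over the new incoming user message.
  processors: [new TokenLimiter(127000)],
});
```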

Built-in Processors

Mastra provides built-in processors:

TokenLimiter

This processor prevents errors caused by exceeding the LLM’s context window limit. It counts the tokens in the retrieved memory messages and removes the oldest messages until the total falls below the specified limit.

```typescript
import { Memory } from "@mastra/memory";
import { TokenLimiter } from "@mastra/memory/processors";
import { Agent } from "@mastra/core";
import { openai } from "@ai-sdk/openai";

const agent = new Agent({
  model: openai("gpt-4o"),
  memory: new Memory({
    processors: [
      // Ensure the total tokens from memory don't exceed ~127k
      new TokenLimiter(127000),
    ],
  }),
});
```

The TokenLimiter uses the o200k_base encoding by default (suitable for GPT-4o). You can specify other encodings if needed for different models:

```typescript
// Import the encoding you need (e.g., for older OpenAI models)
import cl100k_base from "js-tiktoken/ranks/cl100k_base";

const memoryForOlderModel = new Memory({
  processors: [
    new TokenLimiter({
      limit: 16000, // Example limit for a 16k context model
      encoding: cl100k_base,
    }),
  ],
});
```

See the OpenAI cookbook or js-tiktoken repo for more on encodings.

ToolCallFilter

This processor removes tool calls and their results from the memory messages sent to the LLM. It saves tokens by excluding potentially verbose tool interactions from the context, which is useful when those details aren’t needed for future interactions. It’s also useful when you want your agent to always call a specific tool again rather than rely on previous tool results stored in memory.

```typescript
import { Memory } from "@mastra/memory";
import { ToolCallFilter, TokenLimiter } from "@mastra/memory/processors";

const memoryFilteringTools = new Memory({
  processors: [
    // Example 1: Remove all tool calls/results
    new ToolCallFilter(),

    // Example 2: Remove only noisy image generation tool calls/results
    new ToolCallFilter({ exclude: ["generateImageTool"] }),

    // Always place TokenLimiter last
    new TokenLimiter(127000),
  ],
});
```

Applying Multiple Processors

You can chain multiple processors. They execute in the order they appear in the processors array. The output of one processor becomes the input for the next.

Order matters! It’s generally best practice to place TokenLimiter last in the chain. This ensures it operates on the final set of messages after other filtering has occurred, providing the most accurate token limit enforcement.

```typescript
import { Memory } from "@mastra/memory";
import { ToolCallFilter, TokenLimiter } from "@mastra/memory/processors";
// Assume a hypothetical 'PIIFilter' custom processor exists
// import { PIIFilter } from "./custom-processors";

const memoryWithMultipleProcessors = new Memory({
  processors: [
    // 1. Filter specific tool calls first
    new ToolCallFilter({ exclude: ["verboseDebugTool"] }),

    // 2. Apply custom filtering (e.g., remove hypothetical PII - use with caution)
    // new PIIFilter(),

    // 3. Apply token limiting as the final step
    new TokenLimiter(127000),
  ],
});
```

Creating Custom Processors

You can create custom logic by extending the base MemoryProcessor class.

```typescript
import { Memory, CoreMessage } from "@mastra/memory";
import { MemoryProcessor, MemoryProcessorOpts } from "@mastra/core/memory";
import { TokenLimiter } from "@mastra/memory/processors";

class ConversationOnlyFilter extends MemoryProcessor {
  constructor() {
    // Provide a name for easier debugging if needed
    super({ name: "ConversationOnlyFilter" });
  }

  process(
    messages: CoreMessage[],
    _opts: MemoryProcessorOpts = {}, // Options passed during memory retrieval, rarely needed here
  ): CoreMessage[] {
    // Filter messages based on role
    return messages.filter(
      (msg) => msg.role === "user" || msg.role === "assistant",
    );
  }
}

// Use the custom processor
const memoryWithCustomFilter = new Memory({
  processors: [
    new ConversationOnlyFilter(),
    new TokenLimiter(127000), // Still apply token limiting
  ],
});
```

When creating custom processors, avoid mutating the input messages array or its objects directly.
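For example, a processor that shortens overly long messages should build new message objects rather than editing the retrieved ones in place. (This is a sketch; the class name and character threshold are illustrative, not part of Mastra’s API.)

```typescript
import { CoreMessage } from "@mastra/memory";
import { MemoryProcessor, MemoryProcessorOpts } from "@mastra/core/memory";

// Illustrative example: truncate long string contents without
// mutating the original messages. Name and threshold are arbitrary.
class TruncateLongMessages extends MemoryProcessor {
  constructor(private maxChars = 1000) {
    super({ name: "TruncateLongMessages" });
  }

  process(
    messages: CoreMessage[],
    _opts: MemoryProcessorOpts = {},
  ): CoreMessage[] {
    // map() returns a new array, and the spread creates new message
    // objects, leaving the input messages untouched.
    return messages.map((msg) =>
      typeof msg.content === "string" && msg.content.length > this.maxChars
        ? { ...msg, content: msg.content.slice(0, this.maxChars) }
        : msg,
    );
  }
}
```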