# Memory Processors
Memory Processors allow you to modify the list of messages retrieved from memory before they are added to the agent’s context window and sent to the LLM. This is useful for managing context size, filtering content, and optimizing performance.
Processors operate on the messages retrieved based on your memory configuration (e.g., `lastMessages`, `semanticRecall`). They do not affect the new incoming user message.
## Built-in Processors
Mastra provides built-in processors:
### TokenLimiter
This processor prevents errors caused by exceeding the LLM's context window limit. It counts the tokens in the retrieved memory messages and removes the oldest messages until the total count is below the specified limit.
```typescript
import { Memory } from "@mastra/memory";
import { TokenLimiter } from "@mastra/memory/processors";
import { Agent } from "@mastra/core";
import { openai } from "@ai-sdk/openai";

const agent = new Agent({
  name: "memory-limited-agent", // illustrative name
  instructions: "You are a helpful assistant.", // illustrative instructions
  model: openai("gpt-4o"),
  memory: new Memory({
    processors: [
      // Ensure the total tokens from memory don't exceed ~127k
      new TokenLimiter(127000),
    ],
  }),
});
```
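As a quick usage sketch, recall (and therefore the processor chain) runs when the agent generates a response for a thread. The `threadId`/`resourceId` values below are placeholders, and the option names assume the thread-scoped call signature from Mastra's memory docs:

```typescript
// Hypothetical usage: the TokenLimiter trims recalled history from this
// thread before it reaches the model.
const response = await agent.generate("What did we discuss earlier?", {
  threadId: "thread-123", // placeholder IDs
  resourceId: "user-456",
});
```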
The `TokenLimiter` uses the `o200k_base` encoding by default (suitable for GPT-4o). You can specify other encodings if needed for different models:
```typescript
// Import the encoding you need (e.g., for older OpenAI models)
import cl100k_base from "js-tiktoken/ranks/cl100k_base";

const memoryForOlderModel = new Memory({
  processors: [
    new TokenLimiter({
      limit: 16000, // Example limit for a 16k-context model
      encoding: cl100k_base,
    }),
  ],
});
```
See the OpenAI cookbook or the `js-tiktoken` repo for more on encodings.
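The default can also be passed explicitly; `js-tiktoken` ships `o200k_base` under the same `ranks` path as the `cl100k_base` import above (shown here for symmetry, equivalent to the default behavior):

```typescript
// Spelling out the default encoding explicitly
import o200k_base from "js-tiktoken/ranks/o200k_base";

const memoryForGpt4o = new Memory({
  processors: [new TokenLimiter({ limit: 127000, encoding: o200k_base })],
});
```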
### ToolCallFilter
This processor removes tool calls from the memory messages sent to the LLM. It saves tokens by excluding potentially verbose tool interactions from the context, which is useful if the details aren’t needed for future interactions. It’s also useful if you always want your agent to call a specific tool again and not rely on previous tool results in memory.
```typescript
import { Memory } from "@mastra/memory";
import { ToolCallFilter, TokenLimiter } from "@mastra/memory/processors";

const memoryFilteringTools = new Memory({
  processors: [
    // Example 1: Remove all tool calls/results
    new ToolCallFilter(),
    // Example 2: Remove only noisy image generation tool calls/results
    // (in practice you would pick one of these two, not both)
    new ToolCallFilter({ exclude: ["generateImageTool"] }),
    // Always place TokenLimiter last
    new TokenLimiter(127000),
  ],
});
```
## Applying Multiple Processors
You can chain multiple processors. They execute in the order they appear in the `processors` array, and the output of one processor becomes the input for the next.
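Conceptually, the chain behaves like a reduce over the processors (a simplified illustration, not Mastra's actual internals):

```typescript
import type { CoreMessage } from "@mastra/memory";
import type { MemoryProcessor } from "@mastra/core/memory";

// Each processor receives the previous processor's output.
function applyChain(
  recalled: CoreMessage[],
  processors: MemoryProcessor[],
): CoreMessage[] {
  return processors.reduce((msgs, processor) => processor.process(msgs, {}), recalled);
}
```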
Order matters! It's generally best practice to place `TokenLimiter` last in the chain. This ensures it operates on the final set of messages after other filtering has occurred, providing the most accurate token limit enforcement.
```typescript
import { Memory } from "@mastra/memory";
import { ToolCallFilter, TokenLimiter } from "@mastra/memory/processors";
// Assume a hypothetical 'PIIFilter' custom processor exists
// import { PIIFilter } from './custom-processors';

const memoryWithMultipleProcessors = new Memory({
  processors: [
    // 1. Filter specific tool calls first
    new ToolCallFilter({ exclude: ["verboseDebugTool"] }),
    // 2. Apply custom filtering (e.g., remove hypothetical PII - use with caution)
    // new PIIFilter(),
    // 3. Apply token limiting as the final step
    new TokenLimiter(127000),
  ],
});
```
## Creating Custom Processors
You can create custom logic by extending the base `MemoryProcessor` class.
```typescript
import { Memory, CoreMessage } from "@mastra/memory";
import { MemoryProcessor, MemoryProcessorOpts } from "@mastra/core/memory";
import { TokenLimiter } from "@mastra/memory/processors";

class ConversationOnlyFilter extends MemoryProcessor {
  constructor() {
    // Provide a name for easier debugging if needed
    super({ name: "ConversationOnlyFilter" });
  }

  process(
    messages: CoreMessage[],
    _opts: MemoryProcessorOpts = {}, // Options passed during memory retrieval, rarely needed here
  ): CoreMessage[] {
    // Filter messages based on role, keeping only the user/assistant exchange
    return messages.filter(
      (msg) => msg.role === "user" || msg.role === "assistant",
    );
  }
}

// Use the custom processor
const memoryWithCustomFilter = new Memory({
  processors: [
    new ConversationOnlyFilter(),
    new TokenLimiter(127000), // Still apply token limiting
  ],
});
```
When creating custom processors, avoid mutating the input `messages` array or its objects directly; return a new array (and new message objects where content changes) instead.
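For instance, the hypothetical `PIIFilter` mentioned earlier could be sketched along these lines (the class name and regex are illustrative, not part of Mastra); note that it builds new message objects rather than editing the ones it receives:

```typescript
import { MemoryProcessor } from "@mastra/core/memory";
import type { CoreMessage } from "@mastra/memory";

// Hypothetical example only -- real PII handling needs far more robust detection.
class PIIFilter extends MemoryProcessor {
  constructor() {
    super({ name: "PIIFilter" });
  }

  process(messages: CoreMessage[]): CoreMessage[] {
    // Copy each message that needs changes instead of mutating it
    return messages.map((msg) =>
      typeof msg.content === "string"
        ? { ...msg, content: msg.content.replace(/\S+@\S+\.\S+/g, "[REDACTED]") }
        : msg,
    );
  }
}
```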