ModerationProcessor
The ModerationProcessor is a hybrid processor that can be used for both input and output processing. It uses an LLM to detect inappropriate content across multiple categories, helping maintain content safety by evaluating messages against configurable moderation categories and applying a flexible strategy to flagged content.
Usage example
```typescript
import { ModerationProcessor } from "@mastra/core/processors";

const processor = new ModerationProcessor({
  model: "openai/gpt-4.1-nano",
  threshold: 0.7,
  strategy: "block",
  categories: ["hate", "harassment", "violence"]
});
```
Constructor parameters
- `options` (`Options`): Configuration options for content moderation.

Options

- `model` (`MastraModelConfig`): Model configuration for the moderation agent.
- `categories?` (`string[]`): Categories to check for moderation. If not specified, the default OpenAI moderation categories are used.
- `threshold?` (`number`): Confidence threshold for flagging, from 0 to 1. Content is flagged if any category score exceeds this threshold.
- `strategy?` (`'block' | 'warn' | 'filter'`): Strategy applied when content is flagged: `'block'` rejects with an error, `'warn'` logs a warning but allows the content through, and `'filter'` removes the flagged messages.
- `instructions?` (`string`): Custom moderation instructions for the agent. If not provided, default instructions are generated from the categories.
- `includeScores?` (`boolean`): Whether to include confidence scores in logs. Useful for tuning thresholds and debugging.
- `chunkWindow?` (`number`): Number of previous chunks to include as context when moderating stream chunks. For example, a value of 1 includes the one previous chunk.
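The threshold rule described above can be pictured as a simple check over per-category scores. This is an illustrative sketch only, not the library's implementation; `CategoryScores` and `isFlagged` are hypothetical names:

```typescript
// Illustrative sketch of the threshold rule: content is flagged if any
// category score exceeds the configured threshold (0-1).
type CategoryScores = Record<string, number>;

function isFlagged(scores: CategoryScores, threshold: number): boolean {
  return Object.values(scores).some((score) => score > threshold);
}

const scores: CategoryScores = { hate: 0.12, harassment: 0.85, violence: 0.03 };
console.log(isFlagged(scores, 0.7)); // true: harassment (0.85) exceeds 0.7
```

Lowering the threshold makes moderation stricter; enabling `includeScores` helps find a value that fits your traffic.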
Returns
- `name` (`string`): Processor name, set to `'moderation'`.
- `processInput` (`(args: { messages: MastraMessageV2[]; abort: (reason?: string) => never; tracingContext?: TracingContext }) => Promise<MastraMessageV2[]>`): Processes input messages to moderate content before it is sent to the LLM.
- `processOutputStream` (`(args: { part: ChunkType; streamParts: ChunkType[]; state: Record<string, any>; abort: (reason?: string) => never; tracingContext?: TracingContext }) => Promise<ChunkType | null | undefined>`): Processes streaming output parts to moderate content during streaming.
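The `ChunkType | null | undefined` return type of `processOutputStream` suggests the filtering contract during streaming: returning the chunk passes it through, while returning `null` omits it. The following is a hypothetical sketch of that contract under the `'filter'` strategy, with a simplified `ChunkType` and illustrative names, not the library's actual internals:

```typescript
// Simplified stand-in for the real ChunkType.
type ChunkType = { type: string; text: string };

// Sketch: run each stream part through a moderation callback; a null
// result means the chunk was flagged and is dropped from the output.
async function filterStream(
  chunks: ChunkType[],
  moderate: (part: ChunkType) => Promise<ChunkType | null>
): Promise<ChunkType[]> {
  const out: ChunkType[] = [];
  for (const part of chunks) {
    const result = await moderate(part);
    if (result !== null) out.push(result); // null => chunk filtered out
  }
  return out;
}
```

With `strategy: 'filter'`, the stream continues with flagged parts removed rather than aborting the whole response.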
Extended usage example
Input processing
src/mastra/agents/moderated-agent.ts
```typescript
import { Agent } from "@mastra/core/agent";
import { ModerationProcessor } from "@mastra/core/processors";

export const agent = new Agent({
  name: "moderated-agent",
  instructions: "You are a helpful assistant",
  model: "openai/gpt-4o-mini",
  inputProcessors: [
    new ModerationProcessor({
      model: "openai/gpt-4.1-nano",
      categories: ["hate", "harassment", "violence"],
      threshold: 0.7,
      strategy: "block",
      instructions: "Detect and flag inappropriate content in user messages",
      includeScores: true
    })
  ]
});
```
Output processing with batching
When using ModerationProcessor as an output processor, it is recommended to combine it with BatchPartsProcessor to optimize performance. BatchPartsProcessor groups stream chunks together before passing them to the moderator, reducing the number of LLM calls required for moderation.
src/mastra/agents/output-moderated-agent.ts
```typescript
import { Agent } from "@mastra/core/agent";
import { BatchPartsProcessor, ModerationProcessor } from "@mastra/core/processors";

export const agent = new Agent({
  name: "output-moderated-agent",
  instructions: "You are a helpful assistant",
  model: "openai/gpt-4o-mini",
  outputProcessors: [
    // Batch stream parts first to reduce LLM calls
    new BatchPartsProcessor({
      batchSize: 10,
    }),
    // Then apply moderation on batched content
    new ModerationProcessor({
      model: "openai/gpt-4.1-nano",
      strategy: "filter",
      chunkWindow: 1,
    }),
  ]
});
```
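The `chunkWindow: 1` setting above gives the moderator one preceding chunk of context when it evaluates each new stream part, so borderline text split across chunk boundaries is judged with its surroundings. A hypothetical sketch of that sliding window (the function name and shapes are illustrative, not the library's API):

```typescript
// Illustrative sliding window: return the current stream part plus up to
// `chunkWindow` preceding parts as moderation context.
function withContext(streamParts: string[], index: number, chunkWindow: number): string[] {
  const start = Math.max(0, index - chunkWindow);
  return streamParts.slice(start, index + 1);
}

// With chunkWindow = 1, moderating the part at index 2 also sees index 1.
console.log(withContext(["Hel", "lo ", "wor", "ld"], 2, 1)); // ["lo ", "wor"]
```

Larger windows give the moderation model more context per call at the cost of longer prompts.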