createScorer
Mastra provides a unified createScorer factory that allows you to define custom scorers for evaluating input/output pairs. You can use either native JavaScript functions or LLM-based prompt objects for each evaluation step. Custom scorers can be added to Agents and Workflow steps.
How to Create a Custom Scorer
Use the createScorer factory to define your scorer with a name, description, and optional judge configuration. Then chain step methods to build your evaluation pipeline. You must provide at least a generateScore step.
import { createScorer } from "@mastra/core/scorers";

const scorer = createScorer({
  name: "My Custom Scorer",
  description: "Evaluates responses based on custom criteria",
  type: "agent", // Optional: for agent evaluation with automatic typing
  judge: {
    model: myModel,
    instructions: "You are an expert evaluator...",
  },
})
  .preprocess({
    /* step config */
  })
  .analyze({
    /* step config */
  })
  .generateScore(({ run, results }) => {
    // Return a number
  })
  .generateReason({
    /* step config */
  });
createScorer Options
name: Unique identifier for the scorer (required).
description: What the scorer evaluates (required).
judge: Optional default judge configuration used by prompt-object steps (see Judge Object below).
type: Optional scorer type; set to "agent" for automatic agent input/output typing.
This function returns a scorer builder that you can chain step methods onto. See the MastraScorer reference for details on the .run() method and its input/output.
Judge Object
model: The language model the judge uses to run prompt-object steps.
instructions: System instructions that tell the judge how to evaluate.
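For example, a judge might pair a model with evaluation instructions. The openai provider below is illustrative; any model your setup supports works the same way:

import { openai } from "@ai-sdk/openai";

const judge = {
  model: openai("gpt-4o-mini"), // any supported model
  instructions:
    "You are an expert evaluator. Judge responses for accuracy and completeness.",
};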
Type Safety
You can specify input/output types when creating scorers for better type inference and IntelliSense support:
Agent Type Shortcut
For evaluating agents, use type: 'agent' to automatically get the correct types for agent input/output:
import { createScorer } from "@mastra/core/scorers";
// Agent scorer with automatic typing
const agentScorer = createScorer({
  name: "Agent Response Quality",
  description: "Evaluates agent responses",
  type: "agent", // Automatically provides ScorerRunInputForAgent/ScorerRunOutputForAgent
})
  .preprocess(({ run }) => {
    // run.input is automatically typed as ScorerRunInputForAgent
    const userMessage = run.input.inputMessages[0]?.content;
    return { userMessage };
  })
  .generateScore(({ run, results }) => {
    // run.output is automatically typed as ScorerRunOutputForAgent
    const response = run.output[0]?.content;
    return (response?.length ?? 0) > 10 ? 1.0 : 0.5;
  });
Custom Types with Generics
For custom input/output types, use the generic approach:
import { createScorer } from "@mastra/core/scorers";
type CustomInput = { query: string; context: string[] };
type CustomOutput = { answer: string; confidence: number };
const customScorer = createScorer<CustomInput, CustomOutput>({
  name: "Custom Scorer",
  description: "Evaluates custom data",
}).generateScore(({ run }) => {
  // run.input is typed as CustomInput
  // run.output is typed as CustomOutput
  return run.output.confidence;
});
Built-in Agent Types
ScorerRunInputForAgent - Contains inputMessages, rememberedMessages, systemMessages, and taggedSystemMessages for agent evaluation
ScorerRunOutputForAgent - Array of agent response messages
Using these types provides autocomplete, compile-time validation, and better documentation for your scoring logic.
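As a sketch of how these types surface in practice (the heuristic below is illustrative, and it assumes the fields listed above are message arrays whose entries have a content property):

import { createScorer } from "@mastra/core/scorers";

const contextScorer = createScorer({
  name: "Context Usage",
  description: "Checks whether remembered context was available to the agent",
  type: "agent",
}).generateScore(({ run }) => {
  // All four input fields listed above are typed and autocomplete-ready
  const hadMemory = run.input.rememberedMessages.length > 0;
  const reply = run.output[0]?.content ?? "";
  // Illustrative heuristic: reward non-empty replies when memory was present
  return hadMemory && reply.length > 0 ? 1 : 0;
});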
Trace Scoring with Agent Types
When you use type: 'agent', your scorer can be added directly to agents and can also score traces from agent interactions. The scorer automatically transforms trace data into the agent input/output format:
import { Mastra } from "@mastra/core";
import { createScorer } from "@mastra/core/scorers";

const agentTraceScorer = createScorer({
  name: "Agent Trace Length",
  description: "Evaluates agent response length",
  type: "agent",
}).generateScore(({ run }) => {
  // Trace data is automatically transformed to agent format
  const userMessages = run.input.inputMessages;
  const agentResponse = run.output[0]?.content;
  // Reward concise responses: 1 if 50 characters or fewer, else 0
  return (agentResponse?.length ?? 0) > 50 ? 0 : 1;
});

// Register with Mastra for trace scoring
const mastra = new Mastra({
  scorers: {
    agentTraceScorer,
  },
});
Step Method Signatures
preprocess
Optional preprocessing step that can extract or transform data before analysis.
Function Mode:
Function: ({ run, results }) => any
run.input: The input given to the evaluated run (for agent scorers, ScorerRunInputForAgent).
run.output: The output produced by the evaluated run (for agent scorers, ScorerRunOutputForAgent).
run.runId: Unique identifier for this scorer run.
run.runtimeContext: Runtime context from the agent or workflow step being evaluated.
results: Empty for this step, since preprocess runs first.
Returns: any
The method can return any value. The returned value will be available to subsequent steps as preprocessStepResult.
Prompt Object Mode:
description: A description of what this step does.
outputSchema: A Zod schema describing the structured output expected from the judge.
createPrompt: A function that builds the prompt sent to the judge.
judge: Optional step-level judge configuration that overrides the scorer-level judge.
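A prompt-object preprocess step might look like the following sketch. The Zod schema and prompt wording are illustrative, and createPrompt is assumed to receive the same { run, results } context as function-mode steps:

import { z } from "zod";

const scorer = createScorer({
  name: "Claim Extractor",
  description: "Extracts and evaluates factual claims",
  judge: { model: myModel, instructions: "You are an expert evaluator..." },
}).preprocess({
  description: "Extract the factual claims made in the response",
  outputSchema: z.object({
    claims: z.array(z.string()),
  }),
  createPrompt: ({ run }) =>
    `List every factual claim in this response:\n\n${JSON.stringify(run.output)}`,
});
// ...a generateScore step is still required before the scorer can run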
analyze
Optional analysis step that processes the input/output and any preprocessed data.
Function Mode:
Function: ({ run, results }) => any
run.input: The input given to the evaluated run.
run.output: The output produced by the evaluated run.
run.runId: Unique identifier for this scorer run.
run.runtimeContext: Runtime context from the agent or workflow step being evaluated.
results.preprocessStepResult: The value returned by the preprocess step, if one was defined.
Returns: any
The method can return any value. The returned value will be available to subsequent steps as analyzeStepResult.
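For instance, continuing the typed agent scorer from earlier, an analyze step can build on what preprocess returned (the returned field names here are illustrative):

.analyze(({ run, results }) => {
  // preprocess returned { userMessage } in the earlier example
  const { userMessage } = results.preprocessStepResult;
  const response = run.output[0]?.content ?? "";
  return {
    mentionsQuestion: userMessage ? response.includes(userMessage) : false,
    responseLength: response.length,
  };
})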
Prompt Object Mode:
description: A description of what this step does.
outputSchema: A Zod schema describing the structured output expected from the judge.
createPrompt: A function that builds the prompt sent to the judge.
judge: Optional step-level judge configuration that overrides the scorer-level judge.
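Prompt-object analyze steps have the same shape as preprocess; the optional judge field lets one step use a different model or instructions than the scorer-level judge. A hedged sketch (z is from zod as before, and strictModel is a hypothetical model reference):

.analyze({
  description: "Check each extracted claim for accuracy",
  outputSchema: z.object({
    verdicts: z.array(z.object({ claim: z.string(), accurate: z.boolean() })),
  }),
  createPrompt: ({ results }) =>
    `For each claim, state whether it is accurate:\n${JSON.stringify(results.preprocessStepResult)}`,
  // Step-level judge overrides the scorer-level judge for this step only
  judge: { model: strictModel, instructions: "You are a meticulous fact checker." },
})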
generateScore
Required step that computes the final numerical score.
Function Mode:
Function: ({ run, results }) => number
run.input: The input given to the evaluated run.
run.output: The output produced by the evaluated run.
run.runId: Unique identifier for this scorer run.
run.runtimeContext: Runtime context from the agent or workflow step being evaluated.
results.preprocessStepResult: The value returned by the preprocess step, if one was defined.
results.analyzeStepResult: The value returned by the analyze step, if one was defined.
Returns: number
The method must return a numerical score.
Prompt Object Mode:
description: A description of what this step does.
outputSchema: A Zod schema describing the structured output expected from the judge.
createPrompt: A function that builds the prompt sent to the judge.
judge: Optional step-level judge configuration that overrides the scorer-level judge.
When using prompt object mode, you must also provide a calculateScore function to convert the LLM output to a numerical score:
calculateScore: A function that converts the judge's structured output into a numerical score.
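A sketch of a prompt-object generateScore step. The results key read inside calculateScore is an assumption; check the MastraScorer reference for the exact shape:

.generateScore({
  description: "Rate overall factual accuracy",
  outputSchema: z.object({
    verdicts: z.array(z.object({ accurate: z.boolean() })),
  }),
  createPrompt: ({ results }) =>
    `Rate the accuracy of this analysis:\n${JSON.stringify(results.analyzeStepResult)}`,
  calculateScore: ({ results }) => {
    // Assumed key: the judge's structured output, validated by outputSchema
    const verdicts = results.generateScoreStepResult?.verdicts ?? [];
    if (verdicts.length === 0) return 0;
    // Fraction of claims judged accurate, between 0 and 1
    return verdicts.filter((v) => v.accurate).length / verdicts.length;
  },
})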
generateReason
Optional step that provides an explanation for the score.
Function Mode:
Function: ({ run, results, score }) => string
run.input: The input given to the evaluated run.
run.output: The output produced by the evaluated run.
run.runId: Unique identifier for this scorer run.
run.runtimeContext: Runtime context from the agent or workflow step being evaluated.
results.preprocessStepResult: The value returned by the preprocess step, if one was defined.
results.analyzeStepResult: The value returned by the analyze step, if one was defined.
score: The numerical score produced by generateScore.
Returns: string
The method must return a string explaining the score.
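For example, continuing the analyze example above (the wording is illustrative):

.generateReason(({ results, score }) => {
  const { responseLength } = results.analyzeStepResult;
  return `Scored ${score} because the response was ${responseLength} characters long.`;
})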
Prompt Object Mode:
description: A description of what this step does.
createPrompt: A function that builds the prompt sent to the judge.
judge: Optional step-level judge configuration that overrides the scorer-level judge.
All step functions can be async.
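For example, an async generateScore step can await external work. checkFactuality below is a hypothetical helper, not part of Mastra:

.generateScore(async ({ run }) => {
  // Hypothetical call to an external fact-checking service
  const verdict = await checkFactuality(run.output[0]?.content ?? "");
  return verdict.accurate ? 1 : 0;
});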