
mastra scorers

The mastra scorers command manages evaluation scorers, which measure the quality, accuracy, and performance of AI-generated outputs.

Usage

mastra scorers <command> [options]

Commands

mastra scorers add

Add a new scorer template to your project.

mastra scorers add [scorer-name] [options]

Options

--dir?: string
Path to your Mastra directory (default: auto-detect)

--help?: boolean
Display help for command

Examples

Add a specific scorer by name:

mastra scorers add answer-relevancy

Interactive scorer selection (when no name provided):

mastra scorers add

Add scorer to custom directory:

mastra scorers add toxicity-detection --dir ./custom/scorers

mastra scorers list

List all available scorer templates.

mastra scorers list

This command displays built-in scorer templates organized by category:

  • Accuracy and Reliability: answer-relevancy, bias-detection, faithfulness, hallucination, toxicity-detection
  • Output Quality: completeness, content-similarity, keyword-coverage, textual-difference, tone-consistency

Available Scorers

When running mastra scorers add without specifying a scorer name, you can select from these built-in templates:

Accuracy and Reliability

  • answer-relevancy: Evaluates how relevant an AI response is to the input question
  • bias-detection: Identifies potential biases in AI-generated content
  • faithfulness: Measures how faithful the response is to provided context
  • hallucination: Detects when AI generates information not grounded in the input
  • toxicity-detection: Identifies harmful or inappropriate content

Output Quality

  • completeness: Assesses whether the response fully addresses the input
  • content-similarity: Measures semantic similarity between expected and actual outputs
  • keyword-coverage: Evaluates coverage of expected keywords or topics
  • textual-difference: Measures textual differences between responses
  • tone-consistency: Evaluates consistency of tone and style

What It Does

  1. Dependency Management: Automatically installs @mastra/evals package if needed
  2. Template Selection: Provides interactive selection when no scorer specified
  3. File Generation: Creates scorer files from built-in templates
  4. Directory Structure: Places scorers in src/mastra/scorers/ or a custom directory (see the layout sketch below)
  5. Duplicate Detection: Prevents overwriting existing scorer files
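
As a rough sketch, adding the answer-relevancy scorer to a default project should produce a layout along these lines (exact contents depend on your project; the file name matches the import used in the integration example below):

src/mastra/
├── agents/
├── scorers/
│   └── answer-relevancy-scorer.ts
└── index.ts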

Integration

After adding scorers, integrate them with your agents or workflows:

With Agents

src/mastra/agents/evaluated-agent.ts
import { Agent } from "@mastra/core/agent";
import { openai } from "@ai-sdk/openai";
import { createAnswerRelevancyScorer } from "../scorers/answer-relevancy-scorer";

export const evaluatedAgent = new Agent({
  // ... other config
  scorers: {
    relevancy: {
      scorer: createAnswerRelevancyScorer({ model: openai("gpt-4o-mini") }),
      sampling: { type: "ratio", rate: 0.5 },
    },
  },
});
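
The sampling block controls how often scoring runs: with type: "ratio" and rate: 0.5, roughly half of the agent's responses are scored, which keeps evaluation cost down without losing signal.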

With Workflow Steps

src/mastra/workflows/content-generation.ts
import { createWorkflow, createStep } from "@mastra/core/workflows";
import { customStepScorer } from "../scorers/custom-step-scorer";

const contentStep = createStep({
  // ... other config
  scorers: {
    customStepScorer: {
      scorer: customStepScorer(),
      sampling: { type: "ratio", rate: 1 },
    },
  },
});
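
Here rate: 1 scores every execution of the step, which is useful while iterating on a workflow; in production you would typically lower the rate to trade coverage for cost.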

Testing Scorers

Use the Local Dev Playground to test your scorers:

mastra dev

Navigate to http://localhost:4111/ and open the scorers section to run individual scorers against test inputs and view detailed results.
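
Scorers can also be exercised programmatically, for example in a quick script or test. The sketch below is illustrative only: it assumes the generated template exports the createAnswerRelevancyScorer factory shown earlier and that the resulting scorer exposes an async run() method accepting an input/output pair; check the generated file and the @mastra/evals documentation for the exact shape.

src/mastra/scorers/answer-relevancy.test.ts
import { openai } from "@ai-sdk/openai";
import { createAnswerRelevancyScorer } from "./answer-relevancy-scorer";

// Sketch only: the run() signature is an assumption; consult the
// generated template for the exact input/output shape it expects.
const scorer = createAnswerRelevancyScorer({ model: openai("gpt-4o-mini") });

const result = await scorer.run({
  input: [{ role: "user", content: "What is the capital of France?" }],
  output: { text: "Paris is the capital of France." },
});

console.log(result.score); // most built-in scorers report a number between 0 and 1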
