Answer Similarity Scorer
The createAnswerSimilarityScorer() function creates a scorer that evaluates how similar an agent's output is to a ground truth answer. This scorer is specifically designed for CI/CD testing scenarios where you have expected answers and want to ensure consistency over time.
Parameters

- `model`: The language model used to evaluate similarity between the output and the ground truth.
- `options`: Optional `AnswerSimilarityOptions` object for tuning scoring behavior.

AnswerSimilarityOptions

- `requireGroundTruth`: Whether ground truth must be provided for scoring.
- `semanticThreshold`: Threshold for counting semantically similar units as matches.
- `exactMatchBonus`: Bonus applied when output units match ground-truth units exactly.
- `missingPenalty`: Penalty for ground-truth units missing from the output.
- `contradictionPenalty`: Penalty for output units that contradict the ground truth.
- `extraInfoPenalty`: Penalty for extra information in the output that is not in the ground truth.
- `scale`: Multiplier applied to the final score.
This function returns an instance of the MastraScorer class. The .run() method accepts the same input as other scorers (see the MastraScorer reference), but requires ground truth to be provided in the run object.
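As a sketch of that requirement, the run payload for this scorer must carry a `groundTruth` field alongside the usual fields. The type below is an illustrative assumption for demonstration, not Mastra's actual type definition:

```typescript
// Illustrative payload shape — an assumption for demonstration,
// not Mastra's actual type definition.
type AnswerSimilarityRunPayload = {
  input: string; // the prompt given to the agent
  output: { text: string }; // the agent's response
  groundTruth: string; // required by the answer similarity scorer
};

const payload: AnswerSimilarityRunPayload = {
  input: "What is 2+2?",
  output: { text: "The answer is 4." },
  groundTruth: "4",
};
```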
.run() Returns

- `runId`: The id of the scorer run.
- `score`: A similarity score between 0 and the configured `scale`.
- `reason`: A human-readable explanation of the score.
- `preprocessStepResult`: Result of the preprocess step that extracts semantic units.
- `analyzeStepResult`: Result of the analyze step that compares the extracted units.
- `preprocessPrompt`: The prompt sent to the model for the preprocess step.
- `analyzePrompt`: The prompt sent to the model for the analyze step.
- `generateReasonPrompt`: The prompt sent to the model to generate the reason.
Scoring Details
The scorer uses a multi-step process:
- Extract: Breaks down output and ground truth into semantic units
- Analyze: Compares units and identifies matches, contradictions, and gaps
- Score: Calculates weighted similarity with penalties for contradictions, missing information, and extra information
- Reason: Generates human-readable explanation
Score calculation: max(0, base_score - contradiction_penalty - missing_penalty - extra_info_penalty) × scale
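The calculation above can be sketched as a plain function. The inputs here are illustrative; in practice the base score and penalties are produced by the scorer's analyze step:

```typescript
// Sketch of the documented score formula:
// max(0, base_score - contradiction_penalty - missing_penalty - extra_info_penalty) × scale
function answerSimilarityScore(
  baseScore: number,
  contradictionPenalty: number,
  missingPenalty: number,
  extraInfoPenalty: number,
  scale: number,
): number {
  const raw = baseScore - contradictionPenalty - missingPenalty - extraInfoPenalty;
  return Math.max(0, raw) * scale; // clamp at 0 before scaling
}

// A perfect base score with a 0.5 contradiction penalty, scaled to 0–1:
answerSimilarityScore(1, 0.5, 0, 0, 1); // 0.5

// Penalties can never drive the score below zero:
answerSimilarityScore(0.25, 0.5, 0.25, 0.25, 1); // 0
```

Note that the clamp to zero happens before `scale` is applied, so a heavily penalized answer always scores exactly 0 regardless of the scale.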
Example
Evaluate agent responses for similarity to ground truth across different scenarios:
```typescript
import { runExperiment } from "@mastra/core/scores";
import { createAnswerSimilarityScorer } from "@mastra/evals/scorers/llm";
import { myAgent } from "./agent";

const scorer = createAnswerSimilarityScorer({ model: "openai/gpt-4o" });

const result = await runExperiment({
  data: [
    {
      input: "What is 2+2?",
      groundTruth: "4",
    },
    {
      input: "What is the capital of France?",
      groundTruth: "The capital of France is Paris",
    },
    {
      input: "What are the primary colors?",
      groundTruth: "The primary colors are red, blue, and yellow",
    },
  ],
  scorers: [scorer],
  target: myAgent,
  onItemComplete: ({ scorerResults }) => {
    console.log({
      score: scorerResults[scorer.name].score,
      reason: scorerResults[scorer.name].reason,
    });
  },
});

console.log(result.scores);
```
For more details on runExperiment, see the runExperiment reference.
To add this scorer to an agent, see the Scorers overview guide.