# Answer Similarity Scorer

The `createAnswerSimilarityScorer()` function creates a scorer that evaluates how similar an agent's output is to a ground truth answer. This scorer is designed for CI/CD testing scenarios where you have expected answers and want to ensure consistency over time.

## Parameters

**model:** (`LanguageModel`): The language model used to evaluate semantic similarity between outputs and ground truth.

**options:** (`AnswerSimilarityOptions`): Configuration options for the scorer.

### AnswerSimilarityOptions

**requireGroundTruth:** (`boolean`): Whether to require ground truth for evaluation. If `false`, missing ground truth returns a score of 0. (Default: `true`)

**semanticThreshold:** (`number`): Weight for semantic matches versus exact matches (0-1). (Default: `0.8`)

**exactMatchBonus:** (`number`): Additional score bonus for exact matches (0-1). (Default: `0.2`)

**missingPenalty:** (`number`): Penalty per missing key concept from the ground truth. (Default: `0.15`)

**contradictionPenalty:** (`number`): Penalty for contradictory information. A high value ensures wrong answers score near 0. (Default: `1.0`)

**extraInfoPenalty:** (`number`): Mild penalty for extra information not present in the ground truth (capped at 0.2). (Default: `0.05`)

**scale:** (`number`): Score scaling factor. (Default: `1`)

This function returns an instance of the MastraScorer class. The `.run()` method accepts the same input as other scorers (see the [MastraScorer reference](https://mastra.ai/reference/evals/mastra-scorer)), but **requires ground truth** to be provided in the run object.

## .run() Returns

**runId:** (`string`): The id of the run (optional).

**score:** (`number`): Similarity score between 0 and 1 (or 0 to `scale` if a custom scale is used). Higher scores indicate closer similarity to the ground truth.

**reason:** (`string`): Human-readable explanation of the score with actionable feedback.

**preprocessStepResult:** (`object`): Extracted semantic units from the output and ground truth.

**analyzeStepResult:** (`object`): Detailed analysis of matches, contradictions, and extra information.

**preprocessPrompt:** (`string`): The prompt used for semantic unit extraction.

**analyzePrompt:** (`string`): The prompt used for similarity analysis.

**generateReasonPrompt:** (`string`): The prompt used for generating the explanation.

## Scoring Details

The scorer uses a multi-step process:

1. **Extract**: Breaks down the output and ground truth into semantic units
2. **Analyze**: Compares units and identifies matches, contradictions, and gaps
3. **Score**: Calculates weighted similarity with penalties for contradictions
4. **Reason**: Generates a human-readable explanation

Score calculation:

`max(0, base_score - contradiction_penalty - missing_penalty - extra_info_penalty) × scale`
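To illustrate how the penalties combine, the sketch below mirrors the formula using the documented default penalties. It is not the scorer's internal implementation, and the sample numbers are hypothetical.

```typescript
// Illustrative only: mirrors the documented formula, not the scorer's internals.
function similarityScore(
  baseScore: number, // weighted semantic/exact match score (0-1)
  contradictionPenalty: number,
  missingPenalty: number,
  extraInfoPenalty: number,
  scale = 1,
): number {
  return Math.max(0, baseScore - contradictionPenalty - missingPenalty - extraInfoPenalty) * scale;
}

// Strong semantic match (0.9), one missing concept (0.15 by default),
// no contradictions, a little extra info (0.05): max(0, 0.9 - 0 - 0.15 - 0.05) = 0.7
similarityScore(0.9, 0, 0.15, 0.05); // 0.7

// A single contradiction (penalty 1.0 by default) drives the score to 0,
// no matter how much else matched.
similarityScore(0.8, 1.0, 0, 0); // 0
```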
## Example

Evaluate agent responses for similarity to ground truth across different scenarios:

```typescript
import { runEvals } from "@mastra/core/evals";
import { createAnswerSimilarityScorer } from "@mastra/evals/scorers/prebuilt";

import { myAgent } from "./agent";

const scorer = createAnswerSimilarityScorer({ model: "openai/gpt-4o" });

const result = await runEvals({
  data: [
    {
      input: "What is 2+2?",
      groundTruth: "4",
    },
    {
      input: "What is the capital of France?",
      groundTruth: "The capital of France is Paris",
    },
    {
      input: "What are the primary colors?",
      groundTruth: "The primary colors are red, blue, and yellow",
    },
  ],
  scorers: [scorer],
  target: myAgent,
  onItemComplete: ({ scorerResults }) => {
    console.log({
      score: scorerResults[scorer.id].score,
      reason: scorerResults[scorer.id].reason,
    });
  },
});

console.log(result.scores);
```

For more details on `runEvals`, see the [runEvals reference](https://mastra.ai/reference/evals/run-evals). To add this scorer to an agent, see the [Scorers overview](https://mastra.ai/docs/evals/overview) guide.
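If the defaults are too lenient or too strict for your test suite, the values from `AnswerSimilarityOptions` can be adjusted when creating the scorer. The sketch below assumes they are passed under an `options` key alongside `model`, following the parameter shape described above; the exact call shape may differ by Mastra version.

```typescript
import { createAnswerSimilarityScorer } from "@mastra/evals/scorers/prebuilt";

// A sketch of a stricter configuration: heavier penalty for missing concepts
// and less tolerance for extra information. Option names come from
// AnswerSimilarityOptions documented above.
const strictScorer = createAnswerSimilarityScorer({
  model: "openai/gpt-4o",
  options: {
    missingPenalty: 0.3, // default 0.15
    extraInfoPenalty: 0.2, // default 0.05 (penalty capped at 0.2)
    exactMatchBonus: 0.1, // default 0.2
  },
});
```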