Keyword Coverage Scorer

The createKeywordCoverageScorer() function evaluates how well an LLM's output covers the important keywords from the input. It analyzes keyword presence and matches while ignoring common words and stop words.

Parameters

The createKeywordCoverageScorer() function does not take any options.

This function returns an instance of the MastraScorer class. See the MastraScorer reference for details on the .run() method and its input/output.

.run() Returns

runId (string): The id of the run (optional).
preprocessStepResult (object): Object with extracted keywords: { referenceKeywords: Set<string>, responseKeywords: Set<string> }
analyzeStepResult (object): Object with keyword coverage: { totalKeywords: number, matchedKeywords: number }
score (number): Coverage score (0-1) representing the proportion of matched keywords.

.run() returns a result in the following shape:

{
  runId: string,
  preprocessStepResult: {
    referenceKeywords: Set<string>,
    responseKeywords: Set<string>
  },
  analyzeStepResult: {
    totalKeywords: number,
    matchedKeywords: number
  },
  score: number
}
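
As a rough illustration of working with this shape, the sketch below declares a local type for it and lists the reference keywords that the response did not cover. The KeywordCoverageResult type, the missingKeywords helper, and the sample data are all hypothetical and written by hand for this example; they are not exported by any Mastra package. The sketch uses the preprocessStepResult key listed in the returns section above.

```typescript
// Hypothetical local type mirroring the documented result shape.
// It is NOT imported from a Mastra package.
interface KeywordCoverageResult {
  runId: string;
  preprocessStepResult: {
    referenceKeywords: Set<string>;
    responseKeywords: Set<string>;
  };
  analyzeStepResult: {
    totalKeywords: number;
    matchedKeywords: number;
  };
  score: number;
}

// Hypothetical helper: reference keywords that do not appear in the response.
function missingKeywords(result: KeywordCoverageResult): string[] {
  const { referenceKeywords, responseKeywords } = result.preprocessStepResult;
  return Array.from(referenceKeywords)
    .filter((keyword) => !responseKeywords.has(keyword))
    .sort();
}

// Hand-written sample data, not real scorer output.
const sample: KeywordCoverageResult = {
  runId: "example-run",
  preprocessStepResult: {
    referenceKeywords: new Set(["javascript", "react", "vue"]),
    responseKeywords: new Set(["react", "javascript", "popular"]),
  },
  analyzeStepResult: { totalKeywords: 3, matchedKeywords: 2 },
  score: 2 / 3,
};

console.log(missingKeywords(sample)); // ["vue"]
```

A helper like this can make a low score easier to debug, since it shows which keywords were expected but absent.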

Scoring Details

The scorer evaluates keyword coverage by matching keywords with the following features:

Common word and stop word filtering (e.g., "the", "a", "and")
Case-insensitive matching
Word form variation handling
Special handling of technical terms and compound words
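
To make the first, second, and fourth features concrete, here is a minimal sketch of keyword extraction. It is not the scorer's actual implementation: the extractKeywords function and the short STOP_WORDS list are assumptions made for illustration, and word form variation handling is deliberately left out.

```typescript
// Small illustrative stop-word list (an assumption); the scorer's real list
// is not shown in this document.
const STOP_WORDS = new Set(["the", "a", "an", "and", "or", "of", "like", "to", "in", "is"]);

// Hypothetical helper: lowercases the text, splits it into tokens,
// and drops stop words. Dots inside a token are kept so that terms
// such as "react.js" stay in one piece.
function extractKeywords(text: string): Set<string> {
  const tokens = text
    .toLowerCase()
    .split(/[^a-z0-9.]+/) // split on anything except letters, digits, and dots
    .map((token) => token.replace(/^\.+|\.+$/g, "")) // trim leading/trailing dots
    .filter((token) => token.length > 0 && !STOP_WORDS.has(token));
  return new Set(tokens);
}

const keywords = extractKeywords("The JavaScript frameworks like React.js and Vue.");
console.log(Array.from(keywords)); // ["javascript", "frameworks", "react.js", "vue"]
```

Because everything is lowercased before comparison, "JavaScript" and "javascript" end up as the same keyword.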

Scoring Process

Processes keywords from input and output:
- Filters out common words and stop words
- Normalizes case and word forms
- Handles special terms and compounds
Calculates keyword coverage:
- Matches keywords between texts
- Counts successful matches
- Computes coverage ratio

Final score: (matched_keywords / total_keywords) * scale
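
The coverage calculation can be sketched as follows, assuming the keywords have already been extracted into sets. The computeCoverage function and the CoverageAnalysis type are hypothetical names, and the scale parameter (defaulting to 1 so the score stays in the 0-1 range) is an assumption based on the formula above, not a documented option of the scorer.

```typescript
// Hypothetical result type for this sketch.
interface CoverageAnalysis {
  totalKeywords: number;
  matchedKeywords: number;
  score: number;
}

// Hypothetical helper implementing (matched_keywords / total_keywords) * scale.
function computeCoverage(
  reference: Set<string>,
  response: Set<string>,
  scale = 1, // assumption: 1 keeps the score between 0 and 1
): CoverageAnalysis {
  // Count reference keywords that also occur in the response.
  let matchedKeywords = 0;
  reference.forEach((keyword) => {
    if (response.has(keyword)) matchedKeywords += 1;
  });

  const totalKeywords = reference.size;
  // With no reference keywords the ratio is undefined. This sketch follows
  // the documented special cases: 1 if both sets are empty, 0 otherwise.
  const ratio =
    totalKeywords === 0
      ? (response.size === 0 ? 1 : 0)
      : matchedKeywords / totalKeywords;

  return { totalKeywords, matchedKeywords, score: ratio * scale };
}

const analysis = computeCoverage(
  new Set(["javascript", "frameworks", "react", "vue"]),
  new Set(["react", "vue", "popular", "libraries"]),
);
console.log(analysis); // { totalKeywords: 4, matchedKeywords: 2, score: 0.5 }
```

Note that extra keywords in the response ("popular", "libraries") do not lower the score; only the reference keywords are counted.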

Score interpretation

A coverage score between 0 and 1:

1.0: Complete coverage – all keywords present.
0.7–0.9: High coverage – most keywords included.
0.4–0.6: Partial coverage – some keywords present.
0.1–0.3: Low coverage – few keywords matched.
0.0: No coverage – no keywords found.
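
If you want to turn a score into one of these labels in code, a small helper along these lines could work. The interpretCoverage function and its label names are hypothetical. Because the bands above leave gaps (for example between 0.6 and 0.7), this sketch assumes that a score falling between two bands belongs to the lower one.

```typescript
type CoverageLabel = "complete" | "high" | "partial" | "low" | "none";

// Hypothetical helper mapping a 0-1 coverage score to a label.
// Assumption: scores between two documented bands fall into the lower band,
// e.g. 0.65 is "partial" and 0.95 is "high".
function interpretCoverage(score: number): CoverageLabel {
  if (Number.isNaN(score) || score < 0 || score > 1) {
    throw new RangeError(`score must be between 0 and 1, got ${score}`);
  }
  if (score === 1) return "complete";
  if (score >= 0.7) return "high";
  if (score >= 0.4) return "partial";
  if (score > 0) return "low";
  return "none";
}

console.log(interpretCoverage(0.75)); // "high"
console.log(interpretCoverage(0.5)); // "partial"
```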

Special Cases

The scorer handles several special cases:

Empty input/output: Returns a score of 1.0 if both are empty, and 0.0 if only one is empty
Single word: Treated as a single keyword
Technical terms: Preserves compound technical terms (e.g., "React.js", "machine learning")
Case differences: "JavaScript" matches "javascript"
Common words: Ignored in scoring to focus on meaningful keywords

Example

Evaluate keyword coverage between input queries and agent responses:

src/example-keyword-coverage.ts
import { runEvals } from "@mastra/core/evals";
import { createKeywordCoverageScorer } from "@mastra/evals/scorers/prebuilt";
import { myAgent } from "./agent";

// The scorer takes no options.
const scorer = createKeywordCoverageScorer();

// Send each input to the agent and score how well its response
// covers the keywords of that input.
const result = await runEvals({
  data: [
    {
      input: "JavaScript frameworks like React and Vue",
    },
    {
      input: "TypeScript offers interfaces, generics, and type inference",
    },
    {
      input:
        "Machine learning models require data preprocessing, feature engineering, and hyperparameter tuning",
    },
  ],
  scorers: [scorer],
  target: myAgent,
  // Log the keyword coverage score as each item finishes.
  onItemComplete: ({ scorerResults }) => {
    console.log({
      score: scorerResults[scorer.id].score,
    });
  },
});

// Scores reported for the whole run.
console.log(result.scores);

For more details on runEvals, see the runEvals reference.

To add this scorer to an agent, see the Scorers overview guide.
