Keyword Coverage Scorer
Use createKeywordCoverageScorer
to evaluate how accurately a response includes required keywords or phrases from the context. The scorer accepts a query
and a response
, and returns a score and an info
object containing keyword match statistics.
Installation
npm install @mastra/evals
For complete API documentation and configuration options, see
createKeywordCoverageScorer
.
Full coverage example
In this example, the response fully reflects the key terms from the input. All required keywords are present, resulting in complete coverage with no omissions.
import { createKeywordCoverageScorer } from "@mastra/evals/scorers/code";
const scorer = createKeywordCoverageScorer();
const input = 'JavaScript frameworks like React and Vue';
const output = 'Popular JavaScript frameworks include React and Vue for web development';
const result = await scorer.run({
input: [{ role: 'user', content: input }],
output: { role: 'assistant', text: output },
});
console.log('Score:', result.score);
console.log('AnalyzeStepResult:', result.analyzeStepResult);
Full coverage output
A score of 1 indicates that all expected keywords were found in the response. The analyzeStepResult
field confirms that the number of matched keywords equals the total number extracted from the input.
{
score: 1,
analyzeStepResult: {
totalKeywords: 4,
matchedKeywords: 4
}
}
Partial coverage example
In this example, the response includes some, but not all, of the important keywords from the input. The score reflects partial coverage, with key terms either missing or only partially matched.
import { createKeywordCoverageScorer } from "@mastra/evals/scorers/code";
const scorer = createKeywordCoverageScorer();
const input = 'TypeScript offers interfaces, generics, and type inference';
const output = 'TypeScript provides type inference and some advanced features';
const result = await scorer.run({
input: [{ role: 'user', content: input }],
output: { role: 'assistant', text: output },
});
console.log('Score:', result.score);
console.log('AnalyzeStepResult:', result.analyzeStepResult);
Partial coverage output
A score of 0.5 indicates that only half of the expected keywords were found in the response. The analyzeStepResult
field shows how many terms were matched compared to the total identified in the input.
{
score: 0.5,
analyzeStepResult: {
totalKeywords: 6,
matchedKeywords: 3
}
}
Minimal coverage example
In this example, the response includes very few of the important keywords from the input. The score reflects minimal coverage, with most key terms missing or unaccounted for.
import { createKeywordCoverageScorer } from "@mastra/evals/scorers/code";
const scorer = createKeywordCoverageScorer();
const input = 'Machine learning models require data preprocessing, feature engineering, and hyperparameter tuning';
const output = 'Data preparation is important for models';
const result = await scorer.run({
input: [{ role: 'user', content: input }],
output: { role: 'assistant', text: output },
});
console.log('Score:', result.score);
console.log('AnalyzeStepResult:', result.analyzeStepResult);
Minimal coverage output
A low score indicates that only a small number of the expected keywords were present in the response. The analyzeStepResult
field highlights the gap between total and matched keywords, signaling insufficient coverage.
{
score: 0.2,
analyzeStepResult: {
totalKeywords: 10,
matchedKeywords: 2
}
}
Metric configuration
You can create a KeywordCoverageMetric
instance with default settings. No additional configuration is required.
const metric = new KeywordCoverageMetric();
See KeywordCoverageScorer for a full list of configuration options.
Understanding the results
.run()
returns a result in the following shape:
{
runId: string,
extractStepResult: {
referenceKeywords: Set<string>,
responseKeywords: Set<string>
},
analyzeStepResult: {
totalKeywords: number,
matchedKeywords: number
},
score: number
}
score
A coverage score between 0 and 1:
- 1.0: Complete coverage – all keywords present.
- 0.7–0.9: High coverage – most keywords included.
- 0.4–0.6: Partial coverage – some keywords present.
- 0.1–0.3: Low coverage – few keywords matched.
- 0.0: No coverage – no keywords found.
runId
The unique identifier for this scorer run.
extractStepResult
Object with extracted keywords:
- referenceKeywords: Set of keywords extracted from the input.
- responseKeywords: Set of keywords extracted from the output.
analyzeStepResult
Object with keyword coverage statistics:
- totalKeywords: The number of keywords expected (from the input).
- matchedKeywords: The number of keywords found in the response.