KeywordCoverageMetric
The KeywordCoverageMetric class evaluates how well an LLM’s output covers the important keywords from the input. It analyzes keyword presence and matches while ignoring common words and stop words.
Basic Usage
import { KeywordCoverageMetric } from "@mastra/evals/nlp";
const metric = new KeywordCoverageMetric();
const result = await metric.measure(
"What are the key features of Python programming language?",
"Python is a high-level programming language known for its simple syntax and extensive libraries."
);
console.log(result.score); // Coverage score from 0-1
console.log(result.info); // Object containing detailed metrics about keyword coverage
measure() Parameters
- input (string): The original text containing keywords to be matched
- output (string): The text to evaluate for keyword coverage
Returns
- score (number): Coverage score (0-1) representing the proportion of matched keywords
- info (object): Object containing detailed metrics about keyword coverage
  - matchedKeywords (number): Number of keywords found in the output
  - totalKeywords (number): Total number of keywords from the input
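As a reading aid, the return value described above can be written down as a TypeScript type. This is a minimal sketch transcribed from the field list in this section; the interface name KeywordCoverageResult and the helper summarizeCoverage are hypothetical and are not exports of @mastra/evals.

```typescript
// Assumed shape, transcribed from the Returns list; not imported from @mastra/evals.
interface KeywordCoverageResult {
  score: number;
  info: {
    matchedKeywords: number;
    totalKeywords: number;
  };
}

// Hypothetical helper: turn a result into a one-line summary for logs.
function summarizeCoverage(result: KeywordCoverageResult): string {
  const { matchedKeywords, totalKeywords } = result.info;
  const percent = Math.round(result.score * 100);
  return `${matchedKeywords}/${totalKeywords} keywords covered (${percent}%)`;
}

console.log(
  summarizeCoverage({ score: 0.67, info: { matchedKeywords: 4, totalKeywords: 6 } })
);
// prints "4/6 keywords covered (67%)"
```

A helper like this keeps the raw counts next to the score, which makes a low score easier to diagnose than the ratio alone.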
Scoring Details
The metric evaluates keyword coverage by matching keywords with the following features:
- Common word and stop word filtering (e.g., “the”, “a”, “and”)
- Case-insensitive matching
- Word form variation handling
- Special handling of technical terms and compound words
Scoring Process
- Processes keywords from input and output:
  - Filters out common words and stop words
  - Normalizes case and word forms
  - Handles special terms and compounds
- Calculates keyword coverage:
  - Matches keywords between texts
  - Counts successful matches
  - Computes coverage ratio
Final score: (matched_keywords / total_keywords) * scale
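The process above can be approximated in a few lines to make the ratio concrete. The following is a simplified sketch, not the library's implementation: the stop word list, the tokenizer, and the function names extractKeywords and keywordCoverage are all assumptions, and word form handling and compound terms are deliberately left out.

```typescript
// Illustrative stop word list; the library's actual list is not shown in this document.
const STOP_WORDS = new Set(["the", "a", "an", "and", "or", "of", "is", "are", "has", "for"]);

// Lowercase the text, split on anything that is not a letter, digit, or dot,
// trim stray dots (sentence punctuation), and drop empty tokens and stop words.
function extractKeywords(text: string): Set<string> {
  const tokens = text
    .toLowerCase()
    .split(/[^a-z0-9.]+/)
    .map((token) => token.replace(/^\.+|\.+$/g, ""))
    .filter((token) => token.length > 0 && !STOP_WORDS.has(token));
  return new Set(tokens);
}

// score = (matched_keywords / total_keywords) * scale
function keywordCoverage(input: string, output: string, scale = 1) {
  const reference = extractKeywords(input);
  const response = extractKeywords(output);
  const matched = Array.from(reference).filter((keyword) => response.has(keyword));
  const total = reference.size;
  // Avoid dividing by zero: full score only if the output has no keywords either.
  const score =
    total === 0 ? (response.size === 0 ? scale : 0) : (matched.length / total) * scale;
  return { score, info: { matchedKeywords: matched.length, totalKeywords: total } };
}

const sketch = keywordCoverage("Python syntax and libraries", "Python has simple syntax");
console.log(sketch.info); // { matchedKeywords: 2, totalKeywords: 3 }
```

Because this sketch compares exact lowercase tokens, "jumps" and "jumped" would not match here, whereas the metric described in this document handles word form variations.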
Score interpretation (0 to scale, default 0-1)
- 1.0: Perfect keyword coverage
- 0.7-0.9: Good coverage with most keywords present
- 0.4-0.6: Moderate coverage with some keywords missing
- 0.1-0.3: Poor coverage with many keywords missing
- 0.0: No keyword matches
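For reporting, the bands above can be mapped to labels. This is a hypothetical helper, not part of @mastra/evals. The listed bands leave gaps (for example between 0.6 and 0.7), so this sketch assumes that a score falling in a gap belongs to the lower band; adjust the thresholds if your use case reads the bands differently.

```typescript
type CoverageBand = "perfect" | "good" | "moderate" | "poor" | "none";

// Hypothetical helper: normalize by scale, then compare against the band floors.
// Assumption: scores in the gaps between listed bands fall into the lower band.
function interpretCoverage(score: number, scale = 1): CoverageBand {
  const normalized = score / scale;
  if (normalized >= 1) return "perfect";
  if (normalized >= 0.7) return "good";
  if (normalized >= 0.4) return "moderate";
  if (normalized > 0) return "poor";
  return "none";
}

console.log(interpretCoverage(0.67)); // prints "moderate"
```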
Examples with Analysis
import { KeywordCoverageMetric } from "@mastra/evals/nlp";
const metric = new KeywordCoverageMetric();
// Perfect coverage example
const result1 = await metric.measure(
"The quick brown fox jumps over the lazy dog",
"A quick brown fox jumped over a lazy dog"
);
// {
// score: 1.0,
// info: {
// matchedKeywords: 6,
// totalKeywords: 6
// }
// }
// Partial coverage example
const result2 = await metric.measure(
"Python features include easy syntax, dynamic typing, and extensive libraries",
"Python has simple syntax and many libraries"
);
// {
// score: 0.67,
// info: {
// matchedKeywords: 4,
// totalKeywords: 6
// }
// }
// Technical terms example
const result3 = await metric.measure(
"Discuss React.js component lifecycle and state management",
"React components have lifecycle methods and manage state"
);
// {
// score: 1.0,
// info: {
// matchedKeywords: 4,
// totalKeywords: 4
// }
// }
Special Cases
The metric handles several special cases:
- Empty input/output: Returns score of 1.0 if both empty, 0.0 if only one is empty
- Single word: Treated as a single keyword
- Technical terms: Preserves compound technical terms (e.g., “React.js”, “machine learning”)
- Case differences: “JavaScript” matches “javascript”
- Common words: Ignored in scoring to focus on meaningful keywords
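The empty-text rule from the list above is easy to get wrong in calling code, so here it is as a small sketch. The function emptyTextScore is hypothetical and only mirrors the documented behavior; treating whitespace-only strings as empty is an assumption of this sketch, not something the document states.

```typescript
// Hypothetical guard mirroring the empty input/output rule.
// Returns null when neither text is empty, meaning normal scoring applies.
function emptyTextScore(input: string, output: string): number | null {
  // Assumption: whitespace-only text counts as empty.
  const inputEmpty = input.trim().length === 0;
  const outputEmpty = output.trim().length === 0;
  if (inputEmpty && outputEmpty) return 1.0; // both empty
  if (inputEmpty || outputEmpty) return 0.0; // only one is empty
  return null;
}

console.log(emptyTextScore("", ""));           // prints 1
console.log(emptyTextScore("React.js", ""));   // prints 0
console.log(emptyTextScore("React.js", "ok")); // prints null
```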