Skip to Content
ExamplesEvalsKeyword Coverage

Keyword Coverage Evaluation

New Scorer API

We just released a new evals API called Scorers, with a more ergonomic API and more metadata stored for error analysis, and more flexibility to evaluate data structures. It’s fairly simple to migrate, but we will continue to support the existing Evals API.

Use KeywordCoverageMetric to evaluate how accurately a response includes required keywords or phrases from the context. The metric accepts a query and a response, and returns a score and an info object containing keyword match statistics.

Installation

npm install @mastra/evals

Full coverage example

In this example, the response fully reflects the key terms from the input. All required keywords are present, resulting in complete coverage with no omissions.

src/example-full-keyword-coverage.ts
import { KeywordCoverageMetric } from "@mastra/evals/nlp"; const metric = new KeywordCoverageMetric(); const query = "JavaScript frameworks like React and Vue."; const response = "Popular JavaScript frameworks include React and Vue for web development"; const result = await metric.measure(query, response); console.log(result);

Full coverage output

A score of 1 indicates that all expected keywords were found in the response. The info field confirms that the number of matched keywords equals the total number extracted from the input.

{ score: 1, info: { totalKeywords: 4, matchedKeywords: 4 } }

Partial coverage example

In this example, the response includes some, but not all, of the important keywords from the input. The score reflects partial coverage, with key terms either missing or only partially matched.

src/example-partial-keyword-coverage.ts
import { KeywordCoverageMetric } from "@mastra/evals/nlp"; const metric = new KeywordCoverageMetric(); const query = "TypeScript offers interfaces, generics, and type inference."; const response = "TypeScript provides type inference and some advanced features"; const result = await metric.measure(query, response); console.log(result);

Partial coverage output

A score of 0.5 indicates that only half of the expected keywords were found in the response. The info field shows how many terms were matched compared to the total identified in the input.

{ score: 0.5, info: { totalKeywords: 6, matchedKeywords: 3 } }

Minimal coverage example

In this example, the response includes very few of the important keywords from the input. The score reflects minimal coverage, with most key terms missing or unaccounted for.

src/example-minimal-keyword-coverage.ts
import { KeywordCoverageMetric } from "@mastra/evals/nlp"; const metric = new KeywordCoverageMetric(); const query = "Machine learning models require data preprocessing, feature engineering, and hyperparameter tuning"; const response = "Data preparation is important for models"; const result = await metric.measure(query, response); console.log(result);

Minimal coverage output

A low score indicates that only a small number of the expected keywords were present in the response. The info field highlights the gap between total and matched keywords, signaling insufficient coverage.

{ score: 0.2, info: { totalKeywords: 10, matchedKeywords: 2 } }

Metric configuration

You can create a KeywordCoverageMetric instance with default settings. No additional configuration is required.

const metric = new KeywordCoverageMetric();

See KeywordCoverageMetric for a full list of configuration options.

Understanding the results

KeywordCoverageMetric returns a result in the following shape:

{ score: number, info: { totalKeywords: number, matchedKeywords: number } }

Keyword coverage score

A coverage score between 0 and 1:

  • 1.0: Complete coverage – all keywords present.
  • 0.7–0.9: High coverage – most keywords included.
  • 0.4–0.6: Partial coverage – some keywords present.
  • 0.1–0.3: Low coverage – few keywords matched.
  • 0.0: No coverage – no keywords found.

Keyword coverage info

Detailed statistics including:

  • Total keywords from input.
  • Number of matched keywords.
  • Coverage ratio calculation.
  • Technical term handling.
View Example on GitHub