Completeness Evaluation

note

The Scorers API is currently in beta. We're actively working on improvements and appreciate your feedback. For questions or feature requests, please reach out on Discord.

Use CompletenessMetric to evaluate whether the response includes all key elements from the input. The metric accepts a query and a response, and returns a score and an info object with detailed element level comparisons.

InstallationDirect link to Installation

npm install @mastra/evals

Complete coverage exampleDirect link to Complete coverage example

In this example, the response contains every element from the input. The content matches exactly, resulting in full coverage.

src/example-complete-coverage.ts
import { CompletenessMetric } from "@mastra/evals/nlp";

const metric = new CompletenessMetric();

const query = "The primary colors are red, blue, and yellow.";
const response = "The primary colors are red, blue, and yellow.";

const result = await metric.measure(query, response);

console.log(result);

Complete coverage outputDirect link to Complete coverage output

The output receives a score of 1 because all input elements are present in the response with no missing content.

{
  score: 1,
  info: {
    inputElements: [
      'the',    'primary',
      'colors', 'be',
      'red',    'blue',
      'and',    'yellow'
    ],
    outputElements: [
      'the',    'primary',
      'colors', 'be',
      'red',    'blue',
      'and',    'yellow'
    ],
    missingElements: [],
    elementCounts: { input: 8, output: 8 }
  }
}

Partial coverage exampleDirect link to Partial coverage example

In this example, the response includes all of the input elements, but also adds extra content that wasn’t in the original query.

src/example-partial-coverage.ts
import { CompletenessMetric } from "@mastra/evals/nlp";

const metric = new CompletenessMetric();

const query = "The primary colors are red and blue.";
const response = "The primary colors are red, blue, and yellow.";

const result = await metric.measure(query, response);

console.log(result);

Partial coverage outputDirect link to Partial coverage output

The output receives a high score because no input elements are missing. However, the response includes additional content that goes beyond the input.

{
  score: 1,
  info: {
    inputElements: [
      'the',    'primary',
      'colors', 'be',
      'red',    'and',
      'blue'
    ],
    outputElements: [
      'the',    'primary',
      'colors', 'be',
      'red',    'blue',
      'and',    'yellow'
    ],
    missingElements: [],
    elementCounts: { input: 7, output: 8 }
  }
}

Minimal coverage exampleDirect link to Minimal coverage example

In this example, the response contains only some of the elements from the input. Key terms are missing or altered, resulting in reduced coverage.

src/example-minimal-coverage.ts
import { CompletenessMetric } from "@mastra/evals/nlp";

const metric = new CompletenessMetric();

const query = "The seasons include summer.";
const response = "The four seasons are spring, summer, fall, and winter.";

const result = await metric.measure(query, response);

console.log(result);

Minimal coverage outputDirect link to Minimal coverage output

The output receives a lower score because one or more elements from the input are missing. The response overlaps in part, but does not fully reflect the original content.

{
  score: 0.75,
  info: {
    inputElements: [ 'the', 'seasons', 'summer', 'include' ],
    outputElements: [
      'the',     'four',
      'seasons', 'spring',
      'summer',  'winter',
      'be',      'fall',
      'and'
    ],
    missingElements: [ 'include' ],
    elementCounts: { input: 4, output: 9 }
  }
}

Metric configurationDirect link to Metric configuration

You can create a CompletenessMetric instance with default settings. No additional configuration is required.

const metric = new CompletenessMetric();

See CompletenessMetric for a full list of configuration options.

Understanding the resultsDirect link to Understanding the results

CompletenessMetric returns a result in the following shape:

{
  score: number,
  info: {
    inputElements: string[],
    outputElements: string[],
    missingElements: string[],
    elementCounts: {
      input: number,
      output: number
    }
  }
}

Completeness scoreDirect link to Completeness score

A completeness score between 0 and 1:

1.0: All input elements are present in the response.
0.7–0.9: Most key elements are included, with minimal omissions.
0.4–0.6: Some input elements are covered, but important ones are missing.
0.1–0.3: Few input elements are matched; most are missing.
0.0: No input elements are present in the response.

Completeness infoDirect link to Completeness info

An explanation for the score, with details including:

Input elements extracted from the query.
Output elements matched in the response.
Any input elements missing from the response.
Comparison of element counts between input and output.

View source on GitHub