Completeness Scorer
The createCompletenessScorer() function evaluates how thoroughly an LLM's output covers the key elements present in the input. It analyzes nouns, verbs, topics, and terms to determine coverage and provides a detailed completeness score.
Parameters
The createCompletenessScorer() function does not take any options.
This function returns an instance of the MastraScorer class. See the MastraScorer reference for details on the .run() method and its input/output.
.run() Returns
- runId: The id of the scorer run.
- extractStepResult: The elements extracted from the input and output (described below).
- score: A number between 0 and 1 indicating how completely the output covers the input's key elements.
The .run() method returns a result in the following shape:
```typescript
{
  runId: string,
  extractStepResult: {
    inputElements: string[],
    outputElements: string[],
    missingElements: string[],
    elementCounts: { input: number, output: number }
  },
  score: number
}
```
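As a quick illustration, here is a minimal sketch of calling .run() directly and reading the score. The exact payload shape is documented in the MastraScorer reference; the input/output fields shown here are assumptions for illustration only:

```typescript
import { createCompletenessScorer } from "@mastra/evals/scorers/prebuilt";

const scorer = createCompletenessScorer();

// Sketch only: see the MastraScorer reference for the exact .run() payload.
// The input/output shape below is an assumption for illustration.
const result = await scorer.run({
  input: [{ role: "user", content: "Explain the inputs and outputs of photosynthesis." }],
  output: { role: "assistant", content: "Photosynthesis turns light, water, and CO2 into glucose and oxygen." },
});

console.log(result.score); // a number between 0 and 1
```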
Element Extraction Details
The scorer extracts and analyzes several types of elements:
- Nouns: Key objects, concepts, and entities
- Verbs: Actions and states (converted to infinitive form)
- Topics: Main subjects and themes
- Terms: Individual significant words
The extraction process includes:
- Normalization of text (removing diacritics, converting to lowercase)
- Splitting camelCase words
- Handling of word boundaries
- Special handling of short words (3 characters or fewer)
- Deduplication of elements
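To make these steps concrete, here is a minimal sketch of what the normalization pipeline could look like. This is an illustrative approximation of the documented steps, not the library's actual implementation:

```typescript
// Illustrative approximation of the documented extraction steps.
// This is NOT the library's actual implementation.
function extractElements(text: string): string[] {
  const tokens = text
    // Split camelCase words: "renewableEnergy" -> "renewable Energy"
    .replace(/([a-z])([A-Z])/g, "$1 $2")
    // Remove diacritics: "é" -> "e"
    .normalize("NFD")
    .replace(/[\u0300-\u036f]/g, "")
    .toLowerCase()
    // Split on word boundaries
    .split(/\W+/)
    .filter(Boolean);
  // Deduplicate while preserving order
  return [...new Set(tokens)];
}

console.log(extractElements("Compare renewableEnergy and résumé costs"));
// -> ["compare", "renewable", "energy", "and", "resume", "costs"]
```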
extractStepResult
From the .run() method, you can get the extractStepResult object with the following properties:
- inputElements: Key elements found in the input (e.g., nouns, verbs, topics, terms).
- outputElements: Key elements found in the output.
- missingElements: Input elements not found in the output.
- elementCounts: The number of elements in the input and output.
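Continuing the hedged .run() sketch above, missingElements makes it easy to see exactly which input elements the output failed to cover:

```typescript
const { missingElements, elementCounts } = result.extractStepResult;

const coveredCount = elementCounts.input - missingElements.length;
console.log(`Covered ${coveredCount} of ${elementCounts.input} input elements`);
if (missingElements.length > 0) {
  console.log("Missing from output:", missingElements);
}
```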
Scoring Details
The scorer evaluates completeness through linguistic element coverage analysis.
Scoring Process
- Extracts key elements:
  - Nouns and named entities
  - Action verbs
  - Topic-specific terms
  - Normalized word forms
- Calculates coverage of input elements:
  - Exact matches for short terms (3 characters or fewer)
  - Substantial overlap (>60%) for longer terms
- Computes the final score as (covered_elements / total_input_elements) * scale (sketched below)
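A minimal sketch of these matching rules, assuming a simple containment-based overlap measure (the library's actual overlap metric may differ):

```typescript
// Sketch of the documented matching rules. The containment-based overlap
// measure below is an assumption; the library may compute overlap differently.
function overlapRatio(a: string, b: string): number {
  const [shorter, longer] = a.length <= b.length ? [a, b] : [b, a];
  return longer.includes(shorter) ? shorter.length / longer.length : 0;
}

function isCovered(element: string, outputElements: string[]): boolean {
  // Exact matches required for short terms (3 characters or fewer)
  if (element.length <= 3) return outputElements.includes(element);
  // Longer terms count as covered with substantial (>60%) overlap
  return outputElements.some((out) => overlapRatio(element, out) > 0.6);
}

function completenessScore(inputElements: string[], outputElements: string[]): number {
  const covered = inputElements.filter((el) => isCovered(el, outputElements)).length;
  return covered / inputElements.length; // scale is effectively 1 here
}

console.log(completenessScore(["co2", "renewable"], ["co2", "renewables"]));
// -> 1: "co2" matches exactly; "renewable" is covered by "renewables" (0.9 overlap)
```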
Score Interpretation
A completeness score between 0 and 1:
- 1.0: Thoroughly addresses all aspects of the query with comprehensive detail.
- 0.7–0.9: Covers most important aspects with good detail, minor gaps.
- 0.4–0.6: Addresses some key points but missing important aspects or lacking detail.
- 0.1–0.3: Only partially addresses the query with significant gaps.
- 0.0: Fails to address the query or provides irrelevant information.
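If you want to act on these bands programmatically, a small hypothetical helper can bucket scores accordingly:

```typescript
// Hypothetical helper that buckets scores per the interpretation above.
function interpretCompleteness(score: number): string {
  if (score >= 1.0) return "thorough: all aspects addressed";
  if (score >= 0.7) return "good: most aspects covered, minor gaps";
  if (score >= 0.4) return "partial: key points covered, important gaps";
  if (score > 0.0) return "poor: significant gaps";
  return "fails to address the query";
}
```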
Example
Evaluate agent responses for completeness across different query complexities:
```typescript
import { runEvals } from "@mastra/core/evals";
import { createCompletenessScorer } from "@mastra/evals/scorers/prebuilt";

import { myAgent } from "./agent";

const scorer = createCompletenessScorer();

const result = await runEvals({
  data: [
    {
      input:
        "Explain the process of photosynthesis, including the inputs, outputs, and stages involved.",
    },
    {
      input:
        "What are the benefits and drawbacks of remote work for both employees and employers?",
    },
    {
      input:
        "Compare renewable and non-renewable energy sources in terms of cost, environmental impact, and sustainability.",
    },
  ],
  scorers: [scorer],
  target: myAgent,
  onItemComplete: ({ scorerResults }) => {
    console.log({
      score: scorerResults[scorer.id].score,
    });
  },
});

console.log(result.scores);
```
For more details on runEvals, see the runEvals reference.
To add this scorer to an agent, see the Scorers overview guide.