Completeness Scorer

The createCompletenessScorer() function evaluates how thoroughly an LLM's output covers the key elements present in the input. It analyzes nouns, verbs, topics, and terms to determine coverage and provides a detailed completeness score.

Parameters

The createCompletenessScorer() function does not take any options.

This function returns an instance of the MastraScorer class. See the MastraScorer reference for details on the .run() method and its input/output.

.run() Returns

runId:

string
The id of the run (optional).

extractStepResult:

object
Object with extracted elements and coverage details: { inputElements: string[], outputElements: string[], missingElements: string[], elementCounts: { input: number, output: number } }

score:

number
Completeness score (0-1) representing the proportion of input elements covered in the output.

The .run() method returns a result in the following shape:

{
  runId: string,
  extractStepResult: {
    inputElements: string[],
    outputElements: string[],
    missingElements: string[],
    elementCounts: { input: number, output: number }
  },
  score: number
}
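In typed code, this shape can be captured as a TypeScript interface. The following is a sketch derived from the documented shape; the actual types exported by the package (if any) may differ:

```typescript
// Sketch of the completeness scorer's .run() result shape.
// Field names follow the shape documented above; the package's
// own exported types may differ.
interface ElementCounts {
  input: number;
  output: number;
}

interface CompletenessExtractStepResult {
  inputElements: string[];
  outputElements: string[];
  missingElements: string[];
  elementCounts: ElementCounts;
}

interface CompletenessResult {
  runId: string;
  extractStepResult: CompletenessExtractStepResult;
  score: number; // 0-1: proportion of input elements covered
}

// Example value conforming to the shape:
const example: CompletenessResult = {
  runId: "run-1",
  extractStepResult: {
    inputElements: ["photosynthesis", "explain"],
    outputElements: ["photosynthesis"],
    missingElements: ["explain"],
    elementCounts: { input: 2, output: 1 },
  },
  score: 0.5,
};
```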

Element Extraction Details

The scorer extracts and analyzes several types of elements:

  • Nouns: Key objects, concepts, and entities
  • Verbs: Actions and states (converted to infinitive form)
  • Topics: Main subjects and themes
  • Terms: Individual significant words

The extraction process includes:

  • Normalization of text (removing diacritics, converting to lowercase)
  • Splitting camelCase words
  • Handling of word boundaries
  • Special handling of short words (3 characters or fewer)
  • Deduplication of elements
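The extraction steps above can be sketched roughly as follows. This is an illustrative approximation, not the scorer's actual implementation; the real tokenization rules may differ:

```typescript
// Illustrative sketch of the extraction pipeline described above:
// camelCase splitting, diacritic removal, lowercasing, word-boundary
// splitting, and deduplication.
function extractElements(text: string): string[] {
  const tokens = text
    // Split camelCase before lowercasing
    .replace(/([a-z])([A-Z])/g, "$1 $2")
    // Remove diacritics via Unicode decomposition
    .normalize("NFD")
    .replace(/[\u0300-\u036f]/g, "")
    .toLowerCase()
    // Split on word boundaries (runs of non-alphanumeric characters)
    .split(/[^a-z0-9]+/)
    .filter((t) => t.length > 0);
  // Deduplicate while preserving order
  return [...new Set(tokens)];
}

extractElements("Évaluate camelCase coverage, coverage!");
// → ["evaluate", "camel", "case", "coverage"]
```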

extractStepResult

The .run() result includes an extractStepResult object with the following properties:

  • inputElements: Key elements found in the input (e.g., nouns, verbs, topics, terms).
  • outputElements: Key elements found in the output.
  • missingElements: Input elements not found in the output.
  • elementCounts: The number of elements in the input and output.

Scoring Details

The scorer evaluates completeness through linguistic element coverage analysis.

Scoring Process

  1. Extracts key elements:
    • Nouns and named entities
    • Action verbs
    • Topic-specific terms
    • Normalized word forms
  2. Calculates coverage of input elements:
    • Exact matches for short terms (≤3 chars)
    • Substantial overlap (>60%) for longer terms

Final score: covered_elements / total_input_elements (a value between 0 and 1)
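The coverage rules above can be sketched like this. It is an illustrative approximation of the matching logic; isCovered and completenessScore are hypothetical helpers, not library exports:

```typescript
// Sketch of the coverage rules described above: short terms (3 chars
// or fewer) must match exactly; longer terms count as covered when an
// output element shares substantial (>60%) overlap with them.
function isCovered(inputEl: string, outputEls: string[]): boolean {
  if (inputEl.length <= 3) return outputEls.includes(inputEl);
  return outputEls.some((out) => {
    const shorter = Math.min(inputEl.length, out.length);
    const longer = Math.max(inputEl.length, out.length);
    // Substring containment plus a length-ratio check approximates
    // the "substantial overlap" rule.
    return (
      (out.includes(inputEl) || inputEl.includes(out)) &&
      shorter / longer > 0.6
    );
  });
}

function completenessScore(
  inputEls: string[],
  outputEls: string[],
): number {
  // Assumption: an empty input element list scores 1 (nothing to cover).
  if (inputEls.length === 0) return 1;
  const covered = inputEls.filter((el) => isCovered(el, outputEls)).length;
  return covered / inputEls.length;
}
```

For example, with input elements `["cat", "photosynthesis", "run"]` and output elements `["cat", "photosynthesi"]`, "cat" matches exactly, "photosynthesis" is covered by substantial overlap, and "run" is missing, giving a score of 2/3.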

Score interpretation

A completeness score between 0 and 1:

  • 1.0: Thoroughly addresses all aspects of the query with comprehensive detail.
  • 0.7–0.9: Covers most important aspects with good detail, minor gaps.
  • 0.4–0.6: Addresses some key points but missing important aspects or lacking detail.
  • 0.1–0.3: Only partially addresses the query with significant gaps.
  • 0.0: Fails to address the query or provides irrelevant information.
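For convenience, the bands above could be mapped to labels with a small helper. interpretCompleteness is hypothetical, not part of the library, and the 0.3–0.4 boundary is interpolated:

```typescript
// Hypothetical helper mapping a completeness score (0-1) to the
// interpretation bands documented above.
function interpretCompleteness(score: number): string {
  if (score >= 1.0) return "thorough";
  if (score >= 0.7) return "good coverage, minor gaps";
  if (score >= 0.4) return "partial, missing important aspects";
  if (score > 0.0) return "significant gaps";
  return "fails to address the query";
}
```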

Example

Evaluate agent responses for completeness across different query complexities:

src/example-completeness.ts
import { runEvals } from "@mastra/core/evals";
import { createCompletenessScorer } from "@mastra/evals/scorers/prebuilt";
import { myAgent } from "./agent";

const scorer = createCompletenessScorer();

const result = await runEvals({
  data: [
    {
      input:
        "Explain the process of photosynthesis, including the inputs, outputs, and stages involved.",
    },
    {
      input:
        "What are the benefits and drawbacks of remote work for both employees and employers?",
    },
    {
      input:
        "Compare renewable and non-renewable energy sources in terms of cost, environmental impact, and sustainability.",
    },
  ],
  scorers: [scorer],
  target: myAgent,
  onItemComplete: ({ scorerResults }) => {
    console.log({
      score: scorerResults[scorer.id].score,
    });
  },
});

console.log(result.scores);

For more details on runEvals, see the runEvals reference.

To add this scorer to an agent, see the Scorers overview guide.