ContextRelevancyMetric
This documentation refers to the legacy evals API. For the latest scorer features, see Scorers.
The ContextRelevancyMetric class evaluates the quality of your RAG (Retrieval-Augmented Generation) pipeline's retriever by measuring how relevant the retrieved context is to the input query. It uses an LLM-based evaluation system that first extracts statements from the context and then assesses their relevance to the input.
Basic Usage
import { openai } from "@ai-sdk/openai";
import { ContextRelevancyMetric } from "@mastra/evals/llm";
// Configure the model for evaluation
const model = openai("gpt-4o-mini");
const metric = new ContextRelevancyMetric(model, {
  context: [
    "All data is encrypted at rest and in transit",
    "Two-factor authentication is mandatory",
    "The platform supports multiple languages",
    "Our offices are located in San Francisco",
  ],
});

const result = await metric.measure(
  "What are our product's security features?",
  "Our product uses encryption and requires 2FA.",
);
console.log(result.score); // Score from 0-1
console.log(result.info.reason); // Explanation of the relevancy assessment
Constructor Parameters
- model (LanguageModel): Configuration for the model used to evaluate context relevancy
- options (ContextRelevancyMetricOptions): Configuration options for the metric
ContextRelevancyMetricOptions
- scale? (number, default: 1): Maximum score value
- context (string[]): Array of retrieved context documents used to generate the response
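For reference, the options object corresponds roughly to the following shape (a sketch for illustration; consult the package's own type declarations for the authoritative definition):

// Approximate shape of ContextRelevancyMetricOptions (sketch, not the library's exact declaration)
interface ContextRelevancyMetricOptions {
  // Array of retrieved context documents used to generate the response
  context: string[];
  // Maximum score value (defaults to 1)
  scale?: number;
}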
measure() Parameters
- input (string): The original query or prompt
- output (string): The LLM's response to evaluate
Returns
- score (number): Context relevancy score (0 to scale, default 0-1)
- info (object): Object containing the reason for the score
  - reason (string): Detailed explanation of the relevancy assessment
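The promise returned by measure() resolves to an object of roughly the following shape (a sketch for illustration, not the library's exact declaration):

// Approximate shape of the measure() result (sketch)
interface ContextRelevancyResult {
  // Context relevancy score, from 0 up to the configured scale
  score: number;
  info: {
    // Detailed explanation of the relevancy assessment
    reason: string;
  };
}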
Scoring Details
The metric evaluates how well retrieved context matches the query through binary relevance classification.
Scoring Process
1. Extracts statements from context:
   - Breaks down context into meaningful units
   - Preserves semantic relationships
2. Evaluates statement relevance:
   - Assesses each statement against the query
   - Counts relevant statements
   - Calculates the relevance ratio

Final score: (relevant_statements / total_statements) * scale
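For illustration, here is a minimal sketch of that ratio calculation in TypeScript, assuming the LLM judge has already labeled each extracted statement as relevant or not (the Verdict type and computeRelevancyScore function are illustrative, not part of @mastra/evals):

// Sketch only: relevance verdicts are hardcoded here; the real metric
// obtains them from the LLM-based evaluation.
type Verdict = { statement: string; relevant: boolean };

function computeRelevancyScore(verdicts: Verdict[], scale = 1): number {
  if (verdicts.length === 0) return 0;
  const relevantCount = verdicts.filter((v) => v.relevant).length;
  return (relevantCount / verdicts.length) * scale;
}

// Example: 3 of 5 statements relevant -> 0.6 on the default 0-1 scale
const score = computeRelevancyScore([
  { statement: "Basic plan costs $10/month", relevant: true },
  { statement: "Pro plan includes advanced features at $30/month", relevant: true },
  { statement: "Enterprise plan has custom pricing", relevant: true },
  { statement: "Our company was founded in 2020", relevant: false },
  { statement: "We have offices worldwide", relevant: false },
]);
console.log(score); // 0.6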
Score interpretation
(0 to scale, default 0-1)
- 1.0: Perfect relevancy - all retrieved context is relevant
- 0.7-0.9: High relevancy - most context is relevant with few irrelevant pieces
- 0.4-0.6: Moderate relevancy - a mix of relevant and irrelevant context
- 0.1-0.3: Low relevancy - mostly irrelevant context
- 0.0: No relevancy - completely irrelevant context
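If you want to map a numeric score to one of the bands above in code, a small helper like this works (a hypothetical utility, not part of @mastra/evals):

// Hypothetical helper mapping a score to the interpretation bands above.
function interpretRelevancy(score: number, scale = 1): string {
  const normalized = score / scale;
  if (normalized >= 1) return "Perfect relevancy";
  if (normalized >= 0.7) return "High relevancy";
  if (normalized >= 0.4) return "Moderate relevancy";
  if (normalized >= 0.1) return "Low relevancy";
  return "No relevancy";
}

console.log(interpretRelevancy(60, 100)); // "Moderate relevancy"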
Example with Custom Configuration
import { openai } from "@ai-sdk/openai";
import { ContextRelevancyMetric } from "@mastra/evals/llm";
// Configure the model for evaluation
const model = openai("gpt-4o-mini");
const metric = new ContextRelevancyMetric(model, {
  scale: 100, // Use 0-100 scale instead of 0-1
  context: [
    "Basic plan costs $10/month",
    "Pro plan includes advanced features at $30/month",
    "Enterprise plan has custom pricing",
    "Our company was founded in 2020",
    "We have offices worldwide",
  ],
});

const result = await metric.measure(
  "What are our pricing plans?",
  "We offer Basic, Pro, and Enterprise plans.",
);

// Example output:
// {
//   score: 60,
//   info: {
//     reason: "3 out of 5 statements are relevant to pricing plans. The statements about
//       company founding and office locations are not relevant to the pricing query."
//   }
// }