
ContextPositionMetric

New Scorer API

We just released a new evals API called Scorers, with a more ergonomic API, more metadata stored for error analysis, and more flexibility in the data structures you can evaluate. Migration is straightforward, and we will continue to support the existing Evals API.

The ContextPositionMetric class evaluates how well context nodes are ordered based on their relevance to the query and output. It uses position-weighted scoring to reward placing the most relevant context pieces earlier in the sequence.

Basic Usage

```typescript
import { openai } from "@ai-sdk/openai";
import { ContextPositionMetric } from "@mastra/evals/llm";

// Configure the model for evaluation
const model = openai("gpt-4o-mini");

const metric = new ContextPositionMetric(model, {
  context: [
    "Photosynthesis is a biological process used by plants to create energy from sunlight.",
    "The process of photosynthesis produces oxygen as a byproduct.",
    "Plants need water and nutrients from the soil to grow.",
  ],
});

const result = await metric.measure(
  "What is photosynthesis?",
  "Photosynthesis is the process by which plants convert sunlight into energy.",
);

console.log(result.score); // Position score from 0-1
console.log(result.info.reason); // Explanation of the score
```

Constructor Parameters

  • model (ModelConfig): Configuration for the model used to evaluate context positioning
  • options (ContextPositionMetricOptions): Configuration options for the metric

ContextPositionMetricOptions

  • scale? (number, default: 1): Maximum score value
  • context (string[]): Array of context pieces in their retrieval order

measure() Parameters

  • input (string): The original query or prompt
  • output (string): The generated response to evaluate

Returns

  • score (number): Position score (0 to scale, default 0-1)
  • info (object): Object containing the reason for the score
    • reason (string): Detailed explanation of the score

Scoring Details

The metric evaluates context positioning through binary relevance assessment and position-based weighting.

Scoring Process

  1. Evaluates context relevance:

    • Assigns binary verdict (yes/no) to each piece
    • Records position in sequence
    • Documents relevance reasoning
  2. Applies position weights:

    • Earlier positions weighted more heavily (weight = 1/(position + 1))
    • Sums weights of relevant pieces
    • Normalizes by maximum possible score

Final score: (weighted_sum / max_possible_sum) * scale
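The weighting and normalization steps above can be sketched as a small helper. This is an illustrative re-implementation, not the library's actual code: the function name, signature, and the assumption that binary verdicts have already been assigned are all ours.

```typescript
// Illustrative sketch of position-weighted scoring.
// `verdicts[i]` is the binary relevance verdict for the context
// piece at position i (in retrieval order).
function positionScore(verdicts: boolean[], scale = 1): number {
  // Earlier positions weigh more: weight = 1 / (position + 1)
  const weights = verdicts.map((_, i) => 1 / (i + 1));

  // Sum the weights of the relevant pieces only
  const weightedSum = verdicts.reduce(
    (sum, relevant, i) => sum + (relevant ? weights[i] : 0),
    0,
  );

  // Maximum possible sum: the same number of relevant pieces,
  // but packed into the earliest positions
  const relevantCount = verdicts.filter(Boolean).length;
  const maxPossibleSum = weights
    .slice(0, relevantCount)
    .reduce((a, b) => a + b, 0);

  return maxPossibleSum === 0 ? 0 : (weightedSum / maxPossibleSum) * scale;
}
```

Under this scheme, relevant context placed first scores higher than the same context placed last, since the front positions carry the largest weights.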

Score interpretation

(0 to scale, default 0-1)

  • 1.0: Optimal - most relevant context first
  • 0.7-0.9: Good - relevant context mostly early
  • 0.4-0.6: Mixed - relevant context scattered
  • 0.1-0.3: Suboptimal - relevant context mostly later
  • 0.0: Poor ordering - relevant context at end or missing
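As a rough illustration, the bands above amount to a simple threshold check. The `interpretScore` helper below is hypothetical and not part of the API:

```typescript
// Hypothetical helper: map a 0-1 position score to the
// interpretation bands listed above.
function interpretScore(score: number): string {
  if (score >= 1.0) return "Optimal";
  if (score >= 0.7) return "Good";
  if (score >= 0.4) return "Mixed";
  if (score >= 0.1) return "Suboptimal";
  return "Poor ordering";
}
```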

Example with Analysis

```typescript
import { openai } from "@ai-sdk/openai";
import { ContextPositionMetric } from "@mastra/evals/llm";

// Configure the model for evaluation
const model = openai("gpt-4o-mini");

const metric = new ContextPositionMetric(model, {
  context: [
    "A balanced diet is important for health.",
    "Exercise strengthens the heart and improves blood circulation.",
    "Regular physical activity reduces stress and anxiety.",
    "Exercise equipment can be expensive.",
  ],
});

const result = await metric.measure(
  "What are the benefits of exercise?",
  "Regular exercise improves cardiovascular health and mental wellbeing.",
);

// Example output:
// {
//   score: 0.5,
//   info: {
//     reason: "The score is 0.5 because while the second and third contexts are highly
//       relevant to the benefits of exercise, they are not optimally positioned at
//       the beginning of the sequence. The first and last contexts are not relevant
//       to the query, which impacts the position-weighted scoring."
//   }
// }
```