ContextPositionMetric
The ContextPositionMetric class evaluates how well context nodes are ordered based on their relevance to the query and output. It uses position-weighted scoring to emphasize the importance of having the most relevant context pieces appear earlier in the sequence.
Basic Usage
import { ContextPositionMetric } from "@mastra/evals/llm";

// Configure the model for evaluation
const model = {
  provider: "OPEN_AI",
  name: "gpt-4o-mini",
  apiKey: process.env.OPENAI_API_KEY,
};

const metric = new ContextPositionMetric(model, {
  context: [
    "Photosynthesis is a biological process used by plants to create energy from sunlight.",
    "The process of photosynthesis produces oxygen as a byproduct.",
    "Plants need water and nutrients from the soil to grow.",
  ],
});

const result = await metric.measure(
  "What is photosynthesis?",
  "Photosynthesis is the process by which plants convert sunlight into energy.",
);

console.log(result.score); // Position score from 0-1
console.log(result.info.reason); // Explanation of the score
Constructor Parameters
model (ModelConfig): Configuration for the model used to evaluate context positioning
options (ContextPositionMetricOptions): Configuration options for the metric
ContextPositionMetricOptions
scale? (number, default 1): Maximum score value
context (string[]): Array of context pieces in their retrieval order
measure() Parameters
input (string): The original query or prompt
output (string): The generated response to evaluate
Returns
score (number): Position score (0 to scale, default 0-1)
info (object): Object containing the reason for the score
  reason (string): Detailed explanation of the score
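As a rough illustration of how the returned object might be consumed, the sketch below declares a local type that mirrors the fields listed above and checks the score against a cutoff. The type name PositionResult, the helper meetsThreshold, and the 0.7 cutoff are assumptions made for this example; none of them are part of @mastra/evals.

```typescript
// Hypothetical local type mirroring the documented return shape;
// the library's own exported type may differ.
type PositionResult = {
  score: number;
  info: { reason: string };
};

// Hypothetical helper: true when the score reaches a given fraction of the scale.
function meetsThreshold(
  result: PositionResult,
  minFraction: number = 0.7, // illustrative cutoff, not a library default
  scale: number = 1,
): boolean {
  return result.score / scale >= minFraction;
}

// Hand-written sample value standing in for `await metric.measure(...)`.
const sample: PositionResult = {
  score: 0.5,
  info: { reason: "Relevant context appears in the middle of the sequence." },
};

if (!meetsThreshold(sample)) {
  console.warn(`Context ordering below threshold: ${sample.info.reason}`);
}
```

Passing the configured scale as the third argument keeps the cutoff meaningful when the metric is created with a scale other than 1.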
Scoring Details
The metric evaluates context positioning through:
- Individual assessment of each context piece’s relevance
- Position-based weighting (1/position)
- Binary relevance verdicts (yes/no) with detailed reasoning
- Normalization against optimal ordering
The scoring process:
- Evaluates relevance of each context piece
- Applies position weights (earlier positions weighted more heavily)
- Sums weighted relevance scores
- Normalizes against maximum possible score
- Scales to configured range (default 0-1)
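The steps above can be sketched as a small standalone function. This is a minimal sketch of one reading of the description, in which "maximum possible score" means the same number of relevant pieces placed at the front of the list. It is not the library's source code and may not reproduce its exact numbers. The function name positionScore and the boolean verdict array are assumptions made for illustration; in the metric itself the verdicts come from the evaluating model.

```typescript
// Hypothetical helper, not part of @mastra/evals.
// verdicts[i] is true when the context piece at position i + 1 was judged relevant.
function positionScore(verdicts: boolean[], scale: number = 1): number {
  // Sum the weight 1 / position for every position that holds a relevant piece.
  const weightedSum = verdicts.reduce(
    (sum, isRelevant, index) => (isRelevant ? sum + 1 / (index + 1) : sum),
    0,
  );

  // Best case (assumed): the same number of relevant pieces packed at the front.
  const relevantCount = verdicts.filter(Boolean).length;
  let maxPossible = 0;
  for (let position = 1; position <= relevantCount; position++) {
    maxPossible += 1 / position;
  }

  // With no relevant context there is nothing to normalize against.
  if (maxPossible === 0) return 0;

  // Normalize to 0-1, then stretch to the configured scale.
  return (weightedSum / maxPossible) * scale;
}

console.log(positionScore([true, false, false])); // 1
console.log(positionScore([false, true])); // 0.5
```

In the second call the only relevant piece sits at position 2, so it contributes 1/2, while the best possible placement at position 1 would contribute 1, giving 0.5.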
Score interpretation (on the default 0-1 scale):
- 1.0: Most relevant context at the beginning, optimal ordering
- 0.7-0.9: Relevant context mostly at the beginning
- 0.4-0.6: Mixed ordering of relevant context
- 0.1-0.3: Relevant context mostly at the end
- 0: No relevant context or worst possible ordering
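For logging or dashboards, the bands above could be turned into labels along the following lines. The helper name describeScore is hypothetical and not part of @mastra/evals. The listed bands leave gaps (for example 0.65 or 0.05), so the exact cutoffs used here to close them are an assumption.

```typescript
// Hypothetical helper, not part of @mastra/evals.
// The cutoffs 0.7 and 0.4 come from the listed bands; how values in the
// gaps between bands are assigned is an assumption of this sketch.
function describeScore(score: number, scale: number = 1): string {
  const normalized = score / scale; // map back to the 0-1 range
  if (normalized >= 1) return "optimal ordering";
  if (normalized >= 0.7) return "relevant context mostly at the beginning";
  if (normalized >= 0.4) return "mixed ordering of relevant context";
  if (normalized > 0) return "relevant context mostly at the end";
  return "no relevant context or worst possible ordering";
}

console.log(describeScore(0.5)); // "mixed ordering of relevant context"
```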
Example with Analysis
// Assumes `model` is configured as shown in Basic Usage
const metric = new ContextPositionMetric(model, {
  context: [
    "A balanced diet is important for health.",
    "Exercise strengthens the heart and improves blood circulation.",
    "Regular physical activity reduces stress and anxiety.",
    "Exercise equipment can be expensive.",
  ],
});

const result = await metric.measure(
  "What are the benefits of exercise?",
  "Regular exercise improves cardiovascular health and mental wellbeing.",
);

// Example output:
// {
//   score: 0.5,
//   info: {
//     reason: "The score is 0.5 because while the second and third contexts are highly
//       relevant to the benefits of exercise, they are not optimally positioned at
//       the beginning of the sequence. The first and last contexts are not relevant
//       to the query, which impacts the position-weighted scoring."
//   }
// }