Context Relevancy
This example demonstrates how to use Mastra’s Context Relevancy metric to evaluate how relevant context information is to a given query.
Overview
The example shows how to:
- Configure the Context Relevancy metric
- Evaluate context relevancy
- Analyze relevancy scores
- Handle different relevancy levels
Setup
Environment Setup
Make sure to set up your environment variables:
.env
OPENAI_API_KEY=your_api_key_here
Dependencies
Import the necessary dependencies:
src/index.ts
import { openai } from "@ai-sdk/openai";
import { ContextRelevancyMetric } from "@mastra/evals/llm";
Example Usage
High Relevancy Example
Evaluate a response where all context is relevant:
src/index.ts
const context1 = [
"Einstein won the Nobel Prize for his discovery of the photoelectric effect.",
"He published his theory of relativity in 1905.",
"His work revolutionized modern physics.",
];
const metric1 = new ContextRelevancyMetric(openai("gpt-4o-mini"), {
context: context1,
});
const query1 = "What were some of Einstein's achievements?";
const response1 =
"Einstein won the Nobel Prize for discovering the photoelectric effect and published his groundbreaking theory of relativity.";
console.log("Example 1 - High Relevancy:");
console.log("Context:", context1);
console.log("Query:", query1);
console.log("Response:", response1);
const result1 = await metric1.measure(query1, response1);
console.log("Metric Result:", {
score: result1.score,
reason: result1.info.reason,
});
// Example Output:
// Metric Result: { score: 1, reason: 'The context uses all relevant information and does not include any irrelevant information.' }
Mixed Relevancy Example
Evaluate a response where some context is irrelevant:
src/index.ts
const context2 = [
"Solar eclipses occur when the Moon blocks the Sun.",
"The Moon moves between the Earth and Sun during eclipses.",
"The Moon is visible at night.",
"The Moon has no atmosphere.",
];
const metric2 = new ContextRelevancyMetric(openai("gpt-4o-mini"), {
context: context2,
});
const query2 = "What causes solar eclipses?";
const response2 =
"Solar eclipses happen when the Moon moves between Earth and the Sun, blocking sunlight.";
console.log("Example 2 - Mixed Relevancy:");
console.log("Context:", context2);
console.log("Query:", query2);
console.log("Response:", response2);
const result2 = await metric2.measure(query2, response2);
console.log("Metric Result:", {
score: result2.score,
reason: result2.info.reason,
});
// Example Output:
// Metric Result: { score: 0.5, reason: 'The context uses some relevant information and includes some irrelevant information.' }
Low Relevancy Example
Evaluate a response where most context is irrelevant:
src/index.ts
const context3 = [
"The Great Barrier Reef is in Australia.",
"Coral reefs need warm water to survive.",
"Marine life depends on coral reefs.",
"The capital of Australia is Canberra.",
];
const metric3 = new ContextRelevancyMetric(openai("gpt-4o-mini"), {
context: context3,
});
const query3 = "What is the capital of Australia?";
const response3 = "The capital of Australia is Canberra.";
console.log("Example 3 - Low Relevancy:");
console.log("Context:", context3);
console.log("Query:", query3);
console.log("Response:", response3);
const result3 = await metric3.measure(query3, response3);
console.log("Metric Result:", {
score: result3.score,
reason: result3.info.reason,
});
// Example Output:
// Metric Result: { score: 0.12, reason: 'The context only has one relevant piece, while most of the context is irrelevant.' }
Understanding the Results
The metric provides:
-
A relevancy score between 0 and 1:
- 1.0: Perfect relevancy - all context directly relevant to query
- 0.7-0.9: High relevancy - most context relevant to query
- 0.4-0.6: Mixed relevancy - some context relevant to query
- 0.1-0.3: Low relevancy - little context relevant to query
- 0.0: No relevancy - no context relevant to query
-
Detailed reason for the score, including analysis of:
- Relevance to input query
- Statement extraction from context
- Usefulness for response
- Overall context quality
View Example on GitHub