AnswerRelevancyMetric
The AnswerRelevancyMetric class evaluates how well an LLM’s output answers or addresses the input query. It uses a judge-based system to determine relevancy and provides detailed scoring and reasoning.
Basic Usage
import { AnswerRelevancyMetric } from "@mastra/evals/llm";

// Configure the model for evaluation
const model = {
  provider: "OPEN_AI",
  name: "gpt-4o-mini",
  apiKey: process.env.OPENAI_API_KEY,
};

const metric = new AnswerRelevancyMetric(model, {
  uncertaintyWeight: 0.3,
  scale: 1,
});

const result = await metric.measure(
  "What is the capital of France?",
  "Paris is the capital of France.",
);

console.log(result.score); // Score from 0-1
console.log(result.info.reason); // Explanation of the score
Constructor Parameters
model: ModelConfig
Configuration for the model used to evaluate relevancy

options?: AnswerRelevancyMetricOptions = { uncertaintyWeight: 0.3, scale: 1 }
Configuration options for the metric
AnswerRelevancyMetricOptions
uncertaintyWeight?: number = 0.3
Weight given to 'unsure' verdicts in scoring (0-1)

scale?: number = 1
Maximum score value
measure() Parameters
input: string
The original query or prompt

output: string
The LLM's response to evaluate
Returns
score: number
Relevancy score (0 to scale, default 0-1)

info: object
Object containing the reason for the score

  reason: string
  Explanation of the score
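The return shape above can be consumed in a test or CI check. The following is a minimal sketch, not part of the library: the RelevancyResult interface simply mirrors the documented fields, checkRelevancy and minScore are hypothetical names, and the sample object is hand-written rather than real metric output.

```typescript
// Local type mirroring the documented return value of measure().
// This is an illustrative assumption, not a type exported by the library.
interface RelevancyResult {
  score: number;
  info: { reason: string };
}

// Hypothetical helper: turn a result into a one-line pass/fail report.
// minScore must be expressed on the same scale the metric was configured with.
function checkRelevancy(result: RelevancyResult, minScore: number): string {
  const status = result.score >= minScore ? "PASS" : "FAIL";
  return `${status} (score ${result.score}): ${result.info.reason}`;
}

// Hand-written sample, standing in for `await metric.measure(input, output)`.
const sample: RelevancyResult = {
  score: 0.4,
  info: { reason: "The answer only partially addresses the query." },
};

console.log(checkRelevancy(sample, 0.7));
// FAIL (score 0.4): The answer only partially addresses the query.
```

If the metric is configured with a scale other than 1, the threshold needs to be scaled to match.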
Scoring Details
The metric evaluates relevancy through multiple verdicts and calculates a score based on:
- Direct relevance to the query
- Completeness of the answer
- Accuracy of information
- Appropriate level of detail
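To make the role of uncertaintyWeight and scale concrete, here is a rough sketch of how a verdict-based score could be computed. It assumes the judge emits one of "yes", "no", or "unsure" per evaluated statement and that the score is the weighted fraction of relevant verdicts multiplied by scale. The Verdict type, the sketchRelevancyScore function, and the zero-verdict behavior are illustrative assumptions, not the library's actual implementation.

```typescript
// Assumed verdict labels; the real judge output format may differ.
type Verdict = "yes" | "no" | "unsure";

// Sketch only: weighted fraction of relevant verdicts, scaled to [0, scale].
function sketchRelevancyScore(
  verdicts: Verdict[],
  uncertaintyWeight = 0.3,
  scale = 1,
): number {
  // Arbitrary choice for this sketch: no verdicts means nothing to score.
  if (verdicts.length === 0) return 0;

  let relevant = 0;
  for (const verdict of verdicts) {
    if (verdict === "yes") relevant += 1;
    else if (verdict === "unsure") relevant += uncertaintyWeight;
    // "no" verdicts contribute nothing.
  }

  // Scale the fraction and round to two decimals.
  return Math.round((relevant / verdicts.length) * scale * 100) / 100;
}

const verdicts: Verdict[] = ["yes", "yes", "yes", "unsure", "no"];

console.log(sketchRelevancyScore(verdicts)); // (3 + 0.3) / 5 = 0.66
console.log(sketchRelevancyScore(verdicts, 0.5, 5)); // ((3 + 0.5) / 5) * 5 = 3.5
```

Under these assumptions, raising uncertaintyWeight makes the metric more lenient toward ambiguous statements, while scale only stretches the final range.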
Score interpretation (on the default 0-1 scale):
- 1.0: Perfect relevance
- 0.7-0.9: High relevance with minor issues
- 0.4-0.6: Moderate relevance with significant gaps
- 0.1-0.3: Low relevance with major issues
- 0: Completely irrelevant or incorrect
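When a custom scale is configured, it can help to normalize the score before applying these bands. The helper below is a hypothetical sketch: describeRelevancy is not a library function, and because the listed bands leave gaps (for example between 0.6 and 0.7), it assumes each band starts at its lower bound and that any score above 0 but below 0.4 counts as low relevance.

```typescript
// Hypothetical helper: map a score to the interpretation bands listed above.
// Assumption: in-between values fall into the band below them.
function describeRelevancy(score: number, scale = 1): string {
  // Bring scores from a custom scale (e.g. 0-5) back to the 0-1 range.
  const normalized = score / scale;

  if (normalized >= 1) return "Perfect relevance";
  if (normalized >= 0.7) return "High relevance with minor issues";
  if (normalized >= 0.4) return "Moderate relevance with significant gaps";
  if (normalized > 0) return "Low relevance with major issues";
  return "Completely irrelevant or incorrect";
}

console.log(describeRelevancy(0.85)); // High relevance with minor issues
console.log(describeRelevancy(4.5, 5)); // High relevance with minor issues
console.log(describeRelevancy(0)); // Completely irrelevant or incorrect
```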
Example with Custom Configuration
const metric = new AnswerRelevancyMetric(
  {
    provider: "OPEN_AI",
    name: "gpt-4o-mini",
    apiKey: process.env.OPENAI_API_KEY,
  },
  {
    uncertaintyWeight: 0.5, // Higher weight for uncertain verdicts
    scale: 5, // Use 0-5 scale instead of 0-1
  },
);

const result = await metric.measure(
  "What are the benefits of exercise?",
  "Regular exercise improves cardiovascular health, builds strength, and boosts mental wellbeing.",
);

// Example output:
// {
//   score: 4.5,
//   info: {
//     reason: "The score is 4.5 out of 5 because the response directly addresses the query
//       with specific, accurate benefits of exercise. It covers multiple aspects
//       (cardiovascular, muscular, and mental health) in a clear and concise manner.
//       The answer is highly relevant and provides appropriate detail without
//       including unnecessary information."
//   }
// }