ToneConsistencyMetric
We just released a new evals API called Scorers, with a more ergonomic API and more metadata stored for error analysis, and more flexibility to evaluate data structures. It’s fairly simple to migrate, but we will continue to support the existing Evals API.
The ToneConsistencyMetric
class evaluates the text’s emotional tone and sentiment consistency. It can operate in two modes: comparing tone between input/output pairs or analyzing tone stability within a single text.
Basic Usage
import { ToneConsistencyMetric } from "@mastra/evals/nlp";
const metric = new ToneConsistencyMetric();
// Compare tone between input and output
const result1 = await metric.measure(
"I love this amazing product!",
"This product is wonderful and fantastic!",
);
// Analyze tone stability in a single text
const result2 = await metric.measure(
"The service is excellent. The staff is friendly. The atmosphere is perfect.",
"", // Empty string for single-text analysis
);
console.log(result1.score); // Tone consistency score from 0-1
console.log(result2.score); // Tone stability score from 0-1
measure() Parameters
input:
output:
Returns
score:
info:
info Object (Tone Comparison)
responseSentiment:
referenceSentiment:
difference:
info Object (Tone Stability)
avgSentiment:
sentimentVariance:
Scoring Details
The metric evaluates sentiment consistency through tone pattern analysis and mode-specific scoring.
Scoring Process
-
Analyzes tone patterns:
- Extracts sentiment features
- Computes sentiment scores
- Measures tone variations
-
Calculates mode-specific score: Tone Consistency (input and output):
- Compares sentiment between texts
- Calculates sentiment difference
- Score = 1 - (sentiment_difference / max_difference)
Tone Stability (single input):
- Analyzes sentiment across sentences
- Calculates sentiment variance
- Score = 1 - (sentiment_variance / max_variance)
Final score: mode_specific_score * scale
Score interpretation
(0 to scale, default 0-1)
- 1.0: Perfect tone consistency/stability
- 0.7-0.9: Strong consistency with minor variations
- 0.4-0.6: Moderate consistency with noticeable shifts
- 0.1-0.3: Poor consistency with major tone changes
- 0.0: No consistency - completely different tones
Example with Both Modes
import { ToneConsistencyMetric } from "@mastra/evals/nlp";
const metric = new ToneConsistencyMetric();
// Tone Consistency Mode
const consistencyResult = await metric.measure(
"This product is fantastic and amazing!",
"The product is excellent and wonderful!",
);
// Example output:
// {
// score: 0.95,
// info: {
// responseSentiment: 0.8,
// referenceSentiment: 0.75,
// difference: 0.05
// }
// }
// Tone Stability Mode
const stabilityResult = await metric.measure(
"Great service! Friendly staff. Perfect atmosphere.",
"",
);
// Example output:
// {
// score: 0.9,
// info: {
// avgSentiment: 0.6,
// sentimentVariance: 0.1
// }
// }