Tone Consistency Scorer
The createToneScorer() function evaluates the emotional tone and sentiment consistency of text. It operates in two modes: comparing tone between an input/output pair, or analyzing tone stability within a single text.
Parameters
The createToneScorer() function does not take any options.
This function returns an instance of the MastraScorer class. See the MastraScorer reference for details on the .run() method and its input/output.
.run() Returns
.run() returns a result in the following shape:
{
  runId: string,
  analyzeStepResult: {
    responseSentiment?: number,
    referenceSentiment?: number,
    difference?: number,
    avgSentiment?: number,
    sentimentVariance?: number,
  },
  score: number
}
Scoring Details
The scorer evaluates sentiment consistency through tone pattern analysis and mode-specific scoring.
Scoring Process
- Analyzes tone patterns:
  - Extracts sentiment features
  - Computes sentiment scores
  - Measures tone variations
- Calculates mode-specific score:
  Tone Consistency (input and output):
  - Compares sentiment between texts
  - Calculates sentiment difference
  - Score = 1 - (sentiment_difference / max_difference)
  Tone Stability (single input):
  - Analyzes sentiment across sentences
  - Calculates sentiment variance
  - Score = 1 - (sentiment_variance / max_variance)
Final score: mode_specific_score * scale
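The two formulas above can be sketched in a few lines of TypeScript. This is only an illustration under stated assumptions, not the scorer's actual implementation: the helper names, the maxDifference and maxVariance parameters, the use of population variance, and the clamping to the 0-1 range are all hypothetical choices made for the example.

```typescript
// Hypothetical helpers that mirror the documented formulas.
// They are NOT part of the Mastra API.

/** Clamp a value into [0, 1] (an assumption; the docs do not mention clamping). */
function clamp01(x: number): number {
  return Math.min(1, Math.max(0, x));
}

/**
 * Tone consistency: Score = 1 - (sentiment_difference / max_difference).
 * maxDifference is an assumed normalization constant supplied by the caller.
 */
function toneConsistencyScore(
  responseSentiment: number,
  referenceSentiment: number,
  maxDifference: number,
  scale = 1,
): number {
  const difference = Math.abs(responseSentiment - referenceSentiment);
  return clamp01(1 - difference / maxDifference) * scale;
}

/**
 * Tone stability: Score = 1 - (sentiment_variance / max_variance).
 * Uses the population variance of per-sentence sentiment values (an assumption).
 */
function toneStabilityScore(
  sentenceSentiments: number[],
  maxVariance: number,
  scale = 1,
): number {
  if (sentenceSentiments.length === 0) {
    throw new RangeError("at least one sentence sentiment is required");
  }
  const n = sentenceSentiments.length;
  const avg = sentenceSentiments.reduce((sum, s) => sum + s, 0) / n;
  const variance =
    sentenceSentiments.reduce((sum, s) => sum + (s - avg) ** 2, 0) / n;
  return clamp01(1 - variance / maxVariance) * scale;
}
```

With these helpers, identical sentiments give the maximum score, and a difference (or variance) equal to the assumed maximum gives 0.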
Score interpretation
Scores range from 0 to scale (0 to 1 by default):
- 1.0: Perfect tone consistency/stability
- 0.7-0.9: Strong consistency with minor variations
- 0.4-0.6: Moderate consistency with noticeable shifts
- 0.1-0.3: Poor consistency with major tone changes
- 0.0: No consistency - completely different tones
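For reporting, the bands above can be turned into labels. The sketch below is a hypothetical helper, not part of the library. Because the listed bands leave gaps (for example between 0.9 and 1.0), it assumes each band extends upward until the next band's lower bound; that choice is an assumption of the example.

```typescript
// Hypothetical label type and helper; the band boundaries for in-between
// values (e.g. 0.95 or 0.65) are an assumption, not documented behavior.
type ToneBand = "perfect" | "strong" | "moderate" | "poor" | "none";

function interpretToneScore(score: number, scale = 1): ToneBand {
  // Normalize back to the 0-1 range before comparing against the bands.
  const normalized = score / scale;
  if (normalized >= 1) return "perfect";
  if (normalized >= 0.7) return "strong";
  if (normalized >= 0.4) return "moderate";
  if (normalized >= 0.1) return "poor";
  return "none";
}

console.log(interpretToneScore(0.8)); // prints "strong"
```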
analyzeStepResult
Object with tone metrics:
- responseSentiment: Sentiment score for the response (comparison mode).
- referenceSentiment: Sentiment score for the input/reference (comparison mode).
- difference: Absolute difference between sentiment scores (comparison mode).
- avgSentiment: Average sentiment across sentences (stability mode).
- sentimentVariance: Variance of sentiment across sentences (stability mode).
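Since every field is optional and each belongs to one mode, a consumer may want to check which mode produced a result before reading it. The following is a minimal sketch: the interface copies the documented shape, while the ToneMode type and the detectToneMode and summarizeTone helpers are hypothetical names introduced for illustration.

```typescript
// Mirrors the documented analyzeStepResult shape.
interface ToneAnalyzeStepResult {
  responseSentiment?: number;
  referenceSentiment?: number;
  difference?: number;
  avgSentiment?: number;
  sentimentVariance?: number;
}

// Hypothetical mode labels for this example.
type ToneMode = "comparison" | "stability" | "unknown";

/** Infer the mode from which fields are present. */
function detectToneMode(result: ToneAnalyzeStepResult): ToneMode {
  if (result.difference !== undefined) return "comparison";
  if (result.sentimentVariance !== undefined) return "stability";
  return "unknown";
}

/** Build a short human-readable summary of the metrics for the detected mode. */
function summarizeTone(result: ToneAnalyzeStepResult): string {
  switch (detectToneMode(result)) {
    case "comparison":
      return `comparison: difference=${result.difference}`;
    case "stability":
      return `stability: variance=${result.sentimentVariance}`;
    default:
      return "no tone metrics available";
  }
}
```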
Example
Evaluate tone consistency between related agent responses:
import { runEvals } from "@mastra/core/evals";
import { createToneScorer } from "@mastra/evals/scorers/prebuilt";
import { myAgent } from "./agent";

const scorer = createToneScorer();

const result = await runEvals({
  // Each item pairs a prompt with a reference answer
  data: [
    {
      input: "How was your experience with our service?",
      groundTruth: "The service was excellent and exceeded expectations!",
    },
    {
      input: "Tell me about the customer support",
      groundTruth: "The support team was friendly and very helpful.",
    },
  ],
  scorers: [scorer],
  target: myAgent,
  // Log the tone score as each item finishes
  onItemComplete: ({ scorerResults }) => {
    console.log({
      score: scorerResults[scorer.id].score,
    });
  },
});

// Scores for the whole run
console.log(result.scores);
For more details on runEvals, see the runEvals reference.
To add this scorer to an agent, see the Scorers overview guide.