Tone Consistency Scorer

The createToneScorer() function evaluates the emotional tone and sentiment consistency of text. It operates in one of two modes: comparing tone between an input/output pair, or analyzing tone stability within a single text.

Parameters

The createToneScorer() function does not take any options.

This function returns an instance of the MastraScorer class. See the MastraScorer reference for details on the .run() method and its input/output.

.run() Returns

  • runId (string): The ID of the run (optional).
  • analyzeStepResult (object): Object with tone metrics: { responseSentiment: number, referenceSentiment: number, difference: number } in comparison mode, or { avgSentiment: number, sentimentVariance: number } in stability mode.
  • score (number): Tone consistency/stability score (0-1).

.run() returns a result in the following shape:

{
  runId: string,
  analyzeStepResult: {
    responseSentiment?: number,
    referenceSentiment?: number,
    difference?: number,
    avgSentiment?: number,
    sentimentVariance?: number,
  },
  score: number
}

Scoring Details

The scorer evaluates sentiment consistency through tone pattern analysis and mode-specific scoring.

Scoring Process

  1. Analyzes tone patterns:
    • Extracts sentiment features
    • Computes sentiment scores
    • Measures tone variations
  2. Calculates mode-specific score:

     Tone Consistency (input and output):
    • Compares sentiment between texts
    • Calculates sentiment difference
    • Score = 1 - (sentiment_difference / max_difference)

     Tone Stability (single input):
    • Analyzes sentiment across sentences
    • Calculates sentiment variance
    • Score = 1 - (sentiment_variance / max_variance)

Final score: mode_specific_score * scale
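
The two formulas above can be sketched as plain functions. This is an illustrative sketch only, not the scorer's actual implementation: the helper names, the maxDifference and maxVariance parameters, the use of population variance, and the clamping to the 0-1 range are all assumptions, and the sentiment values are taken as given rather than computed from text.

```typescript
// Hypothetical helpers; none of these names come from @mastra/evals.
// The normalization constants are supplied by the caller (assumption).

/** Population variance of a list of numbers (0 for an empty list). */
function variance(values: number[]): number {
  if (values.length === 0) return 0;
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  return values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / values.length;
}

/** Clamp to [0, 1] so out-of-range inputs cannot leave the documented range. */
function clamp01(x: number): number {
  return Math.min(1, Math.max(0, x));
}

/** Comparison mode: 1 - (sentiment_difference / max_difference), times scale. */
function consistencyScore(
  responseSentiment: number,
  referenceSentiment: number,
  maxDifference: number,
  scale = 1,
): number {
  const difference = Math.abs(responseSentiment - referenceSentiment);
  return clamp01(1 - difference / maxDifference) * scale;
}

/** Stability mode: 1 - (sentiment_variance / max_variance), times scale. */
function stabilityScore(
  sentenceSentiments: number[],
  maxVariance: number,
  scale = 1,
): number {
  return clamp01(1 - variance(sentenceSentiments) / maxVariance) * scale;
}
```

In this sketch, identical sentiments in comparison mode and a constant sentiment across sentences in stability mode both produce the maximum score, which matches the "perfect consistency/stability" end of the range.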

Score interpretation

Scores range from 0 to scale (0-1 by default):

  • 1.0: Perfect tone consistency/stability
  • 0.7-0.9: Strong consistency with minor variations
  • 0.4-0.6: Moderate consistency with noticeable shifts
  • 0.1-0.3: Poor consistency with major tone changes
  • 0.0: No consistency - completely different tones
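
For reporting, the bands above could be mapped to labels with a small helper. This is a sketch under assumptions: the toneBand function and the label names are hypothetical, and because the listed ranges leave gaps (for example between 0.9 and 1.0), each band is treated here as extending up to the next band's lower bound.

```typescript
// Hypothetical helper; not part of @mastra/evals.
type ToneBand = "perfect" | "strong" | "moderate" | "poor" | "none";

/** Map a score in the range 0..scale to one of the interpretation bands. */
function toneBand(score: number, scale = 1): ToneBand {
  const s = score / scale; // normalize to the 0-1 range
  if (s >= 1) return "perfect";
  if (s >= 0.7) return "strong";
  if (s >= 0.4) return "moderate";
  if (s > 0) return "poor"; // assumption: any non-zero score below 0.4
  return "none";
}
```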

analyzeStepResult

Object with tone metrics:

  • responseSentiment: Sentiment score for the response (comparison mode).
  • referenceSentiment: Sentiment score for the input/reference (comparison mode).
  • difference: Absolute difference between sentiment scores (comparison mode).
  • avgSentiment: Average sentiment across sentences (stability mode).
  • sentimentVariance: Variance of sentiment across sentences (stability mode).
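
Because the populated fields depend on the mode, consuming code may want to narrow the object before reading it. The following is a minimal sketch that models the two shapes listed above as a union; the type names, the isComparisonMetrics guard, and the describeToneMetrics function are hypothetical and are not exported by the library.

```typescript
// Hypothetical types mirroring the documented fields; not exported by @mastra/evals.
type ComparisonToneMetrics = {
  responseSentiment: number;
  referenceSentiment: number;
  difference: number;
};

type StabilityToneMetrics = {
  avgSentiment: number;
  sentimentVariance: number;
};

type ToneMetrics = ComparisonToneMetrics | StabilityToneMetrics;

/** Type guard: comparison-mode results carry a `difference` field. */
function isComparisonMetrics(m: ToneMetrics): m is ComparisonToneMetrics {
  return "difference" in m;
}

/** Produce a short, mode-aware summary string. */
function describeToneMetrics(m: ToneMetrics): string {
  return isComparisonMetrics(m)
    ? `comparison: difference=${m.difference}`
    : `stability: variance=${m.sentimentVariance}`;
}
```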

Example

Evaluate tone consistency between related agent responses:

src/example-tone-consistency.ts
import { runEvals } from "@mastra/core/evals";
import { createToneScorer } from "@mastra/evals/scorers/prebuilt";
import { myAgent } from "./agent";

const scorer = createToneScorer();

const result = await runEvals({
  data: [
    {
      input: "How was your experience with our service?",
      groundTruth: "The service was excellent and exceeded expectations!",
    },
    {
      input: "Tell me about the customer support",
      groundTruth: "The support team was friendly and very helpful.",
    },
  ],
  scorers: [scorer],
  target: myAgent,
  onItemComplete: ({ scorerResults }) => {
    console.log({
      score: scorerResults[scorer.id].score,
    });
  },
});

console.log(result.scores);

For more details on runEvals, see the runEvals reference.

To add this scorer to an agent, see the Scorers overview guide.