Tone Consistency Scorer

The createToneScorer() function evaluates emotional tone and sentiment consistency in text. It operates in one of two modes: comparing tone between an input/output pair, or analyzing tone stability within a single text.

Parameters

The createToneScorer() function does not take any options.

This function returns an instance of the MastraScorer class. See the MastraScorer reference for details on the .run() method and its input/output.

.run() Returns

runId: string
The id of the run (optional).

analyzeStepResult: object
Object with tone metrics: { responseSentiment: number, referenceSentiment: number, difference: number } (for comparison mode) OR { avgSentiment: number, sentimentVariance: number } (for stability mode)

score: number
Tone consistency/stability score (0-1).

.run() returns a result in the following shape:

{
  runId: string,
  analyzeStepResult: {
    responseSentiment?: number,
    referenceSentiment?: number,
    difference?: number,
    avgSentiment?: number,
    sentimentVariance?: number,
  },
  score: number
}
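Because every metric field is optional, calling code has to check which mode produced a result before reading it. The following is a minimal sketch of that check. The type names and the describeToneResult helper are hypothetical and not part of the Mastra API; only the field names are taken from the result shape shown above.

```typescript
// Hypothetical types mirroring the documented result shape.
interface ComparisonMetrics {
  responseSentiment: number;
  referenceSentiment: number;
  difference: number;
}

interface StabilityMetrics {
  avgSentiment: number;
  sentimentVariance: number;
}

interface ToneRunResult {
  runId: string;
  // Only one group of fields is expected to be populated per run.
  analyzeStepResult: Partial<ComparisonMetrics> & Partial<StabilityMetrics>;
  score: number;
}

// Hypothetical helper: reports which mode a result appears to come from.
function describeToneResult(result: ToneRunResult): string {
  const metrics = result.analyzeStepResult;
  if (metrics.difference !== undefined) {
    return `comparison: difference=${metrics.difference}, score=${result.score}`;
  }
  if (metrics.sentimentVariance !== undefined) {
    return `stability: variance=${metrics.sentimentVariance}, score=${result.score}`;
  }
  return `unknown mode: score=${result.score}`;
}
```

Checking for `difference` or `sentimentVariance` is a design choice of this sketch; any field that is unique to one mode would serve equally well.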

Scoring Details

The scorer evaluates sentiment consistency through tone pattern analysis and mode-specific scoring.

Scoring Process

Analyzes tone patterns:
- Extracts sentiment features
- Computes sentiment scores
- Measures tone variations

Calculates mode-specific score:

Tone Consistency (input and output):
- Compares sentiment between texts
- Calculates sentiment difference
- Score = 1 - (sentiment_difference / max_difference)

Tone Stability (single input):
- Analyzes sentiment across sentences
- Calculates sentiment variance
- Score = 1 - (sentiment_variance / max_variance)

Final score: mode_specific_score * scale
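The two formulas above can be sketched in a few lines. This is an illustration under stated assumptions, not the scorer's actual implementation: the function names are hypothetical, max_difference and max_variance are passed in because this page does not give their values, population variance is assumed, and clamping to the 0-1 range before scaling is an added safeguard.

```typescript
// Keeps a value inside [0, 1]; an assumption of this sketch, not a documented step.
const clamp01 = (x: number): number => Math.min(1, Math.max(0, x));

// Tone Consistency: Score = 1 - (sentiment_difference / max_difference)
function consistencyScore(
  responseSentiment: number,
  referenceSentiment: number,
  maxDifference: number, // assumed normalisation constant
  scale = 1,
): number {
  const difference = Math.abs(responseSentiment - referenceSentiment);
  return clamp01(1 - difference / maxDifference) * scale;
}

// Tone Stability: Score = 1 - (sentiment_variance / max_variance)
function stabilityScore(
  sentenceSentiments: number[],
  maxVariance: number, // assumed normalisation constant
  scale = 1,
): number {
  if (sentenceSentiments.length === 0) {
    throw new RangeError("at least one sentence sentiment is required");
  }
  const n = sentenceSentiments.length;
  const avg = sentenceSentiments.reduce((sum, s) => sum + s, 0) / n;
  // Population variance (divide by n); the real scorer may differ.
  const variance = sentenceSentiments.reduce((sum, s) => sum + (s - avg) ** 2, 0) / n;
  return clamp01(1 - variance / maxVariance) * scale;
}
```

For example, identical sentiments in both texts give a consistency score equal to scale, while sentiments at opposite ends of the assumed range give 0.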

Score interpretation

Scores range from 0 to scale (0-1 by default):

1.0: Perfect tone consistency/stability
0.7-0.9: Strong consistency with minor variations
0.4-0.6: Moderate consistency with noticeable shifts
0.1-0.3: Poor consistency with major tone changes
0.0: No consistency - completely different tones
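When reporting results, the bands above can be turned into labels. The sketch below is one hypothetical way to do that; interpretToneScore and the label names are not part of the library. The listed ranges leave small gaps (for example between 0.9 and 1.0), and this sketch assumes such values fall into the lower adjacent band.

```typescript
type ToneBand = "perfect" | "strong" | "moderate" | "poor" | "none";

// Hypothetical helper: maps a score to one of the interpretation bands.
function interpretToneScore(score: number, scale = 1): ToneBand {
  const normalized = score / scale; // bring the score back to the 0-1 range
  if (normalized >= 1) return "perfect";
  if (normalized >= 0.7) return "strong";
  if (normalized >= 0.4) return "moderate";
  if (normalized > 0) return "poor";
  return "none";
}
```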

analyzeStepResult

Object with tone metrics:

responseSentiment: Sentiment score for the response (comparison mode).
referenceSentiment: Sentiment score for the input/reference (comparison mode).
difference: Absolute difference between sentiment scores (comparison mode).
avgSentiment: Average sentiment across sentences (stability mode).
sentimentVariance: Variance of sentiment across sentences (stability mode).

Example

Evaluate tone consistency between related agent responses:

src/example-tone-consistency.ts
import { runExperiment } from "@mastra/core/scores";
import { createToneScorer } from "@mastra/evals/scorers/code";
import { myAgent } from "./agent";

const scorer = createToneScorer();

const result = await runExperiment({
  data: [
    {
      input: "How was your experience with our service?",
      groundTruth: "The service was excellent and exceeded expectations!",
    },
    {
      input: "Tell me about the customer support",
      groundTruth: "The support team was friendly and very helpful.",
    },
  ],
  scorers: [scorer],
  target: myAgent,
  // Log this scorer's score for each completed item; results are keyed by scorer name.
  onItemComplete: ({ scorerResults }) => {
    console.log({
      score: scorerResults[scorer.name].score,
    });
  },
});

console.log(result.scores);

For more details on runExperiment, see the runExperiment reference.

To add this scorer to an agent, see the Scorers overview guide.