Tone Consistency Scorer

The createToneScorer() function evaluates a text's emotional tone and sentiment consistency. It operates in one of two modes: comparing tone between an input/output pair, or analyzing tone stability within a single text.

Parameters

The createToneScorer() function does not take any options.

This function returns an instance of the MastraScorer class. See the MastraScorer reference for details on the .run() method and its input/output.

.run() Returns

runId (string): The id of the run (optional).
analyzeStepResult (object): Object with tone metrics. In comparison mode: { responseSentiment: number, referenceSentiment: number, difference: number }. In stability mode: { avgSentiment: number, sentimentVariance: number }.
score (number): Tone consistency/stability score (0-1).

.run() returns a result in the following shape:

{
  runId: string,
  analyzeStepResult: {
    responseSentiment?: number,
    referenceSentiment?: number,
    difference?: number,
    avgSentiment?: number,
    sentimentVariance?: number,
  },
  score: number
}
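
Because every metric field in analyzeStepResult is optional, a caller has to check which fields are present to tell the two modes apart. The following is a minimal sketch of that check. The interface names, the ToneMode type, and the detectToneMode helper are hypothetical and only mirror the shape documented above; they are not exported by Mastra.

```typescript
// Hypothetical types mirroring the documented result shape (not part of Mastra's API).
interface ToneAnalyzeStepResult {
  responseSentiment?: number;
  referenceSentiment?: number;
  difference?: number;
  avgSentiment?: number;
  sentimentVariance?: number;
}

interface ToneScorerResult {
  runId: string;
  analyzeStepResult: ToneAnalyzeStepResult;
  score: number;
}

type ToneMode = "comparison" | "stability" | "unknown";

// Infers the mode from which metric fields are present on the result.
function detectToneMode(result: ToneScorerResult): ToneMode {
  const metrics = result.analyzeStepResult;
  if (metrics.difference !== undefined) return "comparison";
  if (metrics.sentimentVariance !== undefined) return "stability";
  return "unknown";
}
```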

Scoring Details

The scorer evaluates sentiment consistency through tone pattern analysis and mode-specific scoring.

Scoring Process

Analyzes tone patterns:
- Extracts sentiment features
- Computes sentiment scores
- Measures tone variations

Calculates mode-specific score:

Tone Consistency (input and output):
- Compares sentiment between texts
- Calculates sentiment difference
- Score = 1 - (sentiment_difference / max_difference)

Tone Stability (single input):
- Analyzes sentiment across sentences
- Calculates sentiment variance
- Score = 1 - (sentiment_variance / max_variance)

Final score: mode_specific_score * scale
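
The two formulas above can be sketched in a few lines. This is an illustration under stated assumptions, not the scorer's actual implementation: the values of max_difference and max_variance are not given here, so they are passed in as parameters; the variance is assumed to be the population variance; and the clamp to the 0-1 range is an added safeguard. All function names are hypothetical.

```typescript
// Keeps a value inside [0, 1]. Assumption: the real scorer may bound scores differently.
function clamp01(value: number): number {
  return Math.min(1, Math.max(0, value));
}

// Comparison mode: Score = 1 - (sentiment_difference / max_difference), then scaled.
function consistencyScore(
  responseSentiment: number,
  referenceSentiment: number,
  maxDifference: number, // assumed to be supplied by the caller
  scale = 1,
): number {
  const difference = Math.abs(responseSentiment - referenceSentiment);
  return clamp01(1 - difference / maxDifference) * scale;
}

// Population variance of per-sentence sentiment values (an assumption).
function variance(values: number[]): number {
  if (values.length === 0) return 0;
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  return values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / values.length;
}

// Stability mode: Score = 1 - (sentiment_variance / max_variance), then scaled.
function stabilityScore(
  sentenceSentiments: number[],
  maxVariance: number, // assumed to be supplied by the caller
  scale = 1,
): number {
  return clamp01(1 - variance(sentenceSentiments) / maxVariance) * scale;
}
```

For example, with a hypothetical max_difference of 4, sentiments of 3 and 1 differ by 2, giving 1 - 2/4 = 0.5.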

Score interpretation

Scores range from 0 to scale (0-1 by default).

1.0: Perfect tone consistency/stability
0.7-0.9: Strong consistency with minor variations
0.4-0.6: Moderate consistency with noticeable shifts
0.1-0.3: Poor consistency with major tone changes
0.0: No consistency - completely different tones
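
When logging or alerting on scores, it can help to map a score to one of the bands above. The sketch below assumes the default 0-1 scale. The listed ranges leave gaps (for example, between 0.6 and 0.7), so this hypothetical helper treats each band's lower bound as a threshold; that choice is an assumption, not documented behavior.

```typescript
type ToneBand = "perfect" | "strong" | "moderate" | "poor" | "none";

// Hypothetical helper: maps a 0-1 score to an interpretation band.
// Thresholds use each band's lower bound, which fills the gaps between listed ranges.
function describeToneScore(score: number): ToneBand {
  if (score >= 1) return "perfect";
  if (score >= 0.7) return "strong";
  if (score >= 0.4) return "moderate";
  if (score > 0) return "poor";
  return "none";
}
```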

analyzeStepResult

Object with tone metrics:

responseSentiment: Sentiment score for the response (comparison mode).
referenceSentiment: Sentiment score for the input/reference (comparison mode).
difference: Absolute difference between sentiment scores (comparison mode).
avgSentiment: Average sentiment across sentences (stability mode).
sentimentVariance: Variance of sentiment across sentences (stability mode).

Example

Evaluate tone consistency between related agent responses:

src/example-tone-consistency.ts
import { runEvals } from "@mastra/core/evals";
import { createToneScorer } from "@mastra/evals/scorers/prebuilt";
import { myAgent } from "./agent";

const scorer = createToneScorer();

const result = await runEvals({
  data: [
    {
      input: "How was your experience with our service?",
      groundTruth: "The service was excellent and exceeded expectations!",
    },
    {
      input: "Tell me about the customer support",
      groundTruth: "The support team was friendly and very helpful.",
    },
  ],
  scorers: [scorer],
  target: myAgent,
  onItemComplete: ({ scorerResults }) => {
    console.log({
      score: scorerResults[scorer.id].score,
    });
  },
});

console.log(result.scores);

For more details on runEvals, see the runEvals reference.

To add this scorer to an agent, see the Scorers overview guide.
