Textual Difference Scorer

Use createTextualDifferenceScorer to evaluate the similarity between two text strings by analyzing sequence differences and edit operations.

Installation


npm install @mastra/evals

For complete API documentation and configuration options, see createTextualDifferenceScorer.

No differences example

In this example, the texts are exactly the same. The scorer identifies complete similarity with a perfect score and no detected changes.

src/example-no-differences.ts


import { createTextualDifferenceScorer } from "@mastra/evals/scorers/code";
 
const scorer = createTextualDifferenceScorer();
 
const input = 'The quick brown fox jumps over the lazy dog';
const output = 'The quick brown fox jumps over the lazy dog';
 
const result = await scorer.run({
  input: [{ role: 'user', content: input }],
  output: { role: 'assistant', text: output },
});
 
console.log('Score:', result.score);
console.log('AnalyzeStepResult:', result.analyzeStepResult);

No differences output

The scorer returns a high score, indicating the texts are identical. The detailed info confirms zero changes and no length difference.


{
  score: 1,
  analyzeStepResult: {
    confidence: 1,
    ratio: 1,
    changes: 0,
    lengthDiff: 0,
  },
}

Minor differences example

In this example, the texts have small variations. The scorer detects these minor differences and returns a moderate similarity score.

src/example-minor-differences.ts


import { createTextualDifferenceScorer } from "@mastra/evals/scorers/code";
 
const scorer = createTextualDifferenceScorer();
 
const input = 'Hello world! How are you?';
const output = 'Hello there! How is it going?';
 
const result = await scorer.run({
  input: [{ role: 'user', content: input }],
  output: { role: 'assistant', text: output },
});
 
console.log('Score:', result.score);
console.log('AnalyzeStepResult:', result.analyzeStepResult);

Minor differences output

The scorer returns a moderate score reflecting the small variations between the texts. The detailed info includes the number of changes and length difference observed.


{
  score: 0.5925925925925926,
  analyzeStepResult: {
    confidence: 0.8620689655172413,
    ratio: 0.5925925925925926,
    changes: 5,
    lengthDiff: 0.13793103448275862
  }
}

Major differences example

In this example, the texts differ significantly. The scorer detects extensive changes and returns a low similarity score.

src/example-major-differences.ts


import { createTextualDifferenceScorer } from "@mastra/evals/scorers/code";
 
const scorer = createTextualDifferenceScorer();
 
const input = 'Python is a high-level programming language';
const output = 'JavaScript is used for web development';
 
const result = await scorer.run({
  input: [{ role: 'user', content: input }],
  output: { role: 'assistant', text: output },
});
 
console.log('Score:', result.score);
console.log('AnalyzeStepResult:', result.analyzeStepResult);

Major differences output

The scorer returns a low score due to significant differences between the texts. The detailed analyzeStepResult shows numerous changes and a notable length difference.


{
  score: 0.3170731707317073,
  analyzeStepResult: {
    confidence: 0.8636363636363636,
    ratio: 0.3170731707317073,
    changes: 8,
    lengthDiff: 0.13636363636363635
  }
}

Scorer configuration

You can create a TextualDifferenceScorer instance with default settings. No additional configuration is required.


const scorer = createTextualDifferenceScorer();

See TextualDifferenceScorer for a full list of configuration options.

Understanding the results

.run() returns a result in the following shape:


{
  runId: string,
  analyzeStepResult: {
    confidence: number,
    ratio: number,
    changes: number,
    lengthDiff: number
  },
  score: number
}

score

A textual difference score between 0 and 1:

1.0: Identical texts – no differences detected.
0.7–0.9: Minor differences – few changes needed.
0.4–0.6: Moderate differences – noticeable changes required.
0.1–0.3: Major differences – extensive changes needed.
0.0: Completely different texts.

runId

The unique identifier for this scorer run.

analyzeStepResult

Object with difference metrics:

confidence: Confidence score based on length difference (higher is better).
ratio: Similarity ratio between the texts (0-1).
changes: Number of edit operations required to match the texts.
lengthDiff: Normalized difference in text lengths.

View Example on GitHub