Textual Difference Scorer
Use createTextualDifferenceScorer to evaluate the similarity between two text strings by analyzing sequence differences and edit operations.
Installation
npm install @mastra/evals
For complete API documentation and configuration options, see createTextualDifferenceScorer.
No differences example
In this example, the texts are exactly the same. The scorer identifies complete similarity with a perfect score and no detected changes.
import { createTextualDifferenceScorer } from "@mastra/evals/scorers/code";
const scorer = createTextualDifferenceScorer();
const input = 'The quick brown fox jumps over the lazy dog';
const output = 'The quick brown fox jumps over the lazy dog';
const result = await scorer.run({
  input: [{ role: 'user', content: input }],
  output: { role: 'assistant', text: output },
});
console.log('Score:', result.score);
console.log('AnalyzeStepResult:', result.analyzeStepResult);
No differences output
The scorer returns a score of 1, indicating the texts are identical. The analyzeStepResult confirms zero changes and no length difference.
{
  score: 1,
  analyzeStepResult: {
    confidence: 1,
    ratio: 1,
    changes: 0,
    lengthDiff: 0,
  },
}
Minor differences example
In this example, the texts have small variations. The scorer detects these minor differences and returns a moderate similarity score.
import { createTextualDifferenceScorer } from "@mastra/evals/scorers/code";
const scorer = createTextualDifferenceScorer();
const input = 'Hello world! How are you?';
const output = 'Hello there! How is it going?';
const result = await scorer.run({
  input: [{ role: 'user', content: input }],
  output: { role: 'assistant', text: output },
});
console.log('Score:', result.score);
console.log('AnalyzeStepResult:', result.analyzeStepResult);
Minor differences output
The scorer returns a moderate score reflecting the small variations between the texts. The analyzeStepResult reports 5 changes and a normalized length difference of about 0.14.
{
  score: 0.5925925925925926,
  analyzeStepResult: {
    confidence: 0.8620689655172413,
    ratio: 0.5925925925925926,
    changes: 5,
    lengthDiff: 0.13793103448275862
  }
}
Major differences example
In this example, the texts differ significantly. The scorer detects extensive changes and returns a low similarity score.
import { createTextualDifferenceScorer } from "@mastra/evals/scorers/code";
const scorer = createTextualDifferenceScorer();
const input = 'Python is a high-level programming language';
const output = 'JavaScript is used for web development';
const result = await scorer.run({
  input: [{ role: 'user', content: input }],
  output: { role: 'assistant', text: output },
});
console.log('Score:', result.score);
console.log('AnalyzeStepResult:', result.analyzeStepResult);
Major differences output
The scorer returns a low score due to significant differences between the texts. The analyzeStepResult shows 8 changes and a normalized length difference of about 0.14.
{
  score: 0.3170731707317073,
  analyzeStepResult: {
    confidence: 0.8636363636363636,
    ratio: 0.3170731707317073,
    changes: 8,
    lengthDiff: 0.13636363636363635
  }
}
Scorer configuration
You can create a TextualDifferenceScorer instance with default settings. No additional configuration is required.
const scorer = createTextualDifferenceScorer();
See TextualDifferenceScorer for a full list of configuration options.
Understanding the results
.run() returns a result in the following shape:
{
  runId: string,
  analyzeStepResult: {
    confidence: number,
    ratio: number,
    changes: number,
    lengthDiff: number
  },
  score: number
}
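For typed code that consumes this result, the shape above can be written as a TypeScript interface. The following is a minimal sketch: the interface name TextualDifferenceResult and the helper summarize are hypothetical names chosen for illustration, not exports of @mastra/evals, and the runId value is a placeholder.

```typescript
// Hypothetical type mirroring the documented result shape.
// It is NOT exported by @mastra/evals; it only restates the fields listed above.
interface TextualDifferenceResult {
  runId: string;
  analyzeStepResult: {
    confidence: number;
    ratio: number;
    changes: number;
    lengthDiff: number;
  };
  score: number;
}

// Hypothetical helper: formats a result as a one-line summary for logs.
function summarize(result: TextualDifferenceResult): string {
  const { ratio, changes, lengthDiff } = result.analyzeStepResult;
  return (
    `score=${result.score.toFixed(2)} ratio=${ratio.toFixed(2)} ` +
    `changes=${changes} lengthDiff=${lengthDiff.toFixed(2)}`
  );
}

// Metric values copied from the minor differences output; runId is a placeholder.
const sample: TextualDifferenceResult = {
  runId: "example-run",
  analyzeStepResult: {
    confidence: 0.8620689655172413,
    ratio: 0.5925925925925926,
    changes: 5,
    lengthDiff: 0.13793103448275862,
  },
  score: 0.5925925925925926,
};

console.log(summarize(sample));
// prints "score=0.59 ratio=0.59 changes=5 lengthDiff=0.14"
```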
score
A similarity score between 0 and 1, where higher values mean the texts are closer to each other:
- 1.0: Identical texts – no differences detected.
- 0.7–0.9: Minor differences – few changes needed.
- 0.4–0.6: Moderate differences – noticeable changes required.
- 0.1–0.3: Major differences – extensive changes needed.
- 0.0: Completely different texts.
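The bands above can be turned into a small lookup for reports or test assertions. This is a sketch, not part of the library: describeScore and the DifferenceBand type are hypothetical names, and because the listed ranges leave gaps (for example between 0.6 and 0.7), the code assumes each gap belongs to the lower band.

```typescript
// Hypothetical labels for the score bands listed above.
type DifferenceBand =
  | "identical"
  | "minor"
  | "moderate"
  | "major"
  | "completely different";

// Maps a score in [0, 1] to a band.
// Assumption: values that fall between the documented ranges
// (e.g. 0.65 or 0.95) are assigned to the lower band.
function describeScore(score: number): DifferenceBand {
  if (Number.isNaN(score) || score < 0 || score > 1) {
    throw new RangeError(`score must be between 0 and 1, got ${score}`);
  }
  if (score === 1) return "identical";
  if (score >= 0.7) return "minor";
  if (score >= 0.4) return "moderate";
  if (score > 0) return "major";
  return "completely different";
}

console.log(describeScore(0.5925925925925926)); // prints "moderate"
console.log(describeScore(0.3170731707317073)); // prints "major"
```

With these thresholds, the score from the minor differences example lands in the moderate band and the score from the major differences example lands in the major band, which matches how those outputs are described.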
runId
The unique identifier for this scorer run.
analyzeStepResult
Object with difference metrics:
- confidence: Confidence in the comparison, which drops as the length difference between the texts grows (higher is better).
- ratio: Similarity ratio between the texts (0–1).
- changes: Number of edit operations required to match the texts.
- lengthDiff: Normalized difference in text lengths.
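In all three sample outputs, confidence and lengthDiff sum to 1, and the minor differences figures are consistent with normalizing the length difference by the longer text (25 and 29 characters give 4/29). The sketch below reproduces that relationship. It is inferred from the sample output rather than taken from the library source, and lengthMetrics is a hypothetical helper name.

```typescript
// Hypothetical helper reproducing the length-based metrics.
// Assumption (inferred from the sample outputs, not confirmed internals):
//   lengthDiff = |len(a) - len(b)| / max(len(a), len(b))
//   confidence = 1 - lengthDiff
function lengthMetrics(
  a: string,
  b: string
): { lengthDiff: number; confidence: number } {
  const maxLength = Math.max(a.length, b.length);
  // Guard against dividing by zero when both strings are empty.
  const lengthDiff =
    maxLength > 0 ? Math.abs(a.length - b.length) / maxLength : 0;
  return { lengthDiff, confidence: 1 - lengthDiff };
}

// Strings from the minor differences example (25 and 29 characters).
const metrics = lengthMetrics(
  "Hello world! How are you?",
  "Hello there! How is it going?"
);

console.log(metrics.lengthDiff.toFixed(4), metrics.confidence.toFixed(4));
// prints "0.1379 0.8621"
```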