ContentSimilarityMetric

The ContentSimilarityMetric class measures the textual similarity between two strings, providing a score that indicates how closely they match. It supports configurable options for case sensitivity and whitespace handling.

Basic Usage

import { ContentSimilarityMetric } from "@mastra/evals/nlp";
 
const metric = new ContentSimilarityMetric({
  ignoreCase: true,
  ignoreWhitespace: true
});
 
const result = await metric.measure(
  "Hello, world!",
  "hello world"
);
 
console.log(result.score); // Similarity score from 0-1
console.log(result.info); // Detailed similarity metrics

Constructor Parameters

options?: ContentSimilarityOptions = { ignoreCase: true, ignoreWhitespace: true }
Configuration options for similarity comparison

ContentSimilarityOptions

ignoreCase?: boolean = true
Whether to ignore case differences when comparing strings

ignoreWhitespace?: boolean = true
Whether to normalize whitespace when comparing strings
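
The effect of these two options can be sketched roughly as follows. This is a minimal illustration, not the library's implementation: `normalizeText` and `NormalizeOptions` are hypothetical names, and treating "normalize whitespace" as collapsing runs of whitespace to a single space and trimming the ends is an assumption.

```typescript
// Hypothetical helper for illustration only; not exported by @mastra/evals.
interface NormalizeOptions {
  ignoreCase?: boolean;
  ignoreWhitespace?: boolean;
}

function normalizeText(text: string, options: NormalizeOptions = {}): string {
  // Both options default to true, mirroring the documented defaults.
  const { ignoreCase = true, ignoreWhitespace = true } = options;
  let result = text;
  if (ignoreCase) {
    // Fold everything to lower case so "Hello" and "hello" compare equal.
    result = result.toLowerCase();
  }
  if (ignoreWhitespace) {
    // Assumption: collapse whitespace runs to one space and trim the ends.
    result = result.replace(/\s+/g, " ").trim();
  }
  return result;
}
```

Under these assumptions, "  Hello   World " and "hello world" normalize to the same string with the default options, while setting ignoreWhitespace: false preserves the original spacing.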

measure() Parameters

input: string
The reference text to compare against

output: string
The text to evaluate for similarity

Returns

score: number
Similarity score (0-1) where 1 indicates perfect similarity

info: object
Detailed similarity metrics

similarity: number
Raw similarity score between the two texts (a field of info)

Scoring Details

The metric evaluates textual similarity through character-level matching and configurable text normalization.

Scoring Process

  1. Normalizes text:

    • Case normalization (if ignoreCase: true)
    • Whitespace normalization (if ignoreWhitespace: true)
  2. Compares the processed strings using a string-similarity algorithm:

    • Analyzes character sequences
    • Aligns word boundaries
    • Considers relative positions
    • Accounts for length differences

Final score: similarity_value * scale
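
To make the comparison step more concrete, here is a minimal sketch of one common character-level technique, the Sørensen-Dice coefficient over character bigrams. It is an assumption that the metric behaves similarly; the algorithm actually used by the library may differ. The names `bigramCounts`, `diceSimilarity`, and `scaledScore` are hypothetical, and the `scale` argument simply mirrors the formula above.

```typescript
// Count every pair of adjacent characters in a string.
function bigramCounts(text: string): Map<string, number> {
  const counts = new Map<string, number>();
  for (let i = 0; i < text.length - 1; i++) {
    const pair = text.slice(i, i + 2);
    counts.set(pair, (counts.get(pair) ?? 0) + 1);
  }
  return counts;
}

// Sørensen-Dice coefficient: 2 * shared bigrams / total bigrams in both strings.
function diceSimilarity(a: string, b: string): number {
  if (a === b) return 1; // identical strings are a perfect match
  if (a.length < 2 || b.length < 2) return 0; // no bigrams to compare
  const left = bigramCounts(a);
  const right = bigramCounts(b);
  let shared = 0;
  for (const [pair, count] of left) {
    // A bigram is shared as many times as it appears in both strings.
    shared += Math.min(count, right.get(pair) ?? 0);
  }
  const total = (a.length - 1) + (b.length - 1);
  return (2 * shared) / total;
}

// Final score as described above: similarity_value * scale.
function scaledScore(a: string, b: string, scale = 1): number {
  return diceSimilarity(a, b) * scale;
}
```

For example, "night" and "nacht" share one bigram ("ht") out of eight in total, so this sketch gives 2 * 1 / 8 = 0.25. Longer shared runs of characters raise the score, and a large length difference lowers it because the denominator grows.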

Score interpretation

(0 to scale, default 0-1)

  • 1.0: Perfect match - identical texts
  • 0.7-0.9: High similarity - mostly matching content
  • 0.4-0.6: Moderate similarity - partial matches
  • 0.1-0.3: Low similarity - few matching patterns
  • 0.0: No similarity - completely different texts
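
When consuming scores in code, the bands above can be turned into labels. The sketch below is one possible mapping, not part of the library: `interpretScore` and `SimilarityBand` are hypothetical names, and because the listed bands leave gaps (for example between 0.6 and 0.7), the cut-offs at 0.7 and 0.4 are an assumption.

```typescript
type SimilarityBand =
  | "perfect match"
  | "high similarity"
  | "moderate similarity"
  | "low similarity"
  | "no similarity";

// Hypothetical helper: maps a score in the range 0..scale to a band label.
function interpretScore(score: number, scale = 1): SimilarityBand {
  const normalized = score / scale; // bring the score back to the 0-1 range
  if (normalized >= 1) return "perfect match";
  if (normalized >= 0.7) return "high similarity"; // assumed lower bound
  if (normalized >= 0.4) return "moderate similarity"; // assumed lower bound
  if (normalized > 0) return "low similarity";
  return "no similarity";
}
```

A caller could pass result.score from measure() to such a helper to decide, for instance, whether an output is close enough to the reference to pass a check.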

Example with Different Options

import { ContentSimilarityMetric } from "@mastra/evals/nlp";
 
// Case-sensitive comparison
const caseSensitiveMetric = new ContentSimilarityMetric({
  ignoreCase: false,
  ignoreWhitespace: true
});
 
const result1 = await caseSensitiveMetric.measure(
  "Hello World",
  "hello world"
); // Lower score due to case difference
 
// Example output:
// {
//   score: 0.75,
//   info: { similarity: 0.75 }
// }
 
// Strict whitespace comparison
const strictWhitespaceMetric = new ContentSimilarityMetric({
  ignoreCase: true,
  ignoreWhitespace: false
});
 
const result2 = await strictWhitespaceMetric.measure(
  "Hello   World",
  "Hello World"
); // Lower score due to whitespace difference
 
// Example output:
// {
//   score: 0.85,
//   info: { similarity: 0.85 }
// }