ContentSimilarityMetric

The ContentSimilarityMetric class measures the textual similarity between two strings, providing a score that indicates how closely they match. It supports configurable options for case sensitivity and whitespace handling.

Basic Usage

import { ContentSimilarityMetric } from "@mastra/evals/nlp";

const metric = new ContentSimilarityMetric({
  ignoreCase: true,
  ignoreWhitespace: true
});

const result = await metric.measure(
  "Hello, world!",
  "hello world"
);

console.log(result.score); // Similarity score from 0-1
console.log(result.info); // Detailed similarity metrics

Constructor Parameters

options?: ContentSimilarityOptions = { ignoreCase: true, ignoreWhitespace: true }
Configuration options for similarity comparison

ContentSimilarityOptions

ignoreCase?: boolean = true
Whether to ignore case differences when comparing strings

ignoreWhitespace?: boolean = true
Whether to normalize whitespace when comparing strings

measure() Parameters

input: string
The reference text to compare against

output: string
The text to evaluate for similarity

Returns

score: number
Similarity score (0-1) where 1 indicates perfect similarity

info: object
Detailed similarity metrics, containing:

similarity: number
Raw similarity score between the two texts
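
The parameter and return tables above can be summarized as TypeScript shapes. This is a minimal sketch, not the library's actual declarations: ContentSimilarityOptions is the name used in this reference, while ContentSimilarityResult and resolveOptions are hypothetical names introduced here for illustration.

```typescript
// Options as documented above; both fields are optional and default to true.
interface ContentSimilarityOptions {
  ignoreCase?: boolean;
  ignoreWhitespace?: boolean;
}

// Hypothetical name for the value returned by measure().
interface ContentSimilarityResult {
  score: number; // 0-1, where 1 indicates perfect similarity
  info: { similarity: number }; // raw similarity between the two texts
}

// Hypothetical helper: fills in the documented defaults for omitted options.
function resolveOptions(
  options: ContentSimilarityOptions = {}
): Required<ContentSimilarityOptions> {
  return {
    ignoreCase: options.ignoreCase ?? true,
    ignoreWhitespace: options.ignoreWhitespace ?? true,
  };
}
```

Calling resolveOptions({ ignoreCase: false }) in this sketch yields a case-sensitive configuration that still normalizes whitespace.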

Scoring Details

The metric evaluates textual similarity through character-level matching and configurable text normalization.

Scoring Process

  1. Normalizes text:

    • Case normalization (if ignoreCase: true)
    • Whitespace normalization (if ignoreWhitespace: true)
  2. Compares the processed strings using the string-similarity algorithm:

    • Analyzes character sequences
    • Aligns word boundaries
    • Considers relative positions
    • Accounts for length differences
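
The two steps above can be sketched in a few lines. This is an illustrative approximation under stated assumptions, not the metric's confirmed implementation: the helper names (normalize, bigramCounts, diceSimilarity) are hypothetical, whitespace normalization is assumed to mean collapsing runs of whitespace and trimming, and a Sorensen-Dice coefficient over character bigrams is swapped in as a stand-in for the string-similarity comparison.

```typescript
// Step 1: optional case and whitespace normalization (assumed behavior).
function normalize(text: string, ignoreCase = true, ignoreWhitespace = true): string {
  let out = text;
  if (ignoreCase) out = out.toLowerCase();
  // Collapse runs of whitespace to a single space and trim both ends.
  if (ignoreWhitespace) out = out.replace(/\s+/g, " ").trim();
  return out;
}

// Counts each adjacent character pair (bigram) in a string.
function bigramCounts(text: string): Record<string, number> {
  const counts: Record<string, number> = {};
  for (let i = 0; i < text.length - 1; i++) {
    const pair = text.slice(i, i + 2);
    counts[pair] = (counts[pair] ?? 0) + 1;
  }
  return counts;
}

// Step 2 (assumed stand-in): Dice coefficient = 2 * shared bigrams / total bigrams.
function diceSimilarity(a: string, b: string): number {
  if (a === b) return 1;
  if (a.length < 2 || b.length < 2) return 0;
  const countsA = bigramCounts(a);
  const countsB = bigramCounts(b);
  let shared = 0;
  for (const pair of Object.keys(countsB)) {
    shared += Math.min(countsA[pair] ?? 0, countsB[pair] ?? 0);
  }
  return (2 * shared) / (a.length - 1 + (b.length - 1));
}
```

In this sketch, strings that differ only in case or spacing normalize to the same text when both options are enabled, so they compare as identical.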

Final score: similarity_value * scale
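
As a worked instance of the formula above, the sketch below multiplies a raw similarity by a scale factor. The function name finalScore and its scale parameter are hypothetical, and the clamping to the 0-1 range is a defensive assumption rather than documented behavior.

```typescript
// Hypothetical helper: applies "similarity_value * scale".
function finalScore(similarity: number, scale = 1): number {
  // Assumption: keep the raw similarity inside 0..1 before scaling.
  const clamped = Math.min(1, Math.max(0, similarity));
  return clamped * scale;
}
```

With the default scale of 1 the score equals the raw similarity; a scale of 10 would map a similarity of 0.5 to 5.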

Score interpretation (0 to scale, default 0-1)

  • 1.0: Perfect match - identical texts
  • 0.7-0.9: High similarity - mostly matching content
  • 0.4-0.6: Moderate similarity - partial matches
  • 0.1-0.3: Low similarity - few matching patterns
  • 0.0: No similarity - completely different texts
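
The bands above can be turned into a small lookup for reporting. This is a sketch only: interpretScore and SimilarityBand are hypothetical names, and because the listed ranges leave gaps (for example between 0.6 and 0.7), the exact cut-offs used here are an assumption chosen to cover every value.

```typescript
type SimilarityBand = "perfect" | "high" | "moderate" | "low" | "none";

// Hypothetical helper: maps a score to one of the bands listed above.
function interpretScore(score: number, scale = 1): SimilarityBand {
  const s = score / scale; // map back to the 0-1 range
  if (s >= 1) return "perfect";
  if (s >= 0.7) return "high"; // assumed cut-off
  if (s >= 0.4) return "moderate"; // assumed cut-off
  if (s > 0) return "low";
  return "none";
}
```

Passing the scale lets the same helper classify scores produced with a non-default scale.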

Example with Different Options

import { ContentSimilarityMetric } from "@mastra/evals/nlp";

// Case-sensitive comparison
const caseSensitiveMetric = new ContentSimilarityMetric({
  ignoreCase: false,
  ignoreWhitespace: true
});

const result1 = await caseSensitiveMetric.measure(
  "Hello World",
  "hello world"
); // Lower score due to case difference

// Example output:
// {
//   score: 0.75,
//   info: { similarity: 0.75 }
// }

// Strict whitespace comparison
const strictWhitespaceMetric = new ContentSimilarityMetric({
  ignoreCase: true,
  ignoreWhitespace: false
});

const result2 = await strictWhitespaceMetric.measure(
  "Hello   World", // extra spaces between the words
  "Hello World"
); // Lower score due to whitespace difference

// Example output:
// {
//   score: 0.85,
//   info: { similarity: 0.85 }
// }