Custom Native JavaScript Evaluation

This example shows how to create a custom evaluation metric using plain JavaScript logic. The metric accepts a query and a response, and returns a score together with an info object containing the number of unique words in the query and the number of those words found in the response.

Installation

npm install @mastra/evals

Create a custom eval

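A custom eval in Mastra extends the Metric class and implements its measure method. The logic inside measure can rely entirely on native JavaScript string and array methods, as in the word-inclusion check here.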

src/mastra/evals/example-word-inclusion.ts
import { Metric, type MetricResult } from "@mastra/core";

export class WordInclusionMetric extends Metric {
  constructor() {
    super();
  }

  async measure(input: string, output: string): Promise<MetricResult> {
    // Lowercase the text and split it into runs of word characters.
    const tokenize = (text: string) => text.toLowerCase().match(/\b\w+\b/g) || [];

    // Unique words from the input query.
    const referenceWords = [...new Set(tokenize(input))];
    const outputText = output.toLowerCase();

    // An input word matches if it occurs anywhere in the output (substring check).
    const matchedWords = referenceWords.filter((word) => outputText.includes(word));

    const totalWords = referenceWords.length;
    // Fraction of input words found; 0 when the input has no words.
    const score = totalWords > 0 ? matchedWords.length / totalWords : 0;

    return {
      score,
      info: {
        totalWords,
        matchedWords: matchedWords.length
      }
    };
  }
}
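To make the steps inside measure easier to follow, the sketch below reproduces the same logic as a standalone function that also returns the intermediate values. It does not depend on Mastra. The function name traceWordInclusion and the sample strings are assumptions made for illustration, and Array.from is used in place of the spread syntax purely for portability.

```typescript
// Standalone sketch of the logic in WordInclusionMetric.measure.
// traceWordInclusion is a hypothetical helper, not part of Mastra.
function traceWordInclusion(input: string, output: string) {
  // 1. Lowercase the input and split it into runs of word characters.
  const tokens: string[] = input.toLowerCase().match(/\b\w+\b/g) ?? [];

  // 2. Drop duplicates so each word is counted once.
  const referenceWords = Array.from(new Set(tokens));

  // 3. Keep the words that occur anywhere in the lowercased output.
  const outputText = output.toLowerCase();
  const matched = referenceWords.filter((word) => outputText.includes(word));

  // 4. Score is the matched fraction, or 0 when the input has no words.
  const score =
    referenceWords.length > 0 ? matched.length / referenceWords.length : 0;

  return { tokens, referenceWords, matched, score };
}

const trace = traceWordInclusion("Apple, banana, APPLE!", "I ate an apple.");
console.log(trace);
```

Here the input yields three tokens but only two unique words, and one of them appears in the response, so the score is 0.5.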

High custom example

In this example, the response contains all the words listed in the input query. The metric returns a high score indicating complete word inclusion.

src/example-high-word-inclusion.ts
import { WordInclusionMetric } from "./mastra/evals/example-word-inclusion";

const metric = new WordInclusionMetric();

const query = "apple, banana, orange";
const response = "My favorite fruits are: apple, banana, and orange.";

const result = await metric.measure(query, response);

console.log(result);

High custom output

The metric returns a score of 1 because all three unique words from the input are present in the response, demonstrating full coverage.

{ score: 1, info: { totalWords: 3, matchedWords: 3 } }

Partial custom example

In this example, the response includes some but not all of the words from the input query. The metric returns a partial score reflecting this incomplete word coverage.

src/example-partial-word-inclusion.ts
import { WordInclusionMetric } from "./mastra/evals/example-word-inclusion";

const metric = new WordInclusionMetric();

const query = "cats, dogs, rabbits";
const response = "I like dogs and rabbits";

const result = await metric.measure(query, response);

console.log(result);

Partial custom output

The score is 2/3 because the response contains two of the three unique words from the input ("dogs" and "rabbits") but not "cats".

{ score: 0.6666666666666666, info: { totalWords: 3, matchedWords: 2 } }
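The long decimal is simply 2 / 3 as a JavaScript number. If the score is shown to readers, it may be worth rounding it for display. A minimal sketch, where formatScore is a hypothetical helper and two decimal places is an arbitrary choice:

```typescript
// formatScore is a hypothetical display helper, not part of Mastra.
function formatScore(score: number, digits: number = 2): string {
  // toFixed rounds to the requested number of decimals and returns a string.
  return score.toFixed(digits);
}

// matchedWords / totalWords from the partial example above.
const partialScore = 2 / 3;

console.log(partialScore); // prints 0.6666666666666666
console.log(formatScore(partialScore)); // prints 0.67
```

Rounding only at display time keeps the raw score available for comparisons or aggregation.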

Low custom example

In this example, the response does not contain any of the words from the input query. The metric returns a low score indicating no word inclusion.

src/example-low-word-inclusion.ts
import { WordInclusionMetric } from "./mastra/evals/example-word-inclusion";

const metric = new WordInclusionMetric();

const query = "Colombia, Brazil, Panama";
const response = "Let's go to Mexico";

const result = await metric.measure(query, response);

console.log(result);

Low custom output

The score is 0 because none of the unique words from the input appear in the response, indicating no overlap between the texts.

{ score: 0, info: { totalWords: 3, matchedWords: 0 } }

Understanding the results

WordInclusionMetric returns a result in the following shape:

{ score: number, info: { totalWords: number, matchedWords: number } }

Custom score

A score between 0 and 1:

  • 1.0: The response includes all words from the input.
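  • Between 0 and 1: The response includes some but not all words. The value is the fraction of input words matched (for example, 2 of 3 words gives about 0.67).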
  • 0.0: None of the input words appear in the response.

Custom info

An explanation for the score, with details including:

  • totalWords is the number of unique words found in the input.
  • matchedWords is the count of those words that also appear in the lowercased response. Matching is by substring, so a word also counts when it occurs inside a longer word.
  • The score is calculated as matchedWords / totalWords.
  • If no valid words are found in the input, the score defaults to 0.
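Because the metric checks for matches with String.prototype.includes, an input word is counted as matched even when it only occurs inside a longer word in the response. The sketch below contrasts that behavior with a stricter whole-word comparison and also shows the empty-input case. Both function names are hypothetical, and the whole-word variant is one possible alternative rather than part of the original example.

```typescript
// Same tokenizer as the example metric: lowercase, then runs of word characters.
const tokenize = (text: string): string[] =>
  text.toLowerCase().match(/\b\w+\b/g) ?? [];

// Same matching rule as the example metric: substring search in the output.
function substringScore(input: string, output: string): number {
  const words = Array.from(new Set(tokenize(input)));
  const text = output.toLowerCase();
  const matched = words.filter((w) => text.includes(w));
  return words.length > 0 ? matched.length / words.length : 0;
}

// Hypothetical stricter variant: an input word must equal a whole output token.
function wholeWordScore(input: string, output: string): number {
  const words = Array.from(new Set(tokenize(input)));
  const outputWords = new Set(tokenize(output));
  const matched = words.filter((w) => outputWords.has(w));
  return words.length > 0 ? matched.length / words.length : 0;
}

console.log(substringScore("cat", "Please concatenate the files")); // prints 1
console.log(wholeWordScore("cat", "Please concatenate the files")); // prints 0
console.log(substringScore("", "anything")); // prints 0 (no words in the input)
```

Which rule is appropriate depends on the use case: substring matching tolerates plurals and suffixes such as "dogs" for "dog", while whole-word matching avoids accidental hits inside unrelated words.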