Custom Native JavaScript Scorer

This example shows how to create a custom evaluation scorer using JavaScript logic. The scorer accepts a query and a response, and returns a score and an analyzeStepResult object containing the total and matched words.

Installation


npm install @mastra/evals

For complete API documentation and configuration options, see createScorer.

Create a custom scorer

A custom scorer in Mastra can use native JavaScript methods to evaluate conditions.

See createScorer for the full API and configuration options.

src/mastra/evals/word-inclusion.ts


import { createScorer } from "@mastra/core/scores";
 
const wordInclusionScorer = createScorer({
  name: 'Word Inclusion',
  description: 'Check if the output contains any of the words in the input',
  analyze: async ({ input, output }) => {
    const tokenize = (text: string) => text.toLowerCase().match(/\b\w+\b/g) || [];
    const inputText = input[0].query.toLowerCase();
    const referenceWords = [...new Set(tokenize(inputText))];
    const outputText = output.text.toLowerCase();
    const matchedWords = referenceWords.filter(word => outputText.includes(word));
    const totalWords = referenceWords.length;
    const score = totalWords > 0 ? matchedWords.length / totalWords : 0;
    return {
      score,
      result: {
        totalWords,
        matchedWords: matchedWords.length,
      },
    };
  },
});

Note: The analyze function should return { score, result: { ... } }. When you call .run(), the returned object will have an analyzeStepResult field containing your result object.

High custom example

In this example, the response contains all the words listed in the input query. The scorer returns a high score indicating complete word inclusion.

src/example-high-word-inclusion.ts


const query = 'apple, banana, orange';
const response = 'My favorite fruits are: apple, banana, and orange.';
const result = await wordInclusionScorer.run({
  input: [{ query }],
  output: { text: response },
});
console.log('Score:', result.score);
console.log('AnalyzeStepResult:', result.analyzeStepResult);

High custom output


{
  score: 1,
  analyzeStepResult: {
    totalWords: 3,
    matchedWords: 3
  }
}

Partial custom example

In this example, the response includes some but not all of the words from the input query. The scorer returns a partial score reflecting this incomplete word coverage.

src/example-partial-word-inclusion.ts


const query = 'programming, python, javascript, java';
const response = 'I love programming with python for data science projects.';
const result = await wordInclusionScorer.run({
  input: [{ query }],
  output: { text: response },
});
console.log('Score:', result.score);
console.log('AnalyzeStepResult:', result.analyzeStepResult);

Partial custom output


{
  score: 0.5,
  analyzeStepResult: {
    totalWords: 4,
    matchedWords: 2
  }
}

Low custom example

In this example, the response does not contain any of the words from the input query. The scorer returns a low score indicating no word inclusion.

src/example-low-word-inclusion.ts


const query = 'soccer, basketball, tennis';
const response = 'I enjoy reading books and watching movies in my free time.';
const result = await wordInclusionScorer.run({
  input: [{ query }],
  output: { text: response },
});
console.log('Score:', result.score);
console.log('AnalyzeStepResult:', result.analyzeStepResult);

Low custom output


{
  score: 0,
  analyzeStepResult: {
    totalWords: 3,
    matchedWords: 0
  }
}

Understanding the results

.run() returns a result in the following shape:


{
  runId: string,
  analyzeStepResult: {
    totalWords: number,
    matchedWords: number
  },
  score: number
}

score

A score between 0 and 1:

1.0: The response includes all words from the input.
0.5–0.9: The response includes some but not all words.
0.0: None of the input words appear in the response.

runId

The unique identifier for this scorer run.

analyzeStepResult

Object with word inclusion statistics:

totalWords: The number of unique words found in the input.
matchedWords: The count of those words that also appear in the response.

View Example on GitHub