Introducing Cloneable Scorers


Sep 5, 2025


We have a bunch of new off-the-shelf evals (now called scorers). What's awesome about them is that they can easily be customized.

You can look at how we built our off-the-shelf scorers, examine the prompts and logic we use, and adapt them to your specific needs. We've already written all the scaffolding, so you can start from our implementation and customize it. (Or build something completely custom if you prefer.)

Historically, teams have taken one of two approaches to evals. Some used off-the-shelf evals, like answer relevancy or faithfulness, which packed in a lot of built-in prompt engineering but couldn't be modified. Others wrote their own prompts from scratch: they could do whatever they wanted, but it was a lot of work.

Mastra supports both. We also support cloneable scorers, which make it possible to quickly copy an existing scorer into your project and easily tweak and customize it.

Custom scorers

Our scorers are open source. You can see exactly how we extract claims for faithfulness checking, how we detect hallucinations, or how we measure bias. Then adapt those patterns for yourself.

Our Answer Relevancy scorer is about 50 lines of code broken into a few steps: a preprocessing step, an analyze step, a generateScore step, and a generateReason step, backed by two prompts.

For your use case, you might want to modify the prompts: change the description in the analyze step, adjust the uncertaintyWeight parameter, or tweak whatever else you need.
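To make that four-step shape concrete, here's a minimal sketch of the pipeline in plain TypeScript. The step names and the uncertaintyWeight parameter come from the scorer described above, but everything else — the builder-free structure, the keyword-matching stand-in for the LLM judge — is illustrative, not Mastra's actual code or API:

```typescript
// Illustrative sketch of a relevancy-style scorer pipeline.
// A real scorer would call an LLM in the analyze step; this stub
// uses naive keyword matching so the example is self-contained.
type Verdict = { statement: string; relevant: boolean; uncertain: boolean };

// preprocess: break the answer into individual statements (naive sentence split).
function preprocess(answer: string): string[] {
  return answer.split(/(?<=[.!?])\s+/).filter((s) => s.length > 0);
}

// analyze: judge each statement against the question. A single keyword hit
// counts as an uncertain match; multiple hits count as a confident one.
function analyze(question: string, statements: string[]): Verdict[] {
  const keywords = question
    .toLowerCase()
    .split(/\W+/)
    .filter((w) => w.length > 3);
  return statements.map((statement) => {
    const lower = statement.toLowerCase();
    const hits = keywords.filter((k) => lower.includes(k)).length;
    return { statement, relevant: hits > 0, uncertain: hits === 1 };
  });
}

// generateScore: average the verdicts, discounting uncertain ones by
// uncertaintyWeight -- the tunable parameter mentioned above.
function generateScore(verdicts: Verdict[], uncertaintyWeight = 0.3): number {
  if (verdicts.length === 0) return 0;
  const total = verdicts.reduce(
    (sum, v) => (v.relevant ? sum + (v.uncertain ? uncertaintyWeight : 1) : sum),
    0,
  );
  return total / verdicts.length;
}

// generateReason: explain the score in plain language.
function generateReason(score: number, verdicts: Verdict[]): string {
  const relevant = verdicts.filter((v) => v.relevant).length;
  return `${relevant}/${verdicts.length} statements relevant; score ${score.toFixed(2)}`;
}

const question = "What is the capital of France?";
const answer = "The capital of France is Paris. I enjoy croissants.";
const statements = preprocess(answer);
const verdicts = analyze(question, statements);
const score = generateScore(verdicts);
console.log(generateReason(score, verdicts)); // → "1/2 statements relevant; score 0.50"
```

Because each step is a separate function, customizing one — swapping the analyze prompt, changing uncertaintyWeight — doesn't touch the others, which is the same property that makes the cloned scorers easy to tweak.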

Cloneable Scorers: the shadcn-ui approach to Evals

We also shipped cloneable scorers. Now you can just do:

mastra add scorer faithfulness

This copies our faithfulness scorer into your project. It's a better approach than merely importing because the code lands where you can edit it.

And you'll have the complete scoring pipeline, including our detailed prompts, as a starting point.

Looking ahead

You can write a custom scorer from scratch or clone our off-the-shelf scorers as a starting point. The implementation is yours to modify and extend.

Evals don't have to be mysterious. Look at how we do it, understand the patterns, then build what you need.
