createLLMScorer
The createLLMScorer() function lets you define custom scorers that use a language model (LLM) as a judge for evaluation. LLM scorers are ideal when evaluation requires model judgment, such as answer relevancy, faithfulness, or other custom prompt-based metrics. They integrate seamlessly with the Mastra scoring framework and can be used anywhere built-in scorers are used.
For a usage example, see the Custom LLM Judge Examples.
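As a quick orientation, a minimal scorer might look like the sketch below. The model choice, prompts, schema, and scorer name are all illustrative, not a prescribed recipe:

```typescript
import { createLLMScorer } from "@mastra/core/scores";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

// Hypothetical scorer that judges whether an answer stays on topic.
export const onTopicScorer = createLLMScorer({
  name: "On-Topic Checker",
  description: "Judges whether the output addresses the input question",
  judge: {
    model: openai("gpt-4o"),
    instructions: "You are an evaluator that checks topical relevance.",
  },
  analyze: {
    description: "Analyze the output for topical relevance",
    outputSchema: z.object({ onTopic: z.boolean() }),
    createPrompt: ({ run }) => `
      Question: ${JSON.stringify(run.input)}
      Answer: ${JSON.stringify(run.output)}
      Does the answer address the question?
    `,
  },
  calculateScore: ({ run }) => (run.analyzeStepResult.onTopic ? 1 : 0),
});
```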
createLLMScorer Options
name: The name of the scorer.
description: A description of what the scorer evaluates.
judge: The default LLM judge used by all steps, specified as a model plus instructions. See Judge Object below.
extract: Optional step that extracts relevant content before analysis. See Extract Object below.
analyze: Step that produces the structured analysis used for scoring. See Analyze Object below.
reason: Optional step that generates a human-readable explanation of the score. See Reason Object below.
calculateScore: Function that converts the analysis into a numerical score. See Calculate Score Function below.
This function returns an instance of the MastraScorer class. See the MastraScorer reference for details on the .run() method and its input/output.
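A run of the scorer sketched above might look like this (the input and output shapes are illustrative):

```typescript
// Reusing the hypothetical onTopicScorer from above.
const result = await onTopicScorer.run({
  input: [{ query: "What is the capital of France?" }],
  output: { text: "Paris is the capital of France." },
});

console.log(result.score); // e.g. 1
```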
Judge Object
model: The language model instance used as the judge.
instructions: System instructions that tell the model how to evaluate.
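For example (the model choice is illustrative):

```typescript
import { openai } from "@ai-sdk/openai";

const judge = {
  model: openai("gpt-4o"),
  instructions:
    "You are a strict evaluator. Answer only with the fields requested by the prompt.",
};
```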
Extract Object
description: A description of what this step extracts.
judge: Optional judge override for this step; defaults to the top-level judge.
outputSchema: Zod schema the step's LLM output must conform to.
createPrompt: Function that builds the prompt sent to the judge for this step.
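A hypothetical extract step, assuming a claims-extraction use case:

```typescript
import { z } from "zod";

// Pulls factual claims out of the output before analysis (illustrative).
const extract = {
  description: "Extract the factual claims made in the output",
  outputSchema: z.object({ claims: z.array(z.string()) }),
  createPrompt: ({ run }: { run: any }) =>
    `List every factual claim in the following text as a JSON array of strings:\n${JSON.stringify(run.output)}`,
};
```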
Analyze Object
description: A description of what this step analyzes.
judge: Optional judge override for this step; defaults to the top-level judge.
outputSchema: Zod schema the analysis result must conform to.
createPrompt: Function that builds the analysis prompt sent to the judge.
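A hypothetical analyze step, continuing the claims example above (the schema fields are illustrative):

```typescript
import { z } from "zod";

// Checks each extracted claim for support, assuming the extract step ran first.
const analyze = {
  description: "Check each claim for support in the input",
  outputSchema: z.object({
    supportedClaims: z.number(),
    totalClaims: z.number(),
  }),
  createPrompt: ({ run }: { run: any }) =>
    `Claims: ${JSON.stringify(run.extractStepResult)}\nCount how many are supported by the input: ${JSON.stringify(run.input)}`,
};
```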
Calculate Score Function
The calculateScore function converts the LLM's structured analysis into a numerical score. It receives the results of the previous steps on its run argument, but not the score itself, since that is what it calculates.
input: The input records for the run being scored.
output: The output record being evaluated.
runtimeContext: Runtime context from the agent or step being scored.
extractStepResult: The result of the extract step, if one was defined.
analyzeStepResult: The structured result of the analyze step, matching its outputSchema.
Returns: number
The function must return a numerical score, typically in the 0-1 range where 1 represents the best possible score.
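For example, assuming the illustrative supportedClaims/totalClaims fields from the analyze sketch above:

```typescript
const calculateScore = ({ run }: { run: any }) => {
  const { supportedClaims, totalClaims } = run.analyzeStepResult;
  // Guard against division by zero; an empty analysis scores 0.
  return totalClaims === 0 ? 0 : supportedClaims / totalClaims;
};
```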
Reason Object
description: A description of what the reason step produces.
judge: Optional judge override for this step; defaults to the top-level judge.
createPrompt: Function that builds the prompt asking the judge to explain the score.
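A hypothetical reason step for the same claims example:

```typescript
// Asks the judge to explain the result in plain language (illustrative).
const reason = {
  description: "Explain the faithfulness score",
  createPrompt: ({ run }: { run: any }) =>
    `Given this analysis: ${JSON.stringify(run.analyzeStepResult)}, explain in one sentence why the answer received its score.`,
};
```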
LLM scorers may also include step-specific prompt fields in the return value, such as extractPrompt, analyzePrompt, and reasonPrompt.
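For example, after a run you might inspect the prompt that was sent to the judge (a field is undefined when its step is not configured):

```typescript
const result = await onTopicScorer.run({
  input: [{ query: "What is the capital of France?" }],
  output: { text: "Paris is the capital of France." },
});

// The prompt produced by analyze.createPrompt for this run.
console.log(result.analyzePrompt);
```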