# Scorer Utils

Mastra provides utility functions to help extract and process data from scorer run inputs and outputs. These utilities are particularly useful in the `preprocess` step of custom scorers.

## Import

```typescript
import {
  getAssistantMessageFromRunOutput,
  getReasoningFromRunOutput,
  getUserMessageFromRunInput,
  getSystemMessagesFromRunInput,
  getCombinedSystemPrompt,
  extractToolCalls,
  extractInputMessages,
  extractAgentResponseMessages,
} from "@mastra/evals/scorers/utils";
```

## Message Extraction

### getAssistantMessageFromRunOutput

Extracts the text content from the first assistant message in the run output.

```typescript
const scorer = createScorer({
  id: "my-scorer",
  description: "My scorer",
  type: "agent",
})
  .preprocess(({ run }) => {
    const response = getAssistantMessageFromRunOutput(run.output);
    return { response };
  })
  .generateScore(({ results }) => {
    return results.preprocessStepResult?.response ? 1 : 0;
  });
```

**output?:** (`ScorerRunOutputForAgent`): The scorer run output (array of `MastraDBMessage`)

**Returns:** `string | undefined` - The assistant message text, or `undefined` if no assistant message is found.

### getUserMessageFromRunInput

Extracts the text content from the first user message in the run input.

```typescript
.preprocess(({ run }) => {
  const userMessage = getUserMessageFromRunInput(run.input);
  return { userMessage };
})
```

**input?:** (`ScorerRunInputForAgent`): The scorer run input containing input messages

**Returns:** `string | undefined` - The user message text, or `undefined` if no user message is found.

### extractInputMessages

Extracts text content from all input messages as an array.

```typescript
.preprocess(({ run }) => {
  const allUserMessages = extractInputMessages(run.input);
  return { conversationHistory: allUserMessages.join("\n") };
})
```

**Returns:** `string[]` - Array of text strings from each input message.

### extractAgentResponseMessages

Extracts text content from all assistant response messages as an array.

```typescript
.preprocess(({ run }) => {
  const allResponses = extractAgentResponseMessages(run.output);
  return { allResponses };
})
```

**Returns:** `string[]` - Array of text strings from each assistant message.

## Reasoning Extraction

### getReasoningFromRunOutput

Extracts reasoning text from the run output. This is particularly useful when evaluating responses from reasoning models like `deepseek-reasoner` that produce chain-of-thought reasoning.

Reasoning can be stored in two places:

1. `content.reasoning` - a string field on the message content
2. `content.parts` - as parts with `type: 'reasoning'` containing `details`
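For illustration, here is a sketch of what those two shapes might look like on an assistant message. Only the `content.reasoning` and `content.parts` paths come from the list above; the surrounding fields and the structure of the `details` entries are assumptions.

```typescript
// Illustrative only: reasoning stored as a string field on the content.
const viaReasoningField = {
  role: "assistant",
  content: {
    reasoning: "First, identify the user's intent. Then check the forecast data.",
    parts: [{ type: "text", text: "Here is the answer." }],
  },
};

// Illustrative only: reasoning stored as parts with type 'reasoning'.
// The shape of each `details` entry is an assumption.
const viaReasoningParts = {
  role: "assistant",
  content: {
    parts: [
      {
        type: "reasoning",
        details: [{ type: "text", text: "First, identify the user's intent." }],
      },
      { type: "text", text: "Here is the answer." },
    ],
  },
};
```

The scorer below extracts whichever form is present and scores the response by whether reasoning was produced and how detailed it is: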
```typescript
import {
  getReasoningFromRunOutput,
  getAssistantMessageFromRunOutput,
} from "@mastra/evals/scorers/utils";

const reasoningQualityScorer = createScorer({
  id: "reasoning-quality",
  name: "Reasoning Quality",
  description: "Evaluates the quality of model reasoning",
  type: "agent",
})
  .preprocess(({ run }) => {
    const reasoning = getReasoningFromRunOutput(run.output);
    const response = getAssistantMessageFromRunOutput(run.output);
    return { reasoning, response };
  })
  .analyze(({ results }) => {
    const { reasoning } = results.preprocessStepResult || {};
    return {
      hasReasoning: !!reasoning,
      reasoningLength: reasoning?.length || 0,
      hasStepByStep: reasoning?.includes("step") || false,
    };
  })
  .generateScore(({ results }) => {
    const { hasReasoning, reasoningLength } = results.analyzeStepResult || {};
    if (!hasReasoning) return 0;
    // Score based on reasoning length (normalized to 0-1)
    return Math.min(reasoningLength / 500, 1);
  })
  .generateReason(({ results, score }) => {
    const { hasReasoning, reasoningLength } = results.analyzeStepResult || {};
    if (!hasReasoning) {
      return "No reasoning was provided by the model.";
    }
    return `Model provided ${reasoningLength} characters of reasoning. Score: ${score}`;
  });
```

**output?:** (`ScorerRunOutputForAgent`): The scorer run output (array of `MastraDBMessage`)

**Returns:** `string | undefined` - The reasoning text, or `undefined` if no reasoning is present.

## System Message Extraction

### getSystemMessagesFromRunInput

Extracts all system messages from the run input, including both standard system messages and tagged system messages (specialized prompts like memory instructions).

```typescript
.preprocess(({ run }) => {
  const systemMessages = getSystemMessagesFromRunInput(run.input);
  return {
    systemPromptCount: systemMessages.length,
    systemPrompts: systemMessages,
  };
})
```

**Returns:** `string[]` - Array of system message strings.

### getCombinedSystemPrompt

Combines all system messages into a single prompt string, joined with double newlines.

```typescript
.preprocess(({ run }) => {
  const fullSystemPrompt = getCombinedSystemPrompt(run.input);
  return { fullSystemPrompt };
})
```

**Returns:** `string` - Combined system prompt string.

## Tool Call Extraction

### extractToolCalls

Extracts information about all tool calls from the run output, including tool names, call IDs, and their positions in the message array.

```typescript
const toolUsageScorer = createScorer({
  id: "tool-usage",
  description: "Evaluates tool usage patterns",
  type: "agent",
})
  .preprocess(({ run }) => {
    const { tools, toolCallInfos } = extractToolCalls(run.output);
    return {
      toolsUsed: tools,
      toolCount: tools.length,
      toolDetails: toolCallInfos,
    };
  })
  .generateScore(({ results }) => {
    const { toolCount } = results.preprocessStepResult || {};
    // Score based on appropriate tool usage
    return toolCount > 0 ? 1 : 0;
  });
```

**Returns:**

```typescript
{
  tools: string[];               // Array of tool names
  toolCallInfos: ToolCallInfo[]; // Detailed tool call information
}
```

Where `ToolCallInfo` is:

```typescript
type ToolCallInfo = {
  toolName: string;        // Name of the tool
  toolCallId: string;      // Unique call identifier
  messageIndex: number;    // Index in the output array
  invocationIndex: number; // Index within message's tool invocations
};
```
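Beyond counting calls, the `toolCallInfos` details support more targeted checks, for example verifying that a specific tool was actually invoked. A minimal sketch of an alternative `generateScore` step, building on the `preprocess` step above (the `weatherTool` name and the scoring thresholds are hypothetical):

```typescript
.generateScore(({ results }) => {
  const { toolDetails = [] } = results.preprocessStepResult || {};

  // Require at least one call to the hypothetical weatherTool.
  const weatherCalls = toolDetails.filter(
    (info) => info.toolName === "weatherTool",
  );
  if (weatherCalls.length === 0) return 0;

  // Give full credit when the tool was called in the first output message,
  // partial credit otherwise.
  return weatherCalls[0].messageIndex === 0 ? 1 : 0.8;
})
```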
## Test Utilities

These utilities help create test data for scorer development.

### createTestMessage

Creates a `MastraDBMessage` object for testing purposes.

```typescript
import { createTestMessage } from "@mastra/evals/scorers/utils";

const userMessage = createTestMessage({
  content: "What is the weather?",
  role: "user",
});

const assistantMessage = createTestMessage({
  content: "The weather is sunny.",
  role: "assistant",
  toolInvocations: [
    {
      toolCallId: "call-1",
      toolName: "weatherTool",
      args: { location: "London" },
      result: { temp: 20 },
      state: "result",
    },
  ],
});
```

### createAgentTestRun

Creates a complete test run object for testing scorers.

```typescript
import { createAgentTestRun, createTestMessage } from "@mastra/evals/scorers/utils";

const testRun = createAgentTestRun({
  inputMessages: [createTestMessage({ content: "Hello", role: "user" })],
  output: [createTestMessage({ content: "Hi there!", role: "assistant" })],
});

// Run your scorer with the test data
const result = await myScorer.run({
  input: testRun.input,
  output: testRun.output,
});
```

## Complete Example

Here's a complete example showing how to use multiple utilities together:

```typescript
import { createScorer } from "@mastra/core/evals";
import {
  getAssistantMessageFromRunOutput,
  getReasoningFromRunOutput,
  getUserMessageFromRunInput,
  getCombinedSystemPrompt,
  extractToolCalls,
} from "@mastra/evals/scorers/utils";

const comprehensiveScorer = createScorer({
  id: "comprehensive-analysis",
  name: "Comprehensive Analysis",
  description: "Analyzes all aspects of an agent response",
  type: "agent",
})
  .preprocess(({ run }) => {
    // Extract all relevant data
    const userMessage = getUserMessageFromRunInput(run.input);
    const response = getAssistantMessageFromRunOutput(run.output);
    const reasoning = getReasoningFromRunOutput(run.output);
    const systemPrompt = getCombinedSystemPrompt(run.input);
    const { tools, toolCallInfos } = extractToolCalls(run.output);

    return {
      userMessage,
      response,
      reasoning,
      systemPrompt,
      toolsUsed: tools,
      toolCount: tools.length,
    };
  })
  .generateScore(({ results }) => {
    const { response, reasoning, toolCount } = results.preprocessStepResult || {};

    let score = 0;
    if (response && response.length > 0) score += 0.4;
    if (reasoning) score += 0.3;
    if (toolCount > 0) score += 0.3;

    return score;
  })
  .generateReason(({ results, score }) => {
    const { response, reasoning, toolCount } = results.preprocessStepResult || {};

    const parts = [];
    if (response) parts.push("provided a response");
    if (reasoning) parts.push("included reasoning");
    if (toolCount > 0) parts.push(`used ${toolCount} tool(s)`);

    return `Score: ${score}. The agent ${parts.join(", ")}.`;
  });
```
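To try the scorer without a live agent run, you can exercise it with the test utilities from this page. A sketch, continuing from the `comprehensiveScorer` defined above (the message contents and the `weatherTool` invocation are illustrative):

```typescript
import { createAgentTestRun, createTestMessage } from "@mastra/evals/scorers/utils";

// Build a synthetic run that exercises the response and tool-call branches.
const testRun = createAgentTestRun({
  inputMessages: [
    createTestMessage({ content: "What's the weather in London?", role: "user" }),
  ],
  output: [
    createTestMessage({
      content: "It's currently 20°C and sunny in London.",
      role: "assistant",
      toolInvocations: [
        {
          toolCallId: "call-1",
          toolName: "weatherTool",
          args: { location: "London" },
          result: { temp: 20 },
          state: "result",
        },
      ],
    }),
  ],
});

const result = await comprehensiveScorer.run({
  input: testRun.input,
  output: testRun.output,
});

// Expected around 0.7 (response + tool call, no reasoning), assuming
// extractToolCalls picks up the toolInvocations defined above.
console.log(result.score);
```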