Agent.stream()
The `.stream()` method enables real-time streaming of responses from an agent. This method accepts messages and optional streaming options.
Usage example
const stream = await agent.stream("message for agent");
Parameters
messages:
string | string[] | CoreMessage[] | AiMessageType[] | UIMessageWithMetadata[]
The messages to send to the agent. Can be a single string, array of strings, or structured message objects.
options?:
AgentStreamOptions<OUTPUT, EXPERIMENTAL_OUTPUT>
Optional configuration for the streaming process.
Options parameters
abortSignal?:
AbortSignal
Signal object that allows you to abort the agent's execution. When the signal is aborted, all ongoing operations will be terminated.
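For example, a stream can be cancelled with a standard `AbortController` (the timeout and prompt here are illustrative):
const controller = new AbortController();

// Illustrative: abort the stream if it runs longer than 10 seconds
setTimeout(() => controller.abort(), 10_000);

const stream = await agent.stream("Summarize this long report", {
  abortSignal: controller.signal,
});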
context?:
CoreMessage[]
Additional context messages to provide to the agent.
experimental_output?:
Zod schema | JsonSchema7
Enables structured output generation alongside text generation and tool calls. The model will generate responses that conform to the provided schema.
instructions?:
string
Custom instructions that override the agent's default instructions for this specific generation. Useful for dynamically modifying agent behavior without creating a new agent instance.
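For example, a one-off override (the instruction text is illustrative):
const stream = await agent.stream("Outline the migration plan", {
  instructions: "Respond formally and keep answers under 100 words.",
});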
output?:
Zod schema | JsonSchema7
Defines the expected structure of the output. Can be a JSON Schema object or a Zod schema.
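A minimal sketch with a Zod schema (the schema and prompt are illustrative; `experimental_output` accepts a schema the same way):
import { z } from "zod";

// Illustrative schema describing the expected response shape
const recipeSchema = z.object({
  name: z.string(),
  ingredients: z.array(z.string()),
});

const stream = await agent.stream("Give me a pancake recipe", {
  output: recipeSchema,
});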
memory?:
object
Configuration for memory. This is the preferred way to manage memory; see the example after these fields.
thread:
string | { id: string; metadata?: Record<string, any>, title?: string }
The conversation thread, as a string ID or an object with an `id` and optional `metadata` and `title`.
resource:
string
Identifier for the user or resource associated with the thread.
options?:
MemoryConfig
Configuration for memory behavior, like message history and semantic recall.
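Putting these fields together (the IDs and values are illustrative):
const stream = await agent.stream("What did we discuss earlier?", {
  memory: {
    thread: { id: "thread-123", metadata: { topic: "support" } },
    resource: "user-456",
    options: { lastMessages: 10 },
  },
});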
maxSteps?:
number
= 5
Maximum number of execution steps allowed.
maxRetries?:
number
= 2
Maximum number of retries. Set to 0 to disable retries.
memoryOptions?:
MemoryConfig
**Deprecated.** Use `memory.options` instead. Configuration options for memory management.
lastMessages?:
number | false
Number of recent messages to include in context, or false to disable.
semanticRecall?:
boolean | { topK: number; messageRange: number | { before: number; after: number }; scope?: 'thread' | 'resource' }
Enable semantic recall to find relevant past messages. Can be a boolean or detailed configuration.
workingMemory?:
WorkingMemory
Configuration for working memory functionality.
threads?:
{ generateTitle?: boolean | { model: DynamicArgument<MastraLanguageModel>; instructions?: DynamicArgument<string> } }
Thread-specific configuration, including automatic title generation.
onFinish?:
StreamTextOnFinishCallback<any> | StreamObjectOnFinishCallback<OUTPUT>
Callback function called when streaming completes. Receives the final result.
onStepFinish?:
StreamTextOnStepFinishCallback<any> | never
Callback function called after each execution step. Receives step details as a JSON string. Unavailable for structured output.
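A sketch using both callbacks (the logging is illustrative):
const stream = await agent.stream("message for agent", {
  onStepFinish: (step) => {
    // Step details arrive as a JSON string
    console.log("step finished:", step);
  },
  onFinish: (result) => {
    console.log("stream complete:", result);
  },
});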
resourceId?:
string
**Deprecated.** Use `memory.resource` instead. Identifier for the user or resource interacting with the agent. Must be provided if threadId is provided.
telemetry?:
TelemetrySettings
Settings for telemetry collection during streaming; see the example after these fields.
isEnabled?:
boolean
Enable or disable telemetry. Disabled by default while experimental.
recordInputs?:
boolean
Enable or disable input recording. Enabled by default. You might want to disable input recording to avoid recording sensitive information.
recordOutputs?:
boolean
Enable or disable output recording. Enabled by default. You might want to disable output recording to avoid recording sensitive information.
functionId?:
string
Identifier for this function. Used to group telemetry data by function.
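For example (the function ID is illustrative):
const stream = await agent.stream("message for agent", {
  telemetry: {
    isEnabled: true,
    functionId: "support-chat",
    recordInputs: false, // avoid recording sensitive user input
  },
});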
temperature?:
number
Controls randomness in the model's output. Higher values (e.g., 0.8) make the output more random, lower values (e.g., 0.2) make it more focused and deterministic.
threadId?:
string
**Deprecated.** Use `memory.thread` instead. Identifier for the conversation thread. Allows for maintaining context across multiple interactions. Must be provided if resourceId is provided.
toolChoice?:
'auto' | 'none' | 'required' | { type: 'tool'; toolName: string }
= 'auto'
Controls how the agent uses tools during streaming; see the example after the value list.
'auto':
string
Let the model decide whether to use tools (default).
'none':
string
Do not use any tools.
'required':
string
Require the model to use at least one tool.
{ type: 'tool'; toolName: string }:
object
Require the model to use a specific tool by name.
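For example, requiring a specific tool (the tool name is hypothetical and would need to be registered on the agent):
const stream = await agent.stream("What's the weather in Paris?", {
  toolChoice: { type: "tool", toolName: "weatherTool" },
});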
toolsets?:
ToolsetsInput
Additional toolsets to make available to the agent during streaming.
clientTools?:
ToolsInput
Tools that are executed on the client side of the request. These tools do not have execute functions in their definitions.
savePerStep?:
boolean
= false
Save messages incrementally after each stream step completes.
providerOptions?:
Record<string, Record<string, JSONValue>>
Additional provider-specific options that are passed through to the underlying LLM provider. The structure is `{ providerName: { optionKey: value } }`. For example: `{ openai: { reasoningEffort: 'high' }, anthropic: { maxTokens: 1000 } }`. See the example after the provider entries below.
openai?:
Record<string, JSONValue>
OpenAI-specific options. Example: `{ reasoningEffort: 'high' }`
anthropic?:
Record<string, JSONValue>
Anthropic-specific options. Example: `{ maxTokens: 1000 }`
google?:
Record<string, JSONValue>
Google-specific options. Example: `{ safetySettings: [...] }`
[providerName]?:
Record<string, JSONValue>
Other provider-specific options. The key is the provider name and the value is a record of provider-specific options.
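For example, passing the OpenAI option shown above:
const stream = await agent.stream("Work through this step by step", {
  providerOptions: {
    openai: { reasoningEffort: "high" },
  },
});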
runId?:
string
Unique ID for this generation run. Useful for tracking and debugging purposes.
runtimeContext?:
RuntimeContext
Runtime context for dependency injection and contextual information.
maxTokens?:
number
Maximum number of tokens to generate.
topP?:
number
Nucleus sampling. This is a number between 0 and 1. It is recommended to set either `temperature` or `topP`, but not both.
topK?:
number
Only sample from the top K options for each subsequent token. Used to remove 'long tail' low probability responses.
presencePenalty?:
number
Presence penalty setting. Affects the model's likelihood of repeating information that is already in the prompt. A number between -1 (increase repetition) and 1 (maximum penalty, decrease repetition).
frequencyPenalty?:
number
Frequency penalty setting. Affects the model's likelihood of repeatedly using the same words or phrases. A number between -1 (increase repetition) and 1 (maximum penalty, decrease repetition).
stopSequences?:
string[]
Stop sequences. If set, the model will stop generating text when one of the stop sequences is generated.
seed?:
number
The seed (integer) to use for random sampling. If set and supported by the model, calls will generate deterministic results.
headers?:
Record<string, string | undefined>
Additional HTTP headers to be sent with the request. Only applicable for HTTP-based providers.
Returns
textStream?:
AsyncGenerator<string>
Async generator that yields text chunks as they become available.
fullStream?:
Promise<ReadableStream>
Promise that resolves to a ReadableStream for the complete response.
text?:
Promise<string>
Promise that resolves to the complete text response.
usage?:
Promise<{ totalTokens: number; promptTokens: number; completionTokens: number }>
Promise that resolves to token usage information.
finishReason?:
Promise<string>
Promise that resolves to the reason why the stream finished.
toolCalls?:
Promise<Array<ToolCall>>
Promise that resolves to the tool calls made during the streaming process.
toolName:
string
The name of the tool invoked.
args:
any
The arguments passed to the tool.
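A sketch of consuming the returned stream (the prompt is illustrative):
const stream = await agent.stream("Tell me a story");

// Print text chunks as they arrive
for await (const chunk of stream.textStream) {
  process.stdout.write(chunk);
}

// Aggregate values resolve once streaming finishes
console.log(await stream.usage);
console.log(await stream.finishReason);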
Extended usage example
const stream = await agent.stream("message for agent", {
temperature: 0.7,
maxSteps: 3,
memory: {
thread: "user-123",
resource: "test-app"
},
toolChoice: "auto"
});