Agent.streamVNext() (Experimental)
⚠️ Experimental Feature: This is a new streaming implementation that will replace the existing `stream()` method once battle-tested. The API may change as we refine the feature based on feedback.
The `.streamVNext()` method streams an agent's responses in real time. It accepts messages and optional streaming options, and is intended to eventually replace the current `stream()` method.
Usage example
await agent.streamVNext("message for agent");
Parameters
messages:
string | string[] | CoreMessage[] | AiMessageType[] | UIMessageWithMetadata[]
The messages to send to the agent. Can be a single string, array of strings, or structured message objects.
options?:
AgentVNextStreamOptions<Output, StructuredOutput>
Optional configuration for the streaming process.
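To make the accepted `messages` forms concrete, here is a minimal sketch. `SimpleMessage` is a local stand-in for the structured message types (it is not imported from Mastra and assumes only the common `role`/`content` shape), and `normalizeMessages` is a hypothetical helper that illustrates how the forms relate to each other. It is not a description of how the agent processes input internally.

```typescript
// Local stand-in for a structured message; NOT the real CoreMessage type.
type SimpleMessage = { role: "system" | "user" | "assistant"; content: string };

// The input forms described above, narrowed to the stand-in type.
type MessagesInput = string | string[] | SimpleMessage[];

// Hypothetical helper: turns any accepted form into a list of structured messages.
// Bare strings are treated as user messages (an assumption made for this sketch).
function normalizeMessages(input: MessagesInput): SimpleMessage[] {
  const items: Array<string | SimpleMessage> =
    typeof input === "string" ? [input] : input;
  return items.map((item): SimpleMessage =>
    typeof item === "string" ? { role: "user", content: item } : item,
  );
}

// A single string.
const fromString = normalizeMessages("What is the weather in Paris?");

// An array of strings.
const fromStrings = normalizeMessages(["Summarize this thread.", "Keep it short."]);

// Structured message objects.
const fromObjects = normalizeMessages([
  { role: "system", content: "Answer in one sentence." },
  { role: "user", content: "What is the weather in Paris?" },
]);
```

Any of the three inputs passed to `normalizeMessages` could be passed directly as the first argument of `streamVNext()`; the helper exists only to show that they describe the same kind of conversation.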
Options parameters
abortSignal?:
AbortSignal
Signal object that allows you to abort the agent's execution. When the signal is aborted, all ongoing operations will be terminated.
context?:
CoreMessage[]
Additional context messages to provide to the agent.
structuredOutput?:
StructuredOutputOptions<S extends ZodTypeAny = ZodTypeAny>
Enables structured output generation with better developer experience. Automatically creates and uses a StructuredOutputProcessor internally.
schema:
z.ZodSchema<S>
Zod schema to validate the output against.
model:
MastraLanguageModel
Model to use for the internal structuring agent.
errorStrategy?:
'strict' | 'warn' | 'fallback'
Strategy when parsing or validation fails. Defaults to 'strict'.
fallbackValue?:
<S extends ZodTypeAny>
Fallback value when errorStrategy is 'fallback'.
instructions?:
string
Custom instructions for the structuring agent.
outputProcessors?:
Processor[]
Overrides the output processors set on the agent. Output processors can modify or validate messages from the agent before they are returned to the user. Each processor must implement `processOutputResult`, `processOutputStream`, or both.
inputProcessors?:
Processor[]
Overrides the input processors set on the agent. Input processors can modify or validate messages before the agent processes them. Each processor must implement the `processInput` function.
experimental_output?:
Zod schema | JsonSchema7
Prefer the `structuredOutput` property where possible. Enables structured output generation alongside text generation and tool calls. The model will generate responses that conform to the provided schema.
instructions?:
string
Custom instructions that override the agent's default instructions for this specific generation. Useful for dynamically modifying agent behavior without creating a new agent instance.
output?:
Zod schema | JsonSchema7
Defines the expected structure of the output. Can be a JSON Schema object or a Zod schema.
memory?:
object
Configuration for memory. This is the preferred way to manage memory.
thread:
string | { id: string; metadata?: Record<string, any>, title?: string }
The conversation thread, as a string ID or an object with an `id` and optional `metadata` and `title`.
resource:
string
Identifier for the user or resource associated with the thread.
options?:
MemoryConfig
Configuration for memory behavior, like message history and semantic recall.
maxSteps?:
number
= 5
Maximum number of execution steps allowed.
maxRetries?:
number
= 2
Maximum number of retries. Set to 0 to disable retries.
memoryOptions?:
MemoryConfig
**Deprecated.** Use `memory.options` instead. Configuration options for memory management.
onFinish?:
StreamTextOnFinishCallback<any> | StreamObjectOnFinishCallback<OUTPUT>
Callback function called when streaming completes. Receives the final result.
onStepFinish?:
StreamTextOnStepFinishCallback<any> | never
Callback function called after each execution step. Receives step details as a JSON string. Unavailable for structured output.
resourceId?:
string
**Deprecated.** Use `memory.resource` instead. Identifier for the user or resource interacting with the agent. Must be provided if `threadId` is provided.
telemetry?:
TelemetrySettings
Settings for telemetry collection during streaming.
isEnabled?:
boolean
Enable or disable telemetry. Disabled by default while experimental.
recordInputs?:
boolean
Enable or disable input recording. Enabled by default. You might want to disable input recording to avoid recording sensitive information.
recordOutputs?:
boolean
Enable or disable output recording. Enabled by default. You might want to disable output recording to avoid recording sensitive information.
functionId?:
string
Identifier for this function. Used to group telemetry data by function.
temperature?:
number
Controls randomness in the model's output. Higher values (e.g., 0.8) make the output more random, lower values (e.g., 0.2) make it more focused and deterministic.
threadId?:
string
**Deprecated.** Use `memory.thread` instead. Identifier for the conversation thread. Allows for maintaining context across multiple interactions. Must be provided if `resourceId` is provided.
toolChoice?:
'auto' | 'none' | 'required' | { type: 'tool'; toolName: string }
= 'auto'
Controls how the agent uses tools during streaming.
'auto':
string
Let the model decide whether to use tools (default).
'none':
string
Do not use any tools.
'required':
string
Require the model to use at least one tool.
{ type: 'tool'; toolName: string }:
object
Require the model to use a specific tool by name.
toolsets?:
ToolsetsInput
Additional toolsets to make available to the agent during streaming.
clientTools?:
ToolsInput
Tools that are executed on the 'client' side of the request. These tools do not have execute functions in the definition.
savePerStep?:
boolean
Save messages incrementally after each stream step completes (default: false).
providerOptions?:
Record<string, Record<string, JSONValue>>
Additional provider-specific options that are passed through to the underlying LLM provider. The structure is `{ providerName: { optionKey: value } }`. For example: `{ openai: { reasoningEffort: 'high' }, anthropic: { maxTokens: 1000 } }`.
openai?:
Record<string, JSONValue>
OpenAI-specific options. Example: `{ reasoningEffort: 'high' }`
anthropic?:
Record<string, JSONValue>
Anthropic-specific options. Example: `{ maxTokens: 1000 }`
google?:
Record<string, JSONValue>
Google-specific options. Example: `{ safetySettings: [...] }`
[providerName]?:
Record<string, JSONValue>
Other provider-specific options. The key is the provider name and the value is a record of provider-specific options.
runId?:
string
Unique ID for this generation run. Useful for tracking and debugging purposes.
runtimeContext?:
RuntimeContext
Runtime context for dependency injection and contextual information.
experimental_generateMessageId?:
IDGenerator
Generate a unique ID for each message.
maxTokens?:
number
Maximum number of tokens to generate.
topP?:
number
Nucleus sampling. This is a number between 0 and 1. It is recommended to set either `temperature` or `topP`, but not both.
topK?:
number
Only sample from the top K options for each subsequent token. Used to remove 'long tail' low probability responses.
presencePenalty?:
number
Presence penalty setting. Affects how likely the model is to repeat information that is already in the prompt. A number between -1 (increase repetition) and 1 (maximum penalty, decrease repetition).
frequencyPenalty?:
number
Frequency penalty setting. Affects how likely the model is to reuse the same words or phrases. A number between -1 (increase repetition) and 1 (maximum penalty, decrease repetition).
stopSequences?:
string[]
Stop sequences. If set, the model will stop generating text when one of the stop sequences is generated.
seed?:
number
The seed (integer) to use for random sampling. If set and supported by the model, calls will generate deterministic results.
headers?:
Record<string, string | undefined>
Additional HTTP headers to be sent with the request. Only applicable for HTTP-based providers.
system?:
string
System message to include in the prompt. Can be used with `prompt` or `messages`.
prompt?:
string
A simple text prompt. You can either use `prompt` or `messages` but not both.
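The deprecated `threadId`, `resourceId`, and `memoryOptions` fields map onto the `memory` object. The sketch below shows that mapping with local stand-in types (`LegacyMemoryFields`, `MemoryOption`, `MemoryConfigLike`) and a hypothetical helper, `toMemoryOption`. None of these names come from Mastra, and `MemoryConfig` is reduced to an opaque record here.

```typescript
// Opaque stand-in for MemoryConfig; the real type is not reproduced here.
type MemoryConfigLike = Record<string, unknown>;

// Deprecated top-level fields, as listed in the options above.
interface LegacyMemoryFields {
  threadId?: string;
  resourceId?: string;
  memoryOptions?: MemoryConfigLike;
}

// Shape of the preferred `memory` option, as documented above.
interface MemoryOption {
  thread: string | { id: string; metadata?: Record<string, unknown>; title?: string };
  resource: string;
  options?: MemoryConfigLike;
}

// Hypothetical helper: builds `memory` from the deprecated fields.
// Returns undefined when neither ID is set, and throws when only one is,
// because threadId and resourceId are documented as required together.
function toMemoryOption(legacy: LegacyMemoryFields): MemoryOption | undefined {
  const { threadId, resourceId, memoryOptions } = legacy;
  if (threadId === undefined && resourceId === undefined) return undefined;
  if (threadId === undefined || resourceId === undefined) {
    throw new Error("threadId and resourceId must be provided together");
  }
  return {
    thread: threadId,
    resource: resourceId,
    // Only include `options` when legacy memoryOptions were supplied.
    ...(memoryOptions ? { options: memoryOptions } : {}),
  };
}

const memory = toMemoryOption({ threadId: "user-123", resourceId: "test-app" });
```

The resulting object has the same shape as the `memory` value in the extended usage example (`thread: "user-123"`, `resource: "test-app"`).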
Returns
stream:
MastraAgentStream<Output extends ZodSchema ? z.infer<Output> : StructuredOutput extends ZodSchema ? z.infer<StructuredOutput> : unknown>
A streaming interface that provides enhanced streaming capabilities for text and structured output.
Extended usage example
import { z } from "zod";
import { openai } from "@ai-sdk/openai";
import { ModerationProcessor, BatchPartsProcessor } from "@mastra/core/processors";

// `agent` is an existing Agent instance defined elsewhere.
await agent.streamVNext("message for agent", {
  temperature: 0.7,
  maxSteps: 3,
  memory: {
    thread: "user-123",
    resource: "test-app",
  },
  toolChoice: "auto",
  // Structured output with better DX
  structuredOutput: {
    schema: z.object({
      sentiment: z.enum(["positive", "negative", "neutral"]),
      confidence: z.number(),
    }),
    model: openai("gpt-4o-mini"),
    errorStrategy: "warn",
  },
  // Output processors for streaming response validation
  outputProcessors: [
    new ModerationProcessor({ model: openai("gpt-4.1-nano") }),
    new BatchPartsProcessor({ maxBatchSize: 3, maxWaitTime: 100 }),
  ],
});
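The `abortSignal` option takes a standard `AbortSignal`, so cancellation can be driven by an `AbortController`. The sketch below uses a hypothetical stand-in, `fakeStream`, in place of `agent.streamVNext()` so that it runs on its own. Only `AbortController` and `AbortSignal` are real platform APIs here; the chunk format and the way the stand-in stops are assumptions made for illustration.

```typescript
// Hypothetical stand-in for a streaming call that honors an abort signal.
// It yields words one at a time and stops once the signal is aborted.
async function* fakeStream(
  words: string[],
  options: { abortSignal?: AbortSignal } = {},
): AsyncGenerator<string> {
  for (const word of words) {
    if (options.abortSignal?.aborted) return; // stop producing after abort
    yield word;
  }
}

// Collects chunks and aborts once `limit` chunks have been received.
async function collectUntil(limit: number): Promise<string[]> {
  const controller = new AbortController();
  const received: string[] = [];
  const stream = fakeStream(["one", "two", "three", "four"], {
    abortSignal: controller.signal,
  });
  for await (const chunk of stream) {
    received.push(chunk);
    // For example, the user pressed a "Stop" button.
    if (received.length >= limit) controller.abort();
  }
  return received;
}

collectUntil(2).then((chunks) => console.log(chunks)); // logs [ 'one', 'two' ]
```

With a real agent, the equivalent wiring is to pass `abortSignal: controller.signal` in the options of `agent.streamVNext()` and call `controller.abort()` when the run should stop.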