Agent.stream()

The .stream() method enables real-time streaming of responses from an agent. It accepts messages and an optional configuration object.

Usage example

await agent.stream("message for agent");
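
To actually receive output, consume the returned `textStream` — a minimal sketch, assuming an existing `agent` instance:

```typescript
const result = await agent.stream("message for agent");

// Print text chunks as they arrive.
for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}
```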

Parameters

messages:

string | string[] | CoreMessage[] | AiMessageType[] | UIMessageWithMetadata[]
The messages to send to the agent. Can be a single string, an array of strings, or an array of structured message objects.

options?:

AgentStreamOptions<OUTPUT, EXPERIMENTAL_OUTPUT>
Optional configuration for the streaming process.

Options parameters

abortSignal?:

AbortSignal
Signal object that allows you to abort the agent's execution. When the signal is aborted, all ongoing operations will be terminated.
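
For example, cancelling an in-flight stream with an `AbortController` (a sketch; the timeout value is illustrative):

```typescript
const controller = new AbortController();

// Abort the agent run if it takes longer than 10 seconds.
setTimeout(() => controller.abort(), 10_000);

await agent.stream("message for agent", {
  abortSignal: controller.signal,
});
```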

context?:

CoreMessage[]
Additional context messages to provide to the agent.

experimental_output?:

Zod schema | JsonSchema7
Enables structured output generation alongside text generation and tool calls. The model will generate responses that conform to the provided schema.

instructions?:

string
Custom instructions that override the agent's default instructions for this specific generation. Useful for dynamically modifying agent behavior without creating a new agent instance.
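
For example, overriding instructions for a single call (a sketch; the instruction text is illustrative):

```typescript
await agent.stream("Summarize the quarterly report", {
  // Applies only to this generation; the agent's default
  // instructions are unchanged.
  instructions: "Respond in exactly three bullet points.",
});
```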

output?:

Zod schema | JsonSchema7
Defines the expected structure of the output. Can be a JSON Schema object or a Zod schema.
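
A minimal structured-output sketch using a Zod schema (the schema shape is illustrative):

```typescript
import { z } from "zod";

const result = await agent.stream("Extract the user's details", {
  // The generated output will conform to this schema.
  output: z.object({
    name: z.string(),
    age: z.number(),
  }),
});
```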

memory?:

object
Configuration for memory. This is the preferred way to manage memory.

thread:

string | { id: string; metadata?: Record<string, any>, title?: string }
The conversation thread, as a string ID or an object with an `id` plus optional `metadata` and `title`.

resource:

string
Identifier for the user or resource associated with the thread.

options?:

MemoryConfig
Configuration for memory behavior, like message history and semantic recall.
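
For example, scoping a stream to a conversation thread and user (a sketch; the IDs and metadata are illustrative):

```typescript
await agent.stream("message for agent", {
  memory: {
    thread: { id: "thread-123", metadata: { topic: "billing" } },
    resource: "user-456",
  },
});
```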

maxSteps?:

number
= 5
Maximum number of execution steps allowed.

maxRetries?:

number
= 2
Maximum number of retries. Set to 0 to disable retries.

memoryOptions?:

MemoryConfig
**Deprecated.** Use `memory.options` instead. Configuration options for memory management.

lastMessages?:

number | false
Number of recent messages to include in context, or false to disable.

semanticRecall?:

boolean | { topK: number; messageRange: number | { before: number; after: number }; scope?: 'thread' | 'resource' }
Enable semantic recall to find relevant past messages. Can be a boolean or detailed configuration.

workingMemory?:

WorkingMemory
Configuration for working memory functionality.

threads?:

{ generateTitle?: boolean | { model: DynamicArgument<MastraLanguageModel>; instructions?: DynamicArgument<string> } }
Thread-specific configuration, including automatic title generation.
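
These fields are passed via the non-deprecated `memory.options` path, as in this sketch (values are illustrative):

```typescript
await agent.stream("message for agent", {
  memory: {
    thread: "thread-123",
    resource: "user-456",
    options: {
      lastMessages: 10,
      semanticRecall: {
        topK: 3,
        messageRange: { before: 2, after: 1 },
        scope: "resource",
      },
    },
  },
});
```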

onFinish?:

StreamTextOnFinishCallback<any> | StreamObjectOnFinishCallback<OUTPUT>
Callback function called when streaming completes. Receives the final result.

onStepFinish?:

StreamTextOnStepFinishCallback<any> | never
Callback function called after each execution step. Receives step details as a JSON string. Unavailable for structured output.
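
A sketch wiring up both callbacks (for unstructured output):

```typescript
await agent.stream("message for agent", {
  onStepFinish: (step) => {
    // Called after each execution step.
    console.log("step:", step);
  },
  onFinish: (result) => {
    // Called once streaming completes, with the final result.
    console.log("finished:", result);
  },
});
```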

resourceId?:

string
**Deprecated.** Use `memory.resource` instead. Identifier for the user or resource interacting with the agent. Must be provided if threadId is provided.

telemetry?:

TelemetrySettings
Settings for telemetry collection during streaming.

isEnabled?:

boolean
Enable or disable telemetry. Disabled by default while experimental.

recordInputs?:

boolean
Enable or disable input recording. Enabled by default. You might want to disable input recording to avoid recording sensitive information.

recordOutputs?:

boolean
Enable or disable output recording. Enabled by default. You might want to disable output recording to avoid recording sensitive information.

functionId?:

string
Identifier for this function. Used to group telemetry data by function.
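
A sketch enabling telemetry while keeping inputs out of recorded traces (the function ID is illustrative):

```typescript
await agent.stream("message for agent", {
  telemetry: {
    isEnabled: true,
    recordInputs: false, // keep potentially sensitive prompts out of traces
    functionId: "support-agent-stream",
  },
});
```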

temperature?:

number
Controls randomness in the model's output. Higher values (e.g., 0.8) make the output more random, lower values (e.g., 0.2) make it more focused and deterministic.

threadId?:

string
**Deprecated.** Use `memory.thread` instead. Identifier for the conversation thread. Allows for maintaining context across multiple interactions. Must be provided if resourceId is provided.

toolChoice?:

'auto' | 'none' | 'required' | { type: 'tool'; toolName: string }
= 'auto'
Controls how the agent uses tools during streaming.

'auto':

string
Let the model decide whether to use tools (default).

'none':

string
Do not use any tools.

'required':

string
Require the model to use at least one tool.

{ type: 'tool'; toolName: string }:

object
Require the model to use a specific tool by name.
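
For example, forcing a specific tool (the tool name is illustrative and assumed to be registered on the agent):

```typescript
await agent.stream("What's the weather in Berlin?", {
  toolChoice: { type: "tool", toolName: "weatherTool" },
});
```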

toolsets?:

ToolsetsInput
Additional toolsets to make available to the agent during streaming.

clientTools?:

ToolsInput
Tools that are executed on the 'client' side of the request. These tools do not have `execute` functions in their definitions.
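
A sketch of a client-side tool, assuming it is defined with Mastra's `createTool` (the tool id and schema are illustrative):

```typescript
import { createTool } from "@mastra/core/tools";
import { z } from "zod";

// No `execute` function: the tool call is streamed back to the
// client, which is responsible for running it.
const changeColor = createTool({
  id: "changeColor",
  description: "Changes the page background color",
  inputSchema: z.object({ color: z.string() }),
});

await agent.stream("Turn the background blue", {
  clientTools: { changeColor },
});
```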

savePerStep?:

boolean
Save messages incrementally after each stream step completes (default: false).

providerOptions?:

Record<string, Record<string, JSONValue>>
Additional provider-specific options that are passed through to the underlying LLM provider. The structure is `{ providerName: { optionKey: value } }`. For example: `{ openai: { reasoningEffort: 'high' }, anthropic: { maxTokens: 1000 } }`.

openai?:

Record<string, JSONValue>
OpenAI-specific options. Example: `{ reasoningEffort: 'high' }`

anthropic?:

Record<string, JSONValue>
Anthropic-specific options. Example: `{ maxTokens: 1000 }`

google?:

Record<string, JSONValue>
Google-specific options. Example: `{ safetySettings: [...] }`

[providerName]?:

Record<string, JSONValue>
Other provider-specific options. The key is the provider name and the value is a record of provider-specific options.
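
A sketch passing provider-specific options through to OpenAI, using the structure described above:

```typescript
await agent.stream("message for agent", {
  providerOptions: {
    openai: { reasoningEffort: "high" },
  },
});
```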

runId?:

string
Unique ID for this generation run. Useful for tracking and debugging purposes.

runtimeContext?:

RuntimeContext
Runtime context for dependency injection and contextual information.

maxTokens?:

number
Maximum number of tokens to generate.

topP?:

number
Nucleus sampling. This is a number between 0 and 1. It is recommended to set either `temperature` or `topP`, but not both.

topK?:

number
Only sample from the top K options for each subsequent token. Used to remove 'long tail' low-probability responses.

presencePenalty?:

number
Presence penalty setting. It affects how likely the model is to repeat information that is already in the prompt. A number between -1 (increase repetition) and 1 (maximum penalty, decrease repetition).

frequencyPenalty?:

number
Frequency penalty setting. It affects how likely the model is to repeatedly use the same words or phrases. A number between -1 (increase repetition) and 1 (maximum penalty, decrease repetition).

stopSequences?:

string[]
Stop sequences. If set, the model will stop generating text when one of the stop sequences is generated.

seed?:

number
The seed (integer) to use for random sampling. If set and supported by the model, calls will generate deterministic results.

headers?:

Record<string, string | undefined>
Additional HTTP headers to be sent with the request. Only applicable for HTTP-based providers.
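
A sketch combining several of the model settings above (values are illustrative; set either `temperature` or `topP`, not both):

```typescript
await agent.stream("message for agent", {
  temperature: 0.2,
  maxTokens: 512,
  stopSequences: ["END"],
  seed: 42, // deterministic sampling where the model supports it
});
```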

Returns

textStream?:

AsyncGenerator<string>
Async generator that yields text chunks as they become available.

fullStream?:

Promise<ReadableStream>
Promise that resolves to a ReadableStream for the complete response.

text?:

Promise<string>
Promise that resolves to the complete text response.

usage?:

Promise<{ totalTokens: number; promptTokens: number; completionTokens: number }>
Promise that resolves to token usage information.

finishReason?:

Promise<string>
Promise that resolves to the reason why the stream finished.

toolCalls?:

Promise<Array<ToolCall>>
Promise that resolves to the tool calls made during the streaming process.

toolName:

string
The name of the tool invoked.

args:

any
The arguments passed to the tool.
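
A sketch reading the resolved values once generation has finished:

```typescript
const result = await agent.stream("message for agent");

// These promises resolve after the stream completes.
console.log(await result.text);
console.log(await result.usage);
console.log(await result.finishReason);
```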

Extended usage example

await agent.stream("message for agent", {
  temperature: 0.7,
  maxSteps: 3,
  memory: { thread: "user-123", resource: "test-app" },
  toolChoice: "auto",
});