Agent.stream()
The `.stream()` method enables real-time streaming of responses from an agent. This method accepts messages and optional streaming options.
Usage example
const stream = await agent.stream("message for agent");
Parameters
messages:
string | string[] | CoreMessage[] | AiMessageType[] | UIMessageWithMetadata[]
The messages to send to the agent. Can be a single string, array of strings, or structured message objects.
options?:
AgentStreamOptions<OUTPUT, EXPERIMENTAL_OUTPUT>
Optional configuration for the streaming process.
Options parameters
abortSignal?:
AbortSignal
Signal object that allows you to abort the agent's execution. When the signal is aborted, all ongoing operations will be terminated.
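For example, a stream can be cancelled with a standard `AbortController` (the timeout and prompt here are illustrative):
const controller = new AbortController();

// Illustrative: abort the stream if it runs longer than 10 seconds
setTimeout(() => controller.abort(), 10_000);

const stream = await agent.stream("Summarize this long report", {
  abortSignal: controller.signal,
});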
context?:
CoreMessage[]
Additional context messages to provide to the agent.
experimental_output?:
Zod schema | JsonSchema7
Enables structured output generation alongside text generation and tool calls. The model will generate responses that conform to the provided schema.
instructions?:
string
Custom instructions that override the agent's default instructions for this specific generation. Useful for dynamically modifying agent behavior without creating a new agent instance.
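For example, a one-off override (the instruction text is illustrative):
const stream = await agent.stream("Outline the migration plan", {
  instructions: "Respond formally and keep answers under 100 words.",
});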
output?:
Zod schema | JsonSchema7
Defines the expected structure of the output. Can be a JSON Schema object or a Zod schema.
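A minimal sketch with a Zod schema (the schema and prompt are illustrative; `experimental_output` accepts a schema the same way):
import { z } from "zod";

// Illustrative schema describing the expected response shape
const recipeSchema = z.object({
  name: z.string(),
  ingredients: z.array(z.string()),
});

const stream = await agent.stream("Give me a pancake recipe", {
  output: recipeSchema,
});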
memory?:
object
Configuration for memory. This is the preferred way to manage memory; see the example after these fields.
thread:
string | { id: string; metadata?: Record<string, any>, title?: string }
The conversation thread, as a string ID or an object with an `id` and optional `metadata` and `title`.
resource:
string
Identifier for the user or resource associated with the thread.
options?:
MemoryConfig
Configuration for memory behavior, like message history and semantic recall.
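Putting these fields together (the IDs and values are illustrative):
const stream = await agent.stream("What did we discuss earlier?", {
  memory: {
    thread: { id: "thread-123", metadata: { topic: "support" } },
    resource: "user-456",
    options: { lastMessages: 10 },
  },
});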
maxSteps?:
number
= 5
Maximum number of execution steps allowed.
maxRetries?:
number
= 2
Maximum number of retries. Set to 0 to disable retries.
memoryOptions?:
MemoryConfig
**Deprecated.** Use `memory.options` instead. Configuration options for memory management.
lastMessages?:
number | false
Number of recent messages to include in context, or false to disable.
semanticRecall?:
boolean | { topK: number; messageRange: number | { before: number; after: number }; scope?: 'thread' | 'resource' }
Enable semantic recall to find relevant past messages. Can be a boolean or detailed configuration.
workingMemory?:
WorkingMemory
Configuration for working memory functionality.
threads?:
{ generateTitle?: boolean | { model: DynamicArgument<MastraLanguageModel>; instructions?: DynamicArgument<string> } }
Thread-specific configuration, including automatic title generation.
onFinish?:
StreamTextOnFinishCallback<any> | StreamObjectOnFinishCallback<OUTPUT>
Callback function called when streaming completes. Receives the final result.
onStepFinish?:
StreamTextOnStepFinishCallback<any> | never
Callback function called after each execution step. Receives step details as a JSON string. Unavailable for structured output.
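A sketch using both callbacks (the logging is illustrative):
const stream = await agent.stream("message for agent", {
  onStepFinish: (step) => {
    // Step details arrive as a JSON string
    console.log("step finished:", step);
  },
  onFinish: (result) => {
    console.log("stream complete:", result);
  },
});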
resourceId?:
string
**Deprecated.** Use `memory.resource` instead. Identifier for the user or resource interacting with the agent. Must be provided if threadId is provided.
telemetry?:
TelemetrySettings
Settings for telemetry collection during streaming; see the example after these fields.
isEnabled?:
boolean
Enable or disable telemetry. Disabled by default while experimental.
recordInputs?:
boolean
Enable or disable input recording. Enabled by default. You might want to disable input recording to avoid recording sensitive information.
recordOutputs?:
boolean
Enable or disable output recording. Enabled by default. You might want to disable output recording to avoid recording sensitive information.
functionId?:
string
Identifier for this function. Used to group telemetry data by function.
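For example (the function ID is illustrative):
const stream = await agent.stream("message for agent", {
  telemetry: {
    isEnabled: true,
    functionId: "support-chat",
    recordInputs: false, // avoid recording sensitive user input
  },
});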
temperature?:
number
Controls randomness in the model's output. Higher values (e.g., 0.8) make the output more random, lower values (e.g., 0.2) make it more focused and deterministic.
threadId?:
string
**Deprecated.** Use `memory.thread` instead. Identifier for the conversation thread. Allows for maintaining context across multiple interactions. Must be provided if resourceId is provided.
toolChoice?:
'auto' | 'none' | 'required' | { type: 'tool'; toolName: string }
= 'auto'
Controls how the agent uses tools during streaming; see the example after the value list.
'auto':
string
Let the model decide whether to use tools (default).
'none':
string
Do not use any tools.
'required':
string
Require the model to use at least one tool.
{ type: 'tool'; toolName: string }:
object
Require the model to use a specific tool by name.
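For example, requiring a specific tool (the tool name is hypothetical and would need to be registered on the agent):
const stream = await agent.stream("What's the weather in Paris?", {
  toolChoice: { type: "tool", toolName: "weatherTool" },
});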
toolsets?:
ToolsetsInput
Additional toolsets to make available to the agent during streaming.
clientTools?:
ToolsInput
Tools that are executed on the client side of the request. These tools do not have execute functions in their definitions.
savePerStep?:
boolean
= false
Save messages incrementally after each stream step completes.
providerOptions?:
Record<string, Record<string, JSONValue>>
Additional provider-specific options that are passed through to the underlying LLM provider. The structure is `{ providerName: { optionKey: value } }`. For example: `{ openai: { reasoningEffort: 'high' }, anthropic: { maxTokens: 1000 } }`. See the example after the provider entries below.
openai?:
Record<string, JSONValue>
OpenAI-specific options. Example: `{ reasoningEffort: 'high' }`
anthropic?:
Record<string, JSONValue>
Anthropic-specific options. Example: `{ maxTokens: 1000 }`
google?:
Record<string, JSONValue>
Google-specific options. Example: `{ safetySettings: [...] }`
[providerName]?:
Record<string, JSONValue>
Other provider-specific options. The key is the provider name and the value is a record of provider-specific options.
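For example, passing the OpenAI option shown above:
const stream = await agent.stream("Work through this step by step", {
  providerOptions: {
    openai: { reasoningEffort: "high" },
  },
});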
runId?:
string
Unique ID for this generation run. Useful for tracking and debugging purposes.
runtimeContext?:
RuntimeContext
Runtime context for dependency injection and contextual information.
maxTokens?:
number
Maximum number of tokens to generate.
topP?:
number
Nucleus sampling. This is a number between 0 and 1. It is recommended to set either `temperature` or `topP`, but not both.
topK?:
number
Only sample from the top K options for each subsequent token. Used to remove 'long tail' low probability responses.
presencePenalty?:
number
Presence penalty setting. Affects the model's likelihood of repeating information that is already in the prompt. A number between -1 (increase repetition) and 1 (maximum penalty, decrease repetition).
frequencyPenalty?:
number
Frequency penalty setting. Affects the model's likelihood of repeatedly using the same words or phrases. A number between -1 (increase repetition) and 1 (maximum penalty, decrease repetition).
stopSequences?:
string[]
Stop sequences. If set, the model will stop generating text when one of the stop sequences is generated.
seed?:
number
The seed (integer) to use for random sampling. If set and supported by the model, calls will generate deterministic results.
headers?:
Record<string, string | undefined>
Additional HTTP headers to be sent with the request. Only applicable for HTTP-based providers.
Returns
textStream?:
AsyncGenerator<string>
Async generator that yields text chunks as they become available.
fullStream?:
Promise<ReadableStream>
Promise that resolves to a ReadableStream for the complete response.
text?:
Promise<string>
Promise that resolves to the complete text response.
usage?:
Promise<{ totalTokens: number; promptTokens: number; completionTokens: number }>
Promise that resolves to token usage information.
finishReason?:
Promise<string>
Promise that resolves to the reason why the stream finished.
toolCalls?:
Promise<Array<ToolCall>>
Promise that resolves to the tool calls made during the streaming process.
toolName:
string
The name of the tool invoked.
args:
any
The arguments passed to the tool.
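A sketch of consuming the returned stream (the prompt is illustrative):
const stream = await agent.stream("Tell me a story");

// Print text chunks as they arrive
for await (const chunk of stream.textStream) {
  process.stdout.write(chunk);
}

// Aggregate values resolve once streaming finishes
console.log(await stream.usage);
console.log(await stream.finishReason);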
Extended usage example
const stream = await agent.stream("message for agent", {
temperature: 0.7,
maxSteps: 3,
memory: {
thread: "user-123",
resource: "test-app"
},
toolChoice: "auto"
});