stream()
The stream() method enables real-time streaming of responses from an agent. This method accepts messages and an optional options object as parameters, similar to generate().
Parameters
messages

The messages parameter can be:
- A single string
- An array of strings
- An array of message objects with role and content properties
The message object structure:
interface Message {
  role: 'system' | 'user' | 'assistant';
  content: string;
}
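The three accepted shapes can be sketched as plain values (the `myAgent` name used in the comment is illustrative, matching the examples later in this page):

```typescript
// The Message interface, as defined above.
interface Message {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

// The three accepted shapes for the messages parameter:
const single: string = "Summarize this document.";
const list: string[] = ["First question.", "Second question."];
const structured: Message[] = [
  { role: "system", content: "You are a concise assistant." },
  { role: "user", content: "Summarize this document." },
];

// Each would be passed directly, e.g. myAgent.stream(single) or
// myAgent.stream(structured), where myAgent is your agent instance.
```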
options (optional)
An optional object that can include configuration for output structure, memory management, tool usage, telemetry, and more.
abortSignal?: AbortSignal
Signal object that allows you to abort the agent's execution. When the signal is aborted, all ongoing operations will be terminated.

context?: CoreMessage[]
Additional context messages to provide to the agent.

experimental_output?: Zod schema | JsonSchema7
Enables structured output generation alongside text generation and tool calls. The model will generate responses that conform to the provided schema.

instructions?: string
Custom instructions that override the agent's default instructions for this specific generation. Useful for dynamically modifying agent behavior without creating a new agent instance.

maxSteps?: number = 5
Maximum number of steps allowed during streaming.

memoryOptions?: MemoryConfig
Configuration options for memory management. See the MemoryConfig section below for details.

onFinish?: (result: string) => Promise<void> | void
Callback function called when streaming is complete.

onStepFinish?: (step: string) => void
Callback function called after each step during streaming.

output?: Zod schema | JsonSchema7
Defines the expected structure of the output. Can be a JSON Schema object or a Zod schema.

resourceId?: string
Identifier for the user or resource interacting with the agent. Must be provided if threadId is provided.

telemetry?: TelemetrySettings
Settings for telemetry collection during streaming. See the TelemetrySettings section below for details.

temperature?: number
Controls randomness in the model's output. Higher values (e.g., 0.8) make the output more random; lower values (e.g., 0.2) make it more focused and deterministic.

threadId?: string
Identifier for the conversation thread. Allows for maintaining context across multiple interactions. Must be provided if resourceId is provided.

toolChoice?: 'auto' | 'none' | 'required' | { type: 'tool'; toolName: string } = 'auto'
Controls how the agent uses tools during streaming.

toolsets?: ToolsetsInput
Additional toolsets to make available to the agent during this stream.
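Taken together, an options object might look like the following fragment. The values are illustrative, and the local StreamOptions type is a stand-in that mirrors only the fields documented above; the real type comes from the library.

```typescript
// Local stand-in mirroring a subset of the documented options fields.
type StreamOptions = {
  maxSteps?: number;
  temperature?: number;
  toolChoice?: 'auto' | 'none' | 'required' | { type: 'tool'; toolName: string };
  instructions?: string;
  resourceId?: string;
  threadId?: string; // threadId and resourceId must be provided together
  onStepFinish?: (step: string) => void;
  onFinish?: (result: string) => Promise<void> | void;
};

const options: StreamOptions = {
  maxSteps: 3,                    // cap the number of streaming steps
  temperature: 0.2,               // keep output focused and deterministic
  toolChoice: 'auto',             // let the agent decide when to call tools
  resourceId: "user-123",         // required because threadId is set
  threadId: "support-thread-456",
  onStepFinish: (step) => console.log("step:", step),
  onFinish: (result) => console.log("done:", result.length, "chars"),
};
```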
MemoryConfig
Configuration options for memory management:
lastMessages?: number | false
Number of most recent messages to include in context. Set to false to disable.

semanticRecall?: boolean | object
Configuration for semantic memory recall. Can be a boolean or a detailed config object with:
  topK?: number
  Number of most semantically similar messages to retrieve.
  messageRange?: number | { before: number; after: number }
  Range of messages to consider for semantic search. Can be a single number or a before/after configuration.

workingMemory?: object
Configuration for working memory:
  enabled?: boolean
  Whether to enable working memory.
  template?: string
  Template to use for working memory.
  type?: 'text-stream' | 'tool-call'
  Type of content to use for working memory.

threads?: object
Thread-specific memory configuration:
  generateTitle?: boolean
  Whether to automatically generate titles for new threads.
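A MemoryConfig object combining these fields might look like the fragment below. The values are illustrative, and the local type is a stand-in mirroring only the fields documented above.

```typescript
// Local stand-in mirroring the documented MemoryConfig fields.
type MemoryConfig = {
  lastMessages?: number | false;
  semanticRecall?: boolean | {
    topK?: number;
    messageRange?: number | { before: number; after: number };
  };
  workingMemory?: {
    enabled?: boolean;
    template?: string;
    type?: 'text-stream' | 'tool-call';
  };
  threads?: { generateTitle?: boolean };
};

const memoryOptions: MemoryConfig = {
  lastMessages: 10,                         // keep the 10 most recent messages
  semanticRecall: {
    topK: 3,                                // retrieve 3 most similar messages
    messageRange: { before: 2, after: 1 },  // plus surrounding context
  },
  workingMemory: { enabled: true, type: 'text-stream' },
  threads: { generateTitle: true },
};
```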
TelemetrySettings
Settings for telemetry collection during streaming:
isEnabled?: boolean = false
Enable or disable telemetry. Disabled by default while experimental.

recordInputs?: boolean = true
Enable or disable input recording. You might want to disable this to avoid recording sensitive information, reduce data transfers, or improve performance.

recordOutputs?: boolean = true
Enable or disable output recording. You might want to disable this to avoid recording sensitive information, reduce data transfers, or improve performance.

functionId?: string
Identifier for this function. Used to group telemetry data by function.

metadata?: Record<string, AttributeValue>
Additional information to include in the telemetry data. AttributeValue can be a string, number, boolean, an array of these types, or null.

tracer?: Tracer
A custom OpenTelemetry tracer instance to use for the telemetry data. See the OpenTelemetry documentation for details.
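A telemetry settings fragment might look like the sketch below. The values are illustrative, and the local AttributeValue and TelemetrySettings types are stand-ins mirroring only the fields documented above.

```typescript
// Local stand-ins mirroring the documented TelemetrySettings fields.
type AttributeValue =
  | string
  | number
  | boolean
  | Array<string | number | boolean>
  | null;

type TelemetrySettings = {
  isEnabled?: boolean;
  recordInputs?: boolean;
  recordOutputs?: boolean;
  functionId?: string;
  metadata?: Record<string, AttributeValue>;
};

const telemetry: TelemetrySettings = {
  isEnabled: true,
  recordInputs: false,         // avoid capturing sensitive user input
  recordOutputs: true,
  functionId: "agent.stream",  // groups telemetry data by function
  metadata: { environment: "staging", version: 2 },
};
```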
Returns

The return value of the stream() method depends on the options provided, specifically the output option.
textStream?: AsyncIterable<string>
Stream of text chunks. Present when output is 'text' (no schema provided) or when using `experimental_output`.

objectStream?: AsyncIterable<object>
Stream of structured data. Present only when using the `output` option with a schema.

partialObjectStream?: AsyncIterable<object>
Stream of structured data. Present only when using the `experimental_output` option.

object?: Promise<object>
Promise that resolves to the final structured output. Present when using either the `output` or `experimental_output` option.
Examples
Basic Text Streaming
const stream = await myAgent.stream([
  { role: "user", content: "Tell me a story." }
]);

for await (const chunk of stream.textStream) {
  process.stdout.write(chunk);
}
Structured Output Streaming with Thread Context
const schema = {
  type: 'object',
  properties: {
    summary: { type: 'string' },
    nextSteps: { type: 'array', items: { type: 'string' } }
  },
  required: ['summary', 'nextSteps']
};
const response = await myAgent.stream(
  "What should we do next?",
  {
    output: schema,
    resourceId: "user-123", // required whenever threadId is provided
    threadId: "project-123",
    onFinish: text => console.log("Finished:", text)
  }
);

for await (const chunk of response.textStream) {
  console.log(chunk);
}

const result = await response.object;
console.log("Final structured result:", result);
The key difference between an Agent's stream() and an LLM's stream() is that Agents maintain conversation context through threadId, can access tools, and integrate with the agent's memory system.