Agent.generate()

The .generate() method runs an agent and returns the complete response once generation finishes, without streaming. It accepts messages and optional generation options.

Usage example

// Assumes `agent` is an already-configured Agent instance
import { z } from "zod";

// Basic usage
const result = await agent.generate("message for agent");

// With model settings (e.g., limiting output tokens)
const limitedResult = await agent.generate("Write a short poem about coding", {
  modelSettings: {
    maxOutputTokens: 50,
    temperature: 0.7,
  },
});

// With structured output
const structuredResult = await agent.generate("Extract the user's name and age", {
  structuredOutput: {
    schema: z.object({
      name: z.string(),
      age: z.number(),
    }),
  },
});

// With memory for conversation persistence
const memoryResult = await agent.generate("Remember my favorite color is blue", {
  memory: {
    thread: "user-123-thread",
    resource: "user-123",
  },
});

// Accessing response headers (separate variable name to avoid redeclaring `result`)
const headerResult = await agent.generate("Hello!");
const remainingRequests = headerResult.response?.headers?.["anthropic-ratelimit-requests-remaining"];
const remainingTokens = headerResult.response?.headers?.["x-ratelimit-remaining-tokens"];
console.log(`Remaining requests: ${remainingRequests}, Remaining tokens: ${remainingTokens}`);

Model Compatibility: This method requires AI SDK v5+ models. If you're using AI SDK v4 models, use the .generateLegacy() method instead. The framework automatically detects your model version and will throw an error if there's a mismatch.

Parameters

messages:

string | string[] | CoreMessage[] | AiMessageType[] | UIMessageWithMetadata[]
The messages to send to the agent. Can be a single string, array of strings, or structured message objects.
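As a rough illustration of the accepted input shapes, the sketch below builds a single string, an array of strings, and an array of structured messages. `ChatMessage` is a simplified local stand-in (role plus content) rather than the library's own type, and `buildConversation` is a hypothetical helper.

```typescript
// Simplified stand-in for a structured message; the real types carry more fields.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// A single string and an array of strings.
const single: string = "What is the capital of France?";
const several: string[] = ["Here is some background.", "Now summarize it."];

// Hypothetical helper: turns alternating user/assistant turns into message objects.
function buildConversation(turns: string[]): ChatMessage[] {
  return turns.map((content, i): ChatMessage => ({
    role: i % 2 === 0 ? "user" : "assistant",
    content,
  }));
}

const structured = buildConversation(["Hi", "Hello! How can I help?", "Tell me a joke"]);

// Each entry is one possible first argument to agent.generate().
const acceptedInputs: Array<string | string[] | ChatMessage[]> = [single, several, structured];
```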

options?:

AgentExecutionOptions<Output, Format>
Optional configuration for the generation process.

Options

maxSteps?:

number
Maximum number of steps to run during execution.

stopWhen?:

LoopOptions['stopWhen']
Conditions for stopping execution (e.g., step count, token limit).

scorers?:

MastraScorers | Record<string, { scorer: MastraScorer['name']; sampling?: ScoringSamplingConfig }>
Evaluation scorers to run on the execution results.

scorer:

string
Name of the scorer to use.

sampling?:

ScoringSamplingConfig
Sampling configuration for the scorer.

returnScorerData?:

boolean
Whether to return detailed scoring data in the response.

onChunk?:

(chunk: ChunkType) => Promise<void> | void
Callback function called for each chunk during generation.

onError?:

({ error }: { error: Error | string }) => Promise<void> | void
Callback function called when an error occurs during generation.

onAbort?:

(event: any) => Promise<void> | void
Callback function called when the generation is aborted.

activeTools?:

Array<keyof ToolSet> | undefined
Array of tool names that should be active during execution. If undefined, all available tools are active.

abortSignal?:

AbortSignal
Signal object that allows you to abort the agent's execution. When the signal is aborted, all ongoing operations will be terminated.
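One common way to use `abortSignal` is to cancel a call that runs too long. The sketch below relies only on the standard `AbortController`; `createAbortHandle` is a hypothetical helper name, and the timeout value in the usage comment is arbitrary.

```typescript
// Hypothetical helper: a signal that aborts after `ms` milliseconds,
// a manual `abort`, and a `dispose` that clears the pending timer.
function createAbortHandle(ms: number): {
  signal: AbortSignal;
  abort: () => void;
  dispose: () => void;
} {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), ms);
  return {
    signal: controller.signal,
    abort: () => controller.abort(),
    dispose: () => clearTimeout(timer),
  };
}

// Usage sketch (assumes `agent` exists):
// const handle = createAbortHandle(30_000);
// try {
//   const result = await agent.generate("Summarize this report", {
//     abortSignal: handle.signal,
//   });
// } finally {
//   handle.dispose();
// }
```

Calling `dispose()` in a `finally` block keeps the timer from holding the process open after the call has already returned.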

prepareStep?:

PrepareStepFunction
Callback function called before each step of multi-step execution.

requireToolApproval?:

boolean
When true, all tool calls require explicit approval before execution. The generate() method will return with `finishReason: 'suspended'` and include a `suspendPayload` with tool call details (`toolCallId`, `toolName`, `args`). Use `approveToolCallGenerate()` or `declineToolCallGenerate()` to proceed. See [Agent Approval](/docs/agents/agent-approval#tool-approval-with-generate) for details.
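A suspended result can be checked before deciding whether to approve or decline. The sketch below is a pure decision function over the documented fields (`finishReason`, `runId`, `suspendPayload`); the types are local stand-ins and the allow-list policy is a hypothetical example, not part of the library.

```typescript
// Local stand-ins for the documented result fields; not the library's own types.
type SuspendPayload = { toolCallId: string; toolName: string; args: Record<string, unknown> };
type GenerateResultLike = { finishReason: string; runId?: string; suspendPayload?: SuspendPayload };

type ApprovalDecision =
  | { action: "none" }
  | { action: "approve" | "decline"; runId: string; toolCallId: string };

// Hypothetical policy: approve only tools on an allow-list, decline everything else.
function decideApproval(result: GenerateResultLike, allowedTools: string[]): ApprovalDecision {
  if (result.finishReason !== "suspended" || !result.suspendPayload || !result.runId) {
    return { action: "none" }; // nothing is waiting for approval
  }
  const { toolCallId, toolName } = result.suspendPayload;
  const action = allowedTools.includes(toolName) ? "approve" : "decline";
  return { action, runId: result.runId, toolCallId };
}
```

The returned `runId` and `toolCallId` are the values needed to resume the run with `approveToolCallGenerate()` or `declineToolCallGenerate()`; the exact argument shape of those methods is described in the Agent Approval guide and is not assumed here.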

autoResumeSuspendedTools?:

boolean
When true, automatically resumes suspended tools when the user sends a new message on the same thread. The agent extracts `resumeData` from the user's message based on the tool's `resumeSchema`. Requires memory to be configured.

toolCallConcurrency?:

number
Maximum number of tool calls to execute concurrently. Defaults to 1 when approval may be required, otherwise 10.

context?:

ModelMessage[]
Additional context messages to provide to the agent.

structuredOutput?:

StructuredOutputOptions<S extends ZodTypeAny = ZodTypeAny>
Options to fine-tune your structured output generation.

schema:

z.ZodSchema<S>
Zod schema defining the expected output structure.

model?:

MastraLanguageModel
Language model to use for structured output generation. If provided, the agent can respond in multiple steps with tool calls, text, and structured output.

errorStrategy?:

'strict' | 'warn' | 'fallback'
Strategy for handling schema validation errors. 'strict' throws errors, 'warn' logs warnings, 'fallback' uses fallback values.

fallbackValue?:

<S extends ZodTypeAny>
Fallback value to use when schema validation fails and errorStrategy is 'fallback'.

instructions?:

string
Additional instructions for the structured output model.

jsonPromptInjection?:

boolean
Injects a system prompt into the main agent instructing it to return structured output. Useful when a model does not natively support structured outputs.

logger?:

IMastraLogger
Optional logger instance for structured logging during output generation.

providerOptions?:

ProviderOptions
Provider-specific options passed to the internal structuring agent. Use this to control model behavior like reasoning effort for thinking models (e.g., `{ openai: { reasoningEffort: 'low' } }`).

outputProcessors?:

OutputProcessorOrWorkflow[]
Output processors to use for this execution (overrides agent's default).

maxProcessorRetries?:

number
Maximum number of times processors can trigger a retry for this generation. Overrides agent's default maxProcessorRetries.

inputProcessors?:

InputProcessorOrWorkflow[]
Input processors to use for this execution (overrides agent's default).

instructions?:

string | string[] | CoreSystemMessage | SystemModelMessage | CoreSystemMessage[] | SystemModelMessage[]
Custom instructions that override the agent's default instructions for this execution. Can be a single string, message object, or array of either.

system?:

string | string[] | CoreSystemMessage | SystemModelMessage | CoreSystemMessage[] | SystemModelMessage[]
Custom system message(s) to include in the prompt. Can be a single string, message object, or array of either. System messages provide additional context or behavior instructions that supplement the agent's main instructions.

output?:

Zod schema | JsonSchema7
**Deprecated.** Use structuredOutput without a model to achieve the same thing. Defines the expected structure of the output. Can be a JSON Schema object or a Zod schema.

memory?:

object
Memory configuration for conversation persistence and retrieval.

thread:

string | { id: string; metadata?: Record<string, any>; title?: string }
Thread identifier for conversation continuity. Can be a string ID or an object with ID and optional metadata/title.

resource:

string
Resource identifier for organizing conversations by user, session, or context.

options?:

MemoryConfig
Additional memory configuration options including lastMessages, readOnly, semanticRecall, and workingMemory.
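The `thread` and `resource` values are plain identifiers, so they can be derived from application data. A minimal sketch, assuming one resource per user and one thread per conversation; `memoryFor` and the ID format are hypothetical, and the types are local stand-ins for the documented shape.

```typescript
// Local stand-ins for the documented `memory` option shape.
type ThreadRef = string | { id: string; metadata?: Record<string, unknown>; title?: string };
type MemoryOption = { thread: ThreadRef; resource: string };

// Hypothetical helper: one resource per user, one thread per conversation.
function memoryFor(userId: string, conversationId: string, title?: string): MemoryOption {
  const id = `${userId}-${conversationId}`;
  return {
    // Use the object form only when there is a title to attach.
    thread: title ? { id, title } : id,
    resource: userId,
  };
}

// Usage sketch (assumes `agent` exists):
// await agent.generate("Remember my favorite color is blue", {
//   memory: memoryFor("user-123", "support", "Support chat"),
// });
```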

onFinish?:

LoopConfig['onFinish']
Callback fired when generation completes.

onStepFinish?:

LoopConfig['onStepFinish']
Callback fired after each generation step.

telemetry?:

TelemetrySettings
Settings for OTLP telemetry collection during generation (not Tracing).

isEnabled?:

boolean
Whether telemetry collection is enabled.

recordInputs?:

boolean
Whether to record input data in telemetry.

recordOutputs?:

boolean
Whether to record output data in telemetry.

functionId?:

string
Identifier for the function being executed.

modelSettings?:

CallSettings
Model settings such as temperature, maxOutputTokens, and topP. These settings control how the language model generates responses.

temperature?:

number
Controls randomness in generation (0-2). Higher values make output more random.

maxOutputTokens?:

number
Maximum number of tokens to generate in the response. Note: Use maxOutputTokens (not maxTokens) as per AI SDK v5 convention.

maxRetries?:

number
Maximum number of retry attempts for failed requests.

topP?:

number
Nucleus sampling parameter (0-1). Controls diversity of generated text.

topK?:

number
Top-k sampling parameter. Limits vocabulary to k most likely tokens.

presencePenalty?:

number
Penalty for token presence (-2 to 2). Reduces repetition.

frequencyPenalty?:

number
Penalty for token frequency (-2 to 2). Reduces repetition of frequent tokens.

stopSequences?:

string[]
Stop sequences. If set, the model will stop generating text when one of the stop sequences is generated.

toolChoice?:

'auto' | 'none' | 'required' | { type: 'tool'; toolName: string }
Controls how tools are selected during generation.

'auto':

string
Let the model decide when to use tools (default).

'none':

string
Disable tool usage entirely.

'required':

string
Force the model to use at least one tool.

{ type: 'tool'; toolName: string }:

object
Force the model to use a specific tool.
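The four `toolChoice` forms can be selected programmatically. The sketch below declares the union locally and picks a value from a list of tool names; `chooseToolChoice` and its policy are hypothetical and only illustrate how each form is written.

```typescript
// Local copy of the documented union.
type ToolChoice = "auto" | "none" | "required" | { type: "tool"; toolName: string };

// Hypothetical policy: no tools -> "none"; a known preferred tool -> force it;
// otherwise let the model decide.
function chooseToolChoice(availableTools: string[], preferred?: string): ToolChoice {
  if (availableTools.length === 0) return "none";
  if (preferred && availableTools.includes(preferred)) {
    return { type: "tool", toolName: preferred };
  }
  return "auto";
}

// Usage sketch (assumes `agent` exists and has a tool named "weather"):
// await agent.generate("What's the weather in Paris?", {
//   toolChoice: chooseToolChoice(["weather", "search"], "weather"),
// });
```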

toolsets?:

ToolsetsInput
Additional tool sets that can be used for this execution.

clientTools?:

ToolsInput
Client-side tools available during execution.

savePerStep?:

boolean
Save messages incrementally after each generation step completes (default: false).

providerOptions?:

Record<string, Record<string, JSONValue>>
Provider-specific options passed to the language model.

openai?:

Record<string, JSONValue>
OpenAI-specific options such as reasoningEffort and responseFormat.

anthropic?:

Record<string, JSONValue>
Anthropic-specific options such as maxTokens.

google?:

Record<string, JSONValue>
Google-specific options.

[providerName]?:

Record<string, JSONValue>
Any provider-specific options.

runId?:

string
Unique identifier for this execution run.

requestContext?:

RequestContext
Request Context containing dynamic configuration and state.

tracingContext?:

TracingContext
Tracing context for creating child spans and adding metadata. Automatically injected when using Mastra's tracing system.

currentSpan?:

Span
Current span for creating child spans and adding metadata. Use this to create custom child spans or update span attributes during execution.

tracingOptions?:

TracingOptions
Options for Tracing configuration.

metadata?:

Record<string, any>
Metadata to add to the root trace span. Useful for adding custom attributes like user IDs, session IDs, or feature flags.

requestContextKeys?:

string[]
Additional RequestContext keys to extract as metadata for this trace. Supports dot notation for nested values (e.g., 'user.id').

traceId?:

string
Trace ID to use for this execution (1-32 hexadecimal characters). If provided, this execution will be part of the specified trace.

parentSpanId?:

string
Parent span ID to use for this execution (1-16 hexadecimal characters). If provided, the root span will be created as a child of this span.

tags?:

string[]
Tags to apply to this trace. String labels for categorizing and filtering traces.
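Because `traceId` and `parentSpanId` have length and character constraints, it can help to check them before passing them in. The sketch below validates against the documented limits (1-32 and 1-16 hexadecimal characters). Accepting both upper- and lowercase digits is an assumption, and `buildTracingOptions` is a hypothetical helper that throws on its own; it does not describe how the library reacts to invalid IDs.

```typescript
// Documented limits: traceId is 1-32 hex characters, parentSpanId is 1-16.
// Case-insensitive matching is an assumption.
const TRACE_ID_PATTERN = /^[0-9a-f]{1,32}$/i;
const SPAN_ID_PATTERN = /^[0-9a-f]{1,16}$/i;

// Local stand-in for a subset of the documented tracing options.
type TracingOptionsLike = {
  traceId?: string;
  parentSpanId?: string;
  tags?: string[];
  metadata?: Record<string, unknown>;
};

// Hypothetical helper: rejects malformed IDs, otherwise returns a copy.
function buildTracingOptions(input: TracingOptionsLike): TracingOptionsLike {
  if (input.traceId !== undefined && !TRACE_ID_PATTERN.test(input.traceId)) {
    throw new Error(`Invalid traceId: ${input.traceId}`);
  }
  if (input.parentSpanId !== undefined && !SPAN_ID_PATTERN.test(input.parentSpanId)) {
    throw new Error(`Invalid parentSpanId: ${input.parentSpanId}`);
  }
  return { ...input };
}

// Usage sketch (assumes `agent` exists):
// await agent.generate("Hello!", {
//   tracingOptions: buildTracingOptions({ traceId: "4bf92f3577b34da6", tags: ["checkout"] }),
// });
```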

includeRawChunks?:

boolean
Whether to include raw chunks in the stream output. Not available on all model providers.

Returns

result:

Awaited<ReturnType<MastraModelOutput<Output>['getFullOutput']>>
Returns the full output of the generation process including text, object (if structured output), tool calls, tool results, usage statistics, and step information.

text:

string
The generated text response from the agent.

object?:

Output | undefined
The structured output object if structuredOutput was provided, validated against the schema.

toolCalls:

ToolCall[]
Array of tool calls made during generation.

toolResults:

ToolResult[]
Array of results from tool executions.

usage:

TokenUsage
Token usage statistics for the generation.

steps:

Step[]
Array of execution steps, useful for debugging multi-step generations.

finishReason:

string
The reason generation finished. Values include 'stop' (normal completion), 'tool-calls' (ended with tool calls), 'suspended' (waiting for tool approval), or 'error' (error occurred).
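Callers typically branch on `finishReason` before using the result. A minimal sketch over the four values listed above; the action labels and `nextActionFor` are hypothetical, and the default branch exists because the list is described as non-exhaustive.

```typescript
// Hypothetical labels for what the caller might do next.
type NextAction =
  | "use-response"
  | "inspect-tool-calls"
  | "await-approval"
  | "handle-error"
  | "inspect-manually";

function nextActionFor(finishReason: string): NextAction {
  switch (finishReason) {
    case "stop":
      return "use-response"; // normal completion
    case "tool-calls":
      return "inspect-tool-calls"; // generation ended with tool calls
    case "suspended":
      return "await-approval"; // waiting for tool approval
    case "error":
      return "handle-error"; // check result.error
    default:
      return "inspect-manually"; // any value not listed above
  }
}
```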

response:

object
Response metadata from the model provider. Useful for accessing rate limit headers and request IDs.

id?:

string
Response ID from the model provider.

timestamp?:

Date
Timestamp when the response was generated.

modelId?:

string
Model identifier used for this response.

headers?:

Record<string, string>
HTTP response headers from the model provider. Contains rate limit information (e.g., `anthropic-ratelimit-requests-remaining`, `x-ratelimit-remaining-tokens`) and other provider-specific metadata.

messages?:

ResponseMessage[]
Response messages in model format.

uiMessages?:

UIMessage[]
Response messages in UI format, includes any metadata added by output processors.
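Header values in `response.headers` arrive as strings, so rate limit counters need parsing before comparison. The sketch below assumes lowercase header names as shown in the examples on this page; `readRemaining` and `shouldThrottle` are hypothetical helpers, and the threshold is arbitrary.

```typescript
type HeadersLike = Record<string, string> | undefined;

// Parses a numeric header; returns undefined when it is missing or not a number.
function readRemaining(headers: HeadersLike, name: string): number | undefined {
  const raw = headers?.[name];
  if (raw === undefined) return undefined;
  const parsed = Number.parseInt(raw, 10);
  return Number.isNaN(parsed) ? undefined : parsed;
}

// True only when the counter is known and at or below the threshold.
function shouldThrottle(headers: HeadersLike, name: string, threshold: number): boolean {
  const remaining = readRemaining(headers, name);
  return remaining !== undefined && remaining <= threshold;
}

// Usage sketch (assumes `result` came from agent.generate()):
// if (shouldThrottle(result.response?.headers, "x-ratelimit-remaining-tokens", 1000)) {
//   // back off before sending the next request
// }
```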

request?:

object
The request that was sent to the model.

body?:

unknown
The request body sent to the model provider.

warnings?:

LanguageModelWarning[]
Any warnings from the model provider during generation.

providerMetadata?:

Record<string, unknown>
Provider-specific metadata returned with the response.

reasoning?:

ReasoningChunk[]
Reasoning details from models that support reasoning (e.g., OpenAI o1 series).

reasoningText?:

string
Combined reasoning text from reasoning models.

sources?:

SourceChunk[]
Sources referenced by the model during generation.

files?:

FileChunk[]
Files generated by the model.

suspendPayload?:

object
Present when `finishReason` is 'suspended'. Contains tool call details needed to approve or decline the pending tool call.

toolCallId:

string
Unique identifier for the pending tool call.

toolName:

string
Name of the tool that requires approval.

args:

Record<string, any>
Arguments that will be passed to the tool.

runId?:

string
Unique identifier for this execution run. Required when calling `approveToolCallGenerate()` or `declineToolCallGenerate()` to resume a suspended execution.

traceId?:

string
The trace ID associated with this execution when Tracing is enabled. Use this to correlate logs and debug execution flow.

messages:

MastraDBMessage[]
All messages from this execution including input, memory history, and response.

rememberedMessages:

MastraDBMessage[]
Only messages loaded from memory (conversation history).

error?:

Error
Error object if the generation failed.

tripwire?:

StepTripwireData
Tripwire data if content was blocked by a processor.

scoringData?:

object
Scoring data for evals when `returnScorerData` is enabled.