Agent.generate()
The .generate() method runs an agent to completion and returns the full response at once instead of streaming it. It accepts messages and optional generation options.
Usage example
import { z } from 'zod'

// Basic usage
const result = await agent.generate('message for agent')

// With model settings (e.g., limiting output tokens)
const limitedResult = await agent.generate('Write a short poem about coding', {
  modelSettings: {
    maxOutputTokens: 50,
    temperature: 0.7,
  },
})

// With structured output (uses the zod import above)
const structuredResult = await agent.generate("Extract the user's name and age", {
  structuredOutput: {
    schema: z.object({
      name: z.string(),
      age: z.number(),
    }),
  },
})

// With memory for conversation persistence
const memoryResult = await agent.generate('Remember my favorite color is blue', {
  memory: {
    thread: 'user-123-thread',
    resource: 'user-123',
  },
})

// Accessing response headers (separate variable name, since `result` is already declared above)
const headerResult = await agent.generate('Hello!')
const remainingRequests = headerResult.response?.headers?.['anthropic-ratelimit-requests-remaining']
const remainingTokens = headerResult.response?.headers?.['x-ratelimit-remaining-tokens']
console.log(`Remaining requests: ${remainingRequests}, Remaining tokens: ${remainingTokens}`)
Model Compatibility: This method requires AI SDK v5+ models. If you're using AI SDK v4 models, use the .generateLegacy() method instead. The framework automatically detects your model version and will throw an error if there's a mismatch.
Parameters
messages:
string | string[] | CoreMessage[] | AiMessageType[] | UIMessageWithMetadata[]
The messages to send to the agent. Can be a single string, array of strings, or structured message objects.
options?:
AgentExecutionOptions<Output, Format>
Optional configuration for the generation process.
AgentExecutionOptions<Output, Format>
maxSteps?:
number
Maximum number of steps to run during execution.
stopWhen?:
LoopOptions['stopWhen']
Conditions for stopping execution (e.g., step count, token limit).
onIterationComplete?:
(context: IterationCompleteContext) => { continue?: boolean; feedback?: string } | void | Promise<{ continue?: boolean; feedback?: string } | void>
Callback function called after each iteration completes. Use this to monitor progress, provide feedback to guide the agent, or stop execution early. The callback receives context about the iteration including the current text, tool calls, and finish reason.
IterationCompleteContext
context.iteration:
number
Current iteration number (1-based).
context.maxIterations:
number | undefined
Maximum iterations allowed (if set).
context.text:
string
The text response from this iteration.
context.isFinal:
boolean
Whether this is the final iteration.
context.finishReason:
string
Reason why this iteration finished (e.g., 'stop', 'length', 'tool-calls').
context.toolCalls:
ToolCall[]
Tool calls made in this iteration.
context.messages:
MastraDBMessage[]
All messages accumulated so far.
return.continue?:
boolean
Set to false to stop execution early.
return.feedback?:
string
Feedback message to guide the agent's next iteration.
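As an illustration of the callback contract described above, here is a minimal sketch of an onIterationComplete handler. The `IterationContextSketch` interface is a local stand-in that mirrors only some of the documented context fields, and the "DONE" marker policy is purely hypothetical; adapt both to your own agent.

```typescript
// Local mirror of a subset of the documented IterationCompleteContext fields.
// The real type ships with the framework; this subset is an assumption.
interface IterationContextSketch {
  iteration: number
  maxIterations?: number
  text: string
  isFinal: boolean
  finishReason: string
}

type IterationDecision = { continue?: boolean; feedback?: string } | undefined

// Hypothetical policy: stop once the text contains "DONE",
// and ask for a shorter answer when the model ran out of tokens.
function reviewIteration(context: IterationContextSketch): IterationDecision {
  if (context.isFinal) return undefined // nothing left to steer
  if (context.text.includes('DONE')) return { continue: false }
  if (context.finishReason === 'length') {
    return { feedback: 'The previous answer was cut off. Be more concise.' }
  }
  return { continue: true }
}

// Usage, assuming an existing `agent`:
// await agent.generate('Summarize the report', { onIterationComplete: reviewIteration })
```

Returning `undefined` leaves the loop untouched, so the handler only intervenes in the two cases it explicitly checks for.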
isTaskComplete?:
IsTaskCompleteConfig
Task completion scoring configuration that validates whether the task is complete. Uses Mastra's evaluation scorers to automatically check if the agent's response satisfies the completion criteria.
IsTaskCompleteConfig
scorers:
MastraScorer[]
Array of scorers that evaluate task completion. Each scorer returns 0 (failed) or 1 (passed).
strategy?:
'all' | 'any'
Strategy for combining scorer results. 'all' requires all scorers to pass, 'any' requires at least one.
onComplete?:
(result: IsTaskCompleteRunResult) => void | Promise<void>
Callback called when the task completion check finishes. Receives the result with individual scorer scores.
parallel?:
boolean
Whether to run scorers in parallel.
timeout?:
number
Maximum time in milliseconds to wait for all scorers to complete.
delegation?:
DelegationConfig
Configuration for subagent delegation. Use this to control and monitor when the agent delegates tasks to other agents: you can modify or reject a delegation and provide feedback to guide the supervisor.
DelegationConfig
onDelegationStart?:
(context: DelegationStartContext) => DelegationStartResult | void | Promise<DelegationStartResult | void>
Called before delegating to a subagent. Use this to modify the delegation parameters or reject the delegation entirely.
onDelegationComplete?:
(context: DelegationCompleteContext) => { feedback?: string } | void | Promise<{ feedback?: string } | void>
Called after a subagent delegation completes. The context includes a `bail()` method to stop further execution, and you can return `{ feedback }` to guide the supervisor's next action. Feedback is saved to supervisor memory as an assistant message.
messageFilter?:
(context: MessageFilterContext) => MastraDBMessage[] | Promise<MastraDBMessage[]>
Callback function called before delegating to a subagent. Use this to filter the messages that are passed to the subagent.
scorers?:
MastraScorers | Record<string, { scorer: MastraScorer['name']; sampling?: ScoringSamplingConfig }>
Evaluation scorers to run on the execution results.
MastraScorers | Record<string, { scorer: MastraScorer['name']; sampling?: ScoringSamplingConfig }>
scorer:
string
Name of the scorer to use.
sampling?:
ScoringSamplingConfig
Sampling configuration for the scorer.
ScoringSamplingConfig
type:
'none' | 'ratio'
Type of sampling strategy. Use 'none' to disable sampling or 'ratio' for percentage-based sampling.
rate?:
number
Sampling rate (0-1). Required when type is 'ratio'.
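The sampling rules above (a rate between 0 and 1, required only for 'ratio') can be checked before a config is passed in. The sketch below is a hypothetical helper, not part of the framework; `SamplingSketch` is a local stand-in for ScoringSamplingConfig.

```typescript
// Local stand-in for the documented ScoringSamplingConfig shape.
interface SamplingSketch {
  type: 'none' | 'ratio'
  rate?: number
}

// Returns a list of problems; an empty list means the config
// matches the rules stated in the parameter descriptions.
function checkSampling(sampling: SamplingSketch): string[] {
  const problems: string[] = []
  if (sampling.type === 'ratio') {
    if (sampling.rate === undefined) {
      problems.push("rate is required when type is 'ratio'")
    } else if (!(sampling.rate >= 0 && sampling.rate <= 1)) {
      problems.push('rate must be between 0 and 1')
    }
  }
  return problems
}

// Usage, assuming a scorer registered under the hypothetical name 'relevancy':
// await agent.generate('Hello', {
//   scorers: { relevancy: { scorer: 'relevancy', sampling: { type: 'ratio', rate: 0.25 } } },
// })
```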
returnScorerData?:
boolean
Whether to return detailed scoring data in the response.
onChunk?:
(chunk: ChunkType) => Promise<void> | void
Callback function called for each chunk during generation.
onError?:
({ error }: { error: Error | string }) => Promise<void> | void
Callback function called when an error occurs during generation.
onAbort?:
(event: any) => Promise<void> | void
Callback function called when the generation is aborted.
activeTools?:
Array<keyof ToolSet> | undefined
Array of tool names that should be active during execution. If undefined, all available tools are active.
abortSignal?:
AbortSignal
Signal object that allows you to abort the agent's execution. When the signal is aborted, all ongoing operations will be terminated.
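A common way to use abortSignal is to cap how long a call may run. The sketch below relies only on the standard AbortController; `timeoutSignal` is a hypothetical helper name, and the commented usage assumes an existing `agent`.

```typescript
// Creates a signal that aborts after `ms` milliseconds.
// `abort` triggers it immediately; `cancel` clears the pending timer.
function timeoutSignal(ms: number) {
  const controller = new AbortController()
  const timer = setTimeout(() => controller.abort(), ms)
  return {
    signal: controller.signal,
    abort: () => {
      clearTimeout(timer)
      controller.abort()
    },
    cancel: () => clearTimeout(timer),
  }
}

// Usage, assuming an existing `agent`:
// const { signal, cancel } = timeoutSignal(30_000)
// try {
//   await agent.generate('Draft a long report', { abortSignal: signal })
// } finally {
//   cancel() // avoid leaving the timer running after the call settles
// }
```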
prepareStep?:
PrepareStepFunction
Callback function called before each step of multi-step execution.
requireToolApproval?:
boolean
When true, all tool calls require explicit approval before execution. The generate() method will return with `finishReason: 'suspended'` and include a `suspendPayload` with tool call details (`toolCallId`, `toolName`, `args`). Use `approveToolCallGenerate()` or `declineToolCallGenerate()` to proceed. See [Agent Approval](/docs/agents/agent-approval#tool-approval-with-generate) for details.
autoResumeSuspendedTools?:
boolean
When true, automatically resumes suspended tools when the user sends a new message on the same thread. The agent extracts `resumeData` from the user's message based on the tool's `resumeSchema`. Requires memory to be configured.
toolCallConcurrency?:
number
Maximum number of tool calls to execute concurrently. Defaults to 1 when approval may be required, otherwise 10.
context?:
ModelMessage[]
Additional context messages to provide to the agent.
structuredOutput?:
StructuredOutputOptions<S extends ZodTypeAny = ZodTypeAny>
Options to fine-tune structured output generation.
StructuredOutputOptions<S extends ZodTypeAny = ZodTypeAny>
schema:
z.ZodSchema<S>
Zod schema defining the expected output structure.
model?:
MastraLanguageModel
Language model to use for structured output generation. If provided, the agent can respond over multiple steps with tool calls, text, and structured output.
errorStrategy?:
'strict' | 'warn' | 'fallback'
Strategy for handling schema validation errors. 'strict' throws errors, 'warn' logs warnings, 'fallback' uses fallback values.
fallbackValue?:
<S extends ZodTypeAny>
Fallback value to use when schema validation fails and errorStrategy is 'fallback'.
instructions?:
string
Additional instructions for the structured output model.
jsonPromptInjection?:
boolean
Injects a system prompt into the main agent instructing it to return structured output. Useful when a model does not natively support structured outputs.
logger?:
IMastraLogger
Optional logger instance for structured logging during output generation.
providerOptions?:
ProviderOptions
Provider-specific options passed to the internal structuring agent. Use this to control model behavior like reasoning effort for thinking models (e.g., `{ openai: { reasoningEffort: 'low' } }`).
outputProcessors?:
OutputProcessorOrWorkflow[]
Output processors to use for this execution (overrides agent's default).
maxProcessorRetries?:
number
Maximum number of times processors can trigger a retry for this generation. Overrides agent's default maxProcessorRetries.
inputProcessors?:
InputProcessorOrWorkflow[]
Input processors to use for this execution (overrides agent's default).
instructions?:
string | string[] | CoreSystemMessage | SystemModelMessage | CoreSystemMessage[] | SystemModelMessage[]
Custom instructions that override the agent's default instructions for this execution. Can be a single string, message object, or array of either.
system?:
string | string[] | CoreSystemMessage | SystemModelMessage | CoreSystemMessage[] | SystemModelMessage[]
Custom system message(s) to include in the prompt. Can be a single string, message object, or array of either. System messages provide additional context or behavior instructions that supplement the agent's main instructions.
output?:
Zod schema | JsonSchema7
**Deprecated.** Use structuredOutput without a model to achieve the same thing. Defines the expected structure of the output. Can be a JSON Schema object or a Zod schema.
memory?:
object
Memory configuration for conversation persistence and retrieval.
object
thread:
string | { id: string; metadata?: Record<string, any>, title?: string }
Thread identifier for conversation continuity. Can be a string ID or an object with ID and optional metadata/title.
resource:
string
Resource identifier for organizing conversations by user, session, or context.
options?:
MemoryConfig
Additional memory configuration options including lastMessages, readOnly, semanticRecall, and workingMemory.
onFinish?:
LoopConfig['onFinish']
Callback fired when generation completes.
onStepFinish?:
LoopConfig['onStepFinish']
Callback fired after each generation step.
telemetry?:
TelemetrySettings
Settings for OTLP telemetry collection during generation (not Tracing).
TelemetrySettings
isEnabled?:
boolean
Whether telemetry collection is enabled.
recordInputs?:
boolean
Whether to record input data in telemetry.
recordOutputs?:
boolean
Whether to record output data in telemetry.
functionId?:
string
Identifier for the function being executed.
modelSettings?:
CallSettings
Model-specific settings like temperature, maxOutputTokens, topP, etc. These settings control how the language model generates responses.
temperature?:
number
Controls randomness in generation (0-2). Higher values make output more random.
maxOutputTokens?:
number
Maximum number of tokens to generate in the response. Note: Use maxOutputTokens (not maxTokens) as per AI SDK v5 convention.
maxRetries?:
number
Maximum number of retry attempts for failed requests.
topP?:
number
Nucleus sampling parameter (0-1). Controls diversity of generated text.
topK?:
number
Top-k sampling parameter. Limits vocabulary to k most likely tokens.
presencePenalty?:
number
Penalty for token presence (-2 to 2). Reduces repetition.
frequencyPenalty?:
number
Penalty for token frequency (-2 to 2). Reduces repetition of frequent tokens.
stopSequences?:
string[]
Stop sequences. If set, the model will stop generating text when one of the stop sequences is generated.
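The ranges listed for these settings can be checked up front. This is a minimal sketch using only the ranges stated above; `ModelSettingsSketch` and `outOfRange` are hypothetical names, and individual providers may accept narrower or wider ranges.

```typescript
// Local stand-in covering the numeric settings with documented ranges.
interface ModelSettingsSketch {
  temperature?: number
  topP?: number
  presencePenalty?: number
  frequencyPenalty?: number
}

// [setting, minimum, maximum] as given in the parameter descriptions.
const documentedRanges: Array<[keyof ModelSettingsSketch, number, number]> = [
  ['temperature', 0, 2],
  ['topP', 0, 1],
  ['presencePenalty', -2, 2],
  ['frequencyPenalty', -2, 2],
]

// Returns the names of settings that are set but fall outside their range.
function outOfRange(settings: ModelSettingsSketch): string[] {
  return documentedRanges
    .filter(([key, min, max]) => {
      const value = settings[key]
      return value !== undefined && !(value >= min && value <= max)
    })
    .map(([key]) => key)
}

// Usage, assuming an existing `agent`:
// const modelSettings = { temperature: 0.7, topP: 0.9 }
// if (outOfRange(modelSettings).length === 0) {
//   await agent.generate('Write a haiku', { modelSettings })
// }
```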
toolChoice?:
'auto' | 'none' | 'required' | { type: 'tool'; toolName: string }
Controls how tools are selected during generation.
'auto' | 'none' | 'required' | { type: 'tool'; toolName: string }
'auto':
string
Let the model decide when to use tools (default).
'none':
string
Disable tool usage entirely.
'required':
string
Force the model to use at least one tool.
{ type: 'tool'; toolName: string }:
object
Force the model to use a specific tool.
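To make the four toolChoice forms concrete, the sketch below builds a value from an optional tool name. `ToolChoiceSketch` mirrors the documented union locally, and `toolChoiceFor` and the tool name 'weatherTool' are hypothetical.

```typescript
// Local mirror of the documented toolChoice union.
type ToolChoiceSketch = 'auto' | 'none' | 'required' | { type: 'tool'; toolName: string }

// Forces a specific tool when a name is given; otherwise lets the model decide.
function toolChoiceFor(toolName?: string): ToolChoiceSketch {
  return toolName ? { type: 'tool', toolName } : 'auto'
}

// Usage, assuming an existing `agent` with a tool registered as 'weatherTool':
// await agent.generate('What is the weather in Oslo?', {
//   toolChoice: toolChoiceFor('weatherTool'),
// })
```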
toolsets?:
ToolsetsInput
Additional tool sets that can be used for this execution.
clientTools?:
ToolsInput
Client-side tools available during execution.
savePerStep?:
boolean
Save messages incrementally after each generation step completes (default: false).
providerOptions?:
Record<string, Record<string, JSONValue>>
Provider-specific options passed to the language model.
Record<string, Record<string, JSONValue>>
openai?:
Record<string, JSONValue>
OpenAI-specific options like reasoningEffort, responseFormat, etc.
anthropic?:
Record<string, JSONValue>
Anthropic-specific options like maxTokens, etc.
google?:
Record<string, JSONValue>
Google-specific options.
[providerName]?:
Record<string, JSONValue>
Any provider-specific options.
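Because providerOptions is keyed by provider name, shared defaults and per-call overrides can be combined one provider at a time. The following is a sketch of one way to do that; `mergeProviderOptions` is a hypothetical helper, and `JSONValue` is redeclared locally so the block is self-contained.

```typescript
// Local definition of a JSON-compatible value.
type JSONValue = null | string | number | boolean | JSONValue[] | { [key: string]: JSONValue }
type ProviderOptionsSketch = Record<string, Record<string, JSONValue>>

// Merges options per provider; keys in `override` win over keys in `base`.
function mergeProviderOptions(
  base: ProviderOptionsSketch,
  override: ProviderOptionsSketch,
): ProviderOptionsSketch {
  const merged: ProviderOptionsSketch = { ...base }
  for (const [provider, options] of Object.entries(override)) {
    merged[provider] = { ...merged[provider], ...options }
  }
  return merged
}

// Usage, assuming an existing `agent`:
// const defaults = { openai: { reasoningEffort: 'low' } }
// await agent.generate('Plan the migration', {
//   providerOptions: mergeProviderOptions(defaults, { openai: { reasoningEffort: 'high' } }),
// })
```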
runId?:
string
Unique identifier for this execution run.
requestContext?:
RequestContext
Request Context containing dynamic configuration and state.
tracingContext?:
TracingContext
Tracing context for creating child spans and adding metadata. Automatically injected when using Mastra's tracing system.
TracingContext
currentSpan?:
Span
Current span for creating child spans and adding metadata. Use this to create custom child spans or update span attributes during execution.
tracingOptions?:
TracingOptions
Options for Tracing configuration.
TracingOptions
metadata?:
Record<string, any>
Metadata to add to the root trace span. Useful for adding custom attributes like user IDs, session IDs, or feature flags.
requestContextKeys?:
string[]
Additional RequestContext keys to extract as metadata for this trace. Supports dot notation for nested values (e.g., 'user.id').
traceId?:
string
Trace ID to use for this execution (1-32 hexadecimal characters). If provided, this trace will be part of the specified trace.
parentSpanId?:
string
Parent span ID to use for this execution (1-16 hexadecimal characters). If provided, the root span will be created as a child of this span.
tags?:
string[]
Tags to apply to this trace. String labels for categorizing and filtering traces.
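The ID formats stated above (1-32 hexadecimal characters for traceId, 1-16 for parentSpanId) can be validated before building the options object. This is a minimal sketch; `tracingOptionsFor` is a hypothetical helper and the example IDs are made up.

```typescript
const TRACE_ID_PATTERN = /^[0-9a-f]{1,32}$/i
const SPAN_ID_PATTERN = /^[0-9a-f]{1,16}$/i

// Builds a tracingOptions fragment, rejecting IDs that do not match the documented formats.
function tracingOptionsFor(traceId: string, parentSpanId?: string) {
  if (!TRACE_ID_PATTERN.test(traceId)) {
    throw new Error('traceId must be 1-32 hexadecimal characters')
  }
  if (parentSpanId !== undefined && !SPAN_ID_PATTERN.test(parentSpanId)) {
    throw new Error('parentSpanId must be 1-16 hexadecimal characters')
  }
  return parentSpanId === undefined ? { traceId } : { traceId, parentSpanId }
}

// Usage, assuming an existing `agent` and IDs taken from an upstream request:
// await agent.generate('Hello', {
//   tracingOptions: { ...tracingOptionsFor('4bf92f3577b34da6', '00f067aa0ba902b7'), tags: ['checkout'] },
// })
```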
includeRawChunks?:
boolean
Whether to include raw chunks in the stream output. Not available on all model providers.
Returns
result:
Awaited<ReturnType<MastraModelOutput<Output>['getFullOutput']>>
Returns the full output of the generation process including text, object (if structured output), tool calls, tool results, usage statistics, and step information.
text:
string
The generated text response from the agent.
object?:
Output | undefined
The structured output object if structuredOutput was provided, validated against the schema.
toolCalls:
ToolCall[]
Array of tool calls made during generation.
toolResults:
ToolResult[]
Array of results from tool executions.
usage:
TokenUsage
Token usage statistics for the generation.
steps:
Step[]
Array of execution steps, useful for debugging multi-step generations.
finishReason:
string
The reason generation finished. Values include 'stop' (normal completion), 'tool-calls' (ended with tool calls), 'suspended' (waiting for tool approval), or 'error' (error occurred).
response:
object
Response metadata from the model provider. Useful for accessing rate limit headers and request IDs.
object
id?:
string
Response ID from the model provider.
timestamp?:
Date
Timestamp when the response was generated.
modelId?:
string
Model identifier used for this response.
headers?:
Record<string, string>
HTTP response headers from the model provider. Contains rate limit information (e.g., `anthropic-ratelimit-requests-remaining`, `x-ratelimit-remaining-tokens`) and other provider-specific metadata.
messages?:
ResponseMessage[]
Response messages in model format.
uiMessages?:
UIMessage[]
Response messages in UI format, includes any metadata added by output processors.
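Header values arrive as strings, so rate-limit counters usually need parsing before they can be compared. The sketch below shows one way to do that; `remainingFromHeaders` is a hypothetical helper, and it assumes header names are matched exactly as the provider returns them.

```typescript
// Reads a numeric header value, returning undefined when it is absent or not a number.
function remainingFromHeaders(
  headers: Record<string, string> | undefined,
  name: string,
): number | undefined {
  const raw = headers?.[name]
  if (raw === undefined) return undefined
  const parsed = Number.parseInt(raw, 10)
  return Number.isNaN(parsed) ? undefined : parsed
}

// Usage, assuming `result` came from agent.generate():
// const tokensLeft = remainingFromHeaders(result.response?.headers, 'x-ratelimit-remaining-tokens')
// if (tokensLeft !== undefined && tokensLeft < 1000) {
//   console.warn('Approaching the token rate limit')
// }
```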
request?:
object
The request that was sent to the model.
object
body?:
unknown
The request body sent to the model provider.
warnings?:
LanguageModelWarning[]
Any warnings from the model provider during generation.
providerMetadata?:
Record<string, unknown>
Provider-specific metadata returned with the response.
reasoning?:
ReasoningChunk[]
Reasoning details from models that support reasoning (e.g., OpenAI o1 series).
reasoningText?:
string
Combined reasoning text from reasoning models.
sources?:
SourceChunk[]
Sources referenced by the model during generation.
files?:
FileChunk[]
Files generated by the model.
suspendPayload?:
object
Present when `finishReason` is 'suspended'. Contains tool call details needed to approve or decline the pending tool call.
object
toolCallId:
string
Unique identifier for the pending tool call.
toolName:
string
Name of the tool that requires approval.
args:
Record<string, any>
Arguments that will be passed to the tool.
runId?:
string
Unique identifier for this execution run. Required when calling `approveToolCallGenerate()` or `declineToolCallGenerate()` to resume a suspended execution.
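When tool approval is enabled, a caller typically checks finishReason and collects the identifiers needed to resume. The sketch below extracts them from a result; `GenerateResultSketch` mirrors only the documented fields used here, `pendingApproval` is a hypothetical helper, and the argument shape shown for `approveToolCallGenerate()` in the usage comment is an assumption to confirm against the Agent Approval guide.

```typescript
// Local mirror of the result fields relevant to a suspended run.
interface GenerateResultSketch {
  finishReason: string
  runId?: string
  suspendPayload?: { toolCallId: string; toolName: string; args: Record<string, unknown> }
}

interface PendingApproval {
  runId: string
  toolCallId: string
  toolName: string
}

// Returns the identifiers needed to approve or decline, or undefined if nothing is pending.
function pendingApproval(result: GenerateResultSketch): PendingApproval | undefined {
  if (result.finishReason !== 'suspended') return undefined
  if (!result.runId || !result.suspendPayload) return undefined
  const { toolCallId, toolName } = result.suspendPayload
  return { runId: result.runId, toolCallId, toolName }
}

// Usage, assuming an existing `agent` (argument shape below is an assumption):
// const result = await agent.generate('Delete the old records', { requireToolApproval: true })
// const pending = pendingApproval(result)
// if (pending) {
//   await agent.approveToolCallGenerate({ runId: pending.runId, toolCallId: pending.toolCallId })
// }
```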
traceId?:
string
The trace ID associated with this execution when Tracing is enabled. Use this to correlate logs and debug execution flow.
messages:
MastraDBMessage[]
All messages from this execution including input, memory history, and response.
rememberedMessages:
MastraDBMessage[]
Only messages loaded from memory (conversation history).
error?:
Error
Error object if the generation failed.
tripwire?:
StepTripwireData
Tripwire data if content was blocked by a processor.
scoringData?:
object
Scoring data for evals when `returnScorerData` is enabled.
Related
- Agent Networks - Using the supervisor pattern for multi-agent coordination
- Migration: .network() to Supervisor Pattern
- Guide: Research Coordinator