Mastra Changelog 2026-03-11

Dynamic model fallback arrays, Standard Schema normalization with Zod v4 compatibility, customizable request validation for server adapters, and end-to-end RequestContext support for tracing and datasets/experiments.

Shane Thomas · Mar 11, 2026

We’ve been busy smoothing out a bunch of sharp edges in agent routing, schema compatibility, server ergonomics, and storage performance. If you run Mastra in production, you should feel this one immediately, both in flexibility and in latency.

Release: @mastra/core@1.11.0

We prepared automated codemods for most breaking changes. Run all v1 codemods at once:

npx @mastra/codemod@latest v1

See the migration guide for detailed instructions.

Let’s dive in:

Dynamic Model Fallback Arrays (Runtime Routing)

Agents already supported static fallbacks and dynamic model selection, but now you can return a full fallback array from a model function. That means routing decisions can be made at runtime with the same fidelity you’d normally hardcode, including per-model retry configs, tier-based lineups, region-specific variants, and even nested dynamic model selection.

This is especially useful when you want a single agent definition that adapts to the request, for example premium users get stronger models first, EU traffic goes to EU deployments first, or enterprise traffic gets higher retry budgets.

Here’s a basic tier-based example:

const agent = new Agent({
  model: ({ requestContext }) => {
    const tier = requestContext.get('tier');

    if (tier === 'premium') {
      return [
        { model: 'openai/gpt-4', maxRetries: 2 },
        { model: 'anthropic/claude-3-opus', maxRetries: 1 },
      ];
    }

    return [{ model: 'openai/gpt-3.5-turbo', maxRetries: 1 }];
  },
});

You can also mix static fallbacks with nested dynamics, where each element in the returned array can itself be dynamic:

const agent = new Agent({
  model: ({ requestContext }) => {
    const region = requestContext.get('region');

    return [
      {
        // each element can itself be a dynamic model function
        model: () => (region === 'eu' ? 'openai/gpt-4-eu' : 'openai/gpt-4'),
        maxRetries: 2,
      },
      { model: 'anthropic/claude-3-opus', maxRetries: 1 },
    ];
  },
  maxRetries: 1, // default for models that do not specify maxRetries
});

Async selection works too, which is handy when routing depends on a database lookup or feature flag evaluation:

const agent = new Agent({
  model: async ({ requestContext }) => {
    const userId = requestContext.get('userId');
    const user = await db.users.findById(userId);

    if (user.tier === 'enterprise') {
      return [
        { model: 'openai/gpt-4', maxRetries: 3 },
        { model: 'anthropic/claude-3-opus', maxRetries: 2 },
      ];
    }

    return [{ model: 'openai/gpt-3.5-turbo', maxRetries: 1 }];
  },
});

Under the hood, arrays are normalized into the internal fallback representation, empty arrays are rejected early (so you fail fast rather than at generation time), and models without an explicit maxRetries correctly inherit the agent-level maxRetries.
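That normalization step can be sketched in plain TypeScript. This is illustrative only, not Mastra's actual internals; the type and function names here are assumptions:

```typescript
// Illustrative sketch of fallback-array normalization: a bare string becomes
// a single-entry lineup, empty arrays fail fast, and entries without an
// explicit maxRetries inherit the agent-level default.
type ModelEntry = { model: string; maxRetries?: number };

function normalizeFallbacks(
  result: string | ModelEntry[],
  agentMaxRetries: number,
): Required<ModelEntry>[] {
  const entries = typeof result === 'string' ? [{ model: result }] : result;
  if (entries.length === 0) {
    // Reject early so misconfiguration surfaces before generation time
    throw new Error('Model function returned an empty fallback array');
  }
  return entries.map((e) => ({
    model: e.model,
    maxRetries: e.maxRetries ?? agentMaxRetries,
  }));
}
```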

If you want to read more about configuring fallbacks, see “Model fallbacks” in the Models section of the Mastra documentation: https://mastra.ai/docs/.

(PR #11975)

Standard Schema + Zod v4 Compatibility Layer

Schema handling gets complicated fast when you accept multiple schema types across tool inputs, structured output, provider strict modes, and different Zod versions. Mastra now normalizes schemas through a Standard Schema layer, so you can pass Zod v3, Zod v4, AI SDK Schema, or JSON Schema and get consistent behavior.

The key pieces are:

  • toStandardSchema() to normalize supported schema inputs into a shared representation
  • standardSchemaToJSONSchema() to convert reliably to JSON Schema (particularly important for strict-mode providers)

This helps eliminate the “works in one provider, fails in another” class of issues, and it improves compatibility with strict schema validation modes, including fixes for required arrays and optional/default/nullish fields.

Mastra exposes this via the core schema module that re-exports from @mastra/schema-compat. If you’re building integrations that need to normalize incoming schemas or produce JSON Schema for providers, this gives you a single path to follow.
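To make the strict-mode concern concrete: strict-validation providers generally expect every property to appear in `required`, so an optional field has to be expressed as nullable rather than simply left out of `required`. The sketch below is a hand-rolled illustration of that transformation, not what `standardSchemaToJSONSchema()` actually does internally:

```typescript
// Illustrative sketch: emit a strict-mode-friendly JSON Schema object where
// every property is listed in `required`, and optional fields are widened to
// also accept null instead of being dropped from `required`.
type Prop = { type: string; optional?: boolean };

function toStrictObjectSchema(props: Record<string, Prop>) {
  const properties: Record<string, { type: string | string[] }> = {};
  for (const [name, p] of Object.entries(props)) {
    properties[name] = p.optional ? { type: [p.type, 'null'] } : { type: p.type };
  }
  return {
    type: 'object',
    properties,
    required: Object.keys(props), // strict mode: list every key
    additionalProperties: false,
  };
}
```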

(PR #12238)

Customizable Request Validation Errors Across All Server Adapters

When Zod validation fails, the default behavior is a simple 400. That is fine for internal services, but many teams want richer error bodies, different status codes (often 422), or a consistent envelope shared across all APIs.

There’s a new onValidationError hook available both globally on ServerConfig and per-route via createRoute(). It works across all supported server adapters (Hono, Express, Fastify, Koa), so you can standardize error responses without special-casing each runtime.

Global example:

const mastra = new Mastra({
  server: {
    onValidationError: (error, context) => ({
      status: 422,
      body: {
        ok: false,
        errors: error.issues.map((i) => ({
          path: i.path.join('.'),
          message: i.message,
        })),
        source: context,
      },
    }),
  },
});

This makes it easier to:

  • keep API responses consistent across services
  • return client-friendly validation errors with structured paths
  • attach request metadata about where validation failed (query, params, body)
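If you share one envelope across services, it helps to factor the issue-mapping into a plain helper you can pass to both the global hook and per-route overrides. The sketch below is dependency-free TypeScript; the `Issue` type just mirrors the shape of `ZodError.issues`:

```typescript
// Reusable mapper from Zod-style issues to a client-friendly error body with
// structured dot-joined paths. No Mastra or Zod imports needed.
type Issue = { path: (string | number)[]; message: string };

function toValidationBody(issues: Issue[], source: string) {
  return {
    ok: false,
    errors: issues.map((i) => ({ path: i.path.join('.'), message: i.message })),
    source, // where validation failed, e.g. 'query' | 'params' | 'body'
  };
}
```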

(PR #13477)

RequestContext End-to-End (Tracing + Datasets/Experiments + Storage)

RequestContext is the backbone for request-scoped metadata like tenant IDs, user IDs, regions, and feature flags. In this release, it becomes a first-class signal throughout the full lifecycle: execution, tracing, persistence, and evaluation.

RequestContext on tracing spans

Tracing spans now capture a snapshot of the active RequestContext, so when you inspect traces you can see the request-scoped values that shaped the run.

This context is also persisted to span tables in supported stores (ClickHouse, Postgres, LibSQL, MSSQL), which is critical if you use request attributes to filter traces, correlate incidents, or segment performance.
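“Snapshot” is the important word here: the span records the context values as they were at capture time, so later mutation of the live context doesn't rewrite what the trace shows. A minimal sketch of that behavior (illustrative, not Mastra internals):

```typescript
// Capture a point-in-time copy of a request context so later writes to the
// live context don't change what the span recorded.
function snapshotContext(ctx: Map<string, unknown>): Record<string, unknown> {
  return Object.fromEntries(ctx.entries());
}
```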

(PR #14020)

RequestContext on datasets and experiments

Datasets and experiments now support requestContext, which unlocks much more realistic evals:

  • dataset items can store the context that existed when the input was captured
  • datasets can declare a requestContextSchema describing the expected shape
  • experiments accept a global requestContext that is forwarded to agent.generate()
  • per-item request context merges with and overrides experiment-level context
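The merge rule in the last bullet is override semantics: item-level keys win over experiment-level keys. A shallow spread captures the idea (the actual merge depth in Mastra is an assumption here):

```typescript
// Per-item request context merges with, and overrides, experiment-level
// context. Shallow merge shown for illustration.
function effectiveContext(
  experimentCtx: Record<string, unknown>,
  itemCtx: Record<string, unknown> = {},
): Record<string, unknown> {
  return { ...experimentCtx, ...itemCtx };
}
```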

Example:

// Add item with request context
await dataset.addItem({
  input: messages,
  groundTruth: expectedOutput,
  requestContext: { userId: '123', locale: 'en' },
});

// Run experiment with global request context
await runExperiment(mastra, {
  datasetId: 'my-dataset',
  targetType: 'agent',
  targetId: 'my-agent',
  requestContext: { environment: 'staging' },
});

If you’re driving datasets via the API, @mastra/client-js also supports this end-to-end:

// Create a dataset with a request context schema
await client.createDataset({
  name: 'my-dataset',
  requestContextSchema: {
    type: 'object',
    properties: { region: { type: 'string' } },
  },
});

// Add an item with request context
await client.addDatasetItem({
  datasetId: 'my-dataset',
  input: { prompt: 'Hello' },
  requestContext: { region: 'us-east-1' },
});

// Trigger an experiment with request context forwarded to agent
await client.triggerDatasetExperiment({
  datasetId: 'my-dataset',
  agentId: 'my-agent',
  requestContext: { region: 'eu-west-1' },
});

This makes it much easier to evaluate “the same agent under different tenants/flags/regions” without duplicating agents or hardcoding config into prompts.

(PR #13938)

Faster & More Flexible Storage: Recall Performance + PgVector Indexing + New Vector Types

If you’ve ever had a long-running thread and noticed recall time creeping up, this batch targets that directly. Semantic recall is now significantly faster across multiple storage adapters, and Postgres in particular sees huge wins at large thread sizes.

Semantic recall performance improvements across adapters

Recall no longer degrades as threads grow, because adapters avoid loading entire threads when only recalled messages and nearby context are required. In Postgres, recall that could take ~30 seconds on 7,000+ message threads now completes in under 500ms.

These improvements landed across Postgres, LibSQL, Cloudflare D1, ClickHouse, Cloudflare, Convex, MongoDB, DynamoDB, Lance, Upstash, and MSSQL adapters.

(PR #14022)

PgVector metadataIndexes for fast filtered queries

PgVector now supports metadataIndexes in createIndex(), which creates btree indexes for specific metadata fields. This is a big deal for workloads that filter vectors by metadata such as thread_id and resource_id, since it avoids sequential scans under load.

await pgVector.createIndex({
  indexName: 'my_vectors',
  dimension: 1536,
  metadataIndexes: ['thread_id', 'resource_id'],
});

Memory uses this to index memory_messages (where filtering is common), making recall and retrieval much more predictable at scale.

(PR #14034)

New pgvector vector types: bit and sparsevec

@mastra/pg now supports pgvector’s bit and sparsevec storage types, which opens the door to different retrieval strategies:

  • binary vectors with vectorType: 'bit' (great for fast similarity with hamming/jaccard)
  • sparse vectors with vectorType: 'sparsevec' (useful for BM25/TF-IDF style representations)

// Binary vectors for fast similarity search
await db.createIndex({
  indexName: 'my_binary_index',
  dimension: 128,
  metric: 'hamming', // or 'jaccard'
  vectorType: 'bit',
});

// Sparse vectors for BM25/TF-IDF representations
await db.createIndex({
  indexName: 'my_sparse_index',
  dimension: 500,
  metric: 'cosine',
  vectorType: 'sparsevec',
});

Note that this requires pgvector >= 0.7.0.

(PR #12815)

Breaking Changes

Minimum Zod version bumped (PR #12238): Mastra now requires Zod ^3.25.0 (to stay on v3) or ^4.0.0 (to use v4). Update your dependency, and if you move to v4, fix any Zod v4 API differences in your codebase. Common changes include:

  • z.record() now requires the 2-argument form (key schema + value schema)
  • ZodError.errors is now ZodError.issues

Example adjustment:

// Before (older Zod forms)
const schema = z.record(z.string());

// After (Zod v4 compatible)
const schema = z.record(z.string(), z.string());

Other Notable Updates

  • Tool execution tracing via MCP server: Tool calls executed through the MCP server now show up in the Observability UI traces (PR #12804)
  • Processors get resolved generation data: processOutputResult now receives result (usage, text, steps, finishReason) as an OutputResult, replacing the need to interpret raw stream chunks (PR #13810)
  • Output processors can intercept custom data chunks: Set processDataParts = true to inspect or block writer.custom() chunks during streaming (PR #13823)
  • Transient streaming chunks: Mark custom streamed chunks as transient: true to stream them live but skip database persistence, reducing storage bloat (PR #13869)
  • Active tool enforcement at execution time: Tools not in activeTools are now rejected with ToolNotFoundError, not just omitted from prompts (PR #13949)
  • Improved tool-call argument robustness: Added JSON repair for malformed tool call arguments and sanitization for internal token suffixes so arguments are not silently dropped (PR #14033), (PR #13400)
  • Model fallback retry behavior fixed: Non-retryable errors (401/403) are no longer retried, and retry layering no longer multiplies calls, so per-model maxRetries now behaves as intended (PR #14039)
  • Server: experiment triggers fail fast on empty datasets: The API now returns HTTP 400 and marks experiments failed properly instead of leaving them pending (PR #14031)
  • Harness subagents tool visibility controls: HarnessSubagent can now restrict which workspace tools it can access via allowedWorkspaceTools (PR #13940)
  • ClickHouse/DB forward compatibility: Unknown columns are now dropped on insert/update to prevent SQL errors when code is ahead of migrations (PR #14021)
  • AI SDK stream options merge correctly: handleChatStream now merges providerOptions from params and defaults instead of replacing wholesale (PR #13820)
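The `providerOptions` fix in that last bullet boils down to merging per-provider option objects instead of replacing the whole map. A rough merge-don't-replace sketch (the one-level-deep merge is an assumption about the fix, not a copy of it):

```typescript
// Merge default providerOptions with request params, one level deep per
// provider, so params override individual keys without wiping the defaults.
type ProviderOptions = Record<string, Record<string, unknown>>;

function mergeProviderOptions(
  defaults: ProviderOptions,
  params: ProviderOptions,
): ProviderOptions {
  const merged: ProviderOptions = { ...defaults };
  for (const [provider, opts] of Object.entries(params)) {
    merged[provider] = { ...merged[provider], ...opts };
  }
  return merged;
}
```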

That’s all for @mastra/core@1.11.0!

Happy building! 🚀

Shane Thomas

Shane Thomas is the founder and CPO of Mastra. He co-hosts AI Agents Hour, a weekly show covering news and topics around AI agents. Previously, he was in product and engineering at Netlify and Gatsby. He created the first course as an MCP server and is kind of a musician.
