Mastra Changelog 2026-03-05

Tool input examples improve tool-call accuracy, MCP fetch hooks receive RequestContext, and reliability updates surface streaming errors, preserve history in stateless tool runs, and clean up orphaned vector embeddings.

Shane Thomas · Mar 5, 2026

We've been busy tightening up tool calling, MCP integrations, and a handful of reliability edges that show up when you run agents in production. If you care about fewer malformed tool calls, better auth forwarding, and more trustworthy streaming behavior, this one is for you.

Release: @mastra/core@1.10.0

Let's dive in:

Tool inputExamples to improve model tool-call accuracy

You can now attach concrete input examples to your tool definitions via inputExamples. Mastra passes these examples through to models that support them (for example, Anthropic’s input_examples) alongside your tool schema, helping the model learn what “valid” looks like.

In practice, this reduces the classic tool-calling issues: missing required keys, wrong enum values, or inputs shaped slightly differently than your Zod schema expects.

Here’s what it looks like:

import { createTool } from "@mastra/core";
import { z } from "zod";

const weatherTool = createTool({
  id: "get-weather",
  description: "Get weather for a location",
  inputSchema: z.object({
    city: z.string(),
    units: z.enum(["celsius", "fahrenheit"])
  }),
  inputExamples: [
    { input: { city: "New York", units: "fahrenheit" } },
    { input: { city: "Tokyo", units: "celsius" } }
  ],
  execute: async ({ city, units }) => {
    return await fetchWeather(city, units);
  }
});

Under the hood, inputExamples is now available on ToolAction, CoreTool, and the Tool class, so you can use it consistently whether you define tools via helpers or classes. (PR #12932)

MCP client fetch hooks now receive RequestContext (auth/cookie forwarding)

When you define an MCP HTTP server with a custom fetch, you can now receive the current RequestContext as an optional third argument. That lets you forward request-scoped details (cookies, bearer tokens, correlation ids) from the inbound Mastra request to the remote MCP server during tool execution.

This is especially useful when your MCP server sits behind auth, and your agent tools need to act “as the user” without inventing a separate credential plumbing system.

Example:

import { MCPClient } from "@mastra/mcp";

const mcp = new MCPClient({
  servers: {
    myServer: {
      url: new URL("https://api.example.com/mcp"),
      fetch: async (url, init, requestContext) => {
        const headers = new Headers(init?.headers);

        const cookie = requestContext?.get("cookie");
        if (cookie) headers.set("cookie", cookie);

        const auth = requestContext?.get("authorization");
        if (auth) headers.set("authorization", auth);

        return fetch(url, { ...init, headers });
      }
    }
  }
});

Good to know: this change is backward compatible. Existing custom fetch implementations that only accept (url, init) continue working unchanged, so you can adopt this incrementally. (PR #13773)
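The compatibility guarantee falls out of how JavaScript handles extra arguments: a function declared with two parameters simply ignores a third. A minimal illustration in plain TypeScript (no Mastra types; RequestContextLike and both hooks are hypothetical stand-ins):

```typescript
// A fetch-like hook typed to optionally receive a third argument.
type RequestContextLike = { get(key: string): string | undefined };
type FetchHook = (
  url: string,
  init?: { headers?: Record<string, string> },
  requestContext?: RequestContextLike
) => string;

// Old-style hook: declared with two parameters. The extra argument
// the caller passes is simply ignored.
const legacyHook: FetchHook = (url, init) =>
  `${url}:${init?.headers?.a ?? "none"}`;

// New-style hook: reads request-scoped data from the third argument.
const contextAwareHook: FetchHook = (url, _init, ctx) =>
  `${url}:${ctx?.get("authorization") ?? "anonymous"}`;

// The caller always passes all three arguments; both hooks work.
const ctx: RequestContextLike = {
  get: (key) => (key === "authorization" ? "Bearer abc" : undefined),
};
const legacyResult = legacyHook("https://x", {}, ctx);
const awareResult = contextAwareHook("https://x", {}, ctx);
```

This is why no migration is needed: callers can unconditionally pass the context, and two-parameter hooks keep behaving exactly as before.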

Reliability + DX fixes for agents, streaming, and memory cleanup

A lot of the value in this release is in the “it just behaves correctly now” category, especially around streaming failures, long-running agent loops, stateless deployments, and memory hygiene.

Provider stream errors now reliably surface from generate() and resumeGenerate()

Certain provider streaming errors could previously get swallowed, leading to empty responses that looked successful. Now, generate() and resumeGenerate() consistently throw provider stream errors, which makes retry logic and error handling far more dependable.

try {
  const result = await agent.generate({ prompt: "Do the thing" });
  // If the provider stream fails, you now reliably land in catch.
  return result;
} catch (err) {
  // Implement your retry/backoff/reporting here
  throw err;
}

(PR #13802)
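Because stream errors now throw instead of resolving into empty results, generic retry helpers finally behave as intended. A minimal exponential-backoff wrapper (illustrative, not part of Mastra's API):

```typescript
// Retry an async operation with exponential backoff.
// `attempts` counts total tries; delays grow as baseMs * 2^n.
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseMs = 200
): Promise<T> {
  let lastErr: unknown = undefined;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err;
      if (i < attempts - 1) {
        // Wait before the next attempt: baseMs, 2*baseMs, 4*baseMs, ...
        await new Promise((resolve) => setTimeout(resolve, baseMs * 2 ** i));
      }
    }
  }
  throw lastErr;
}
```

You would wrap the call as `withRetry(() => agent.generate({ prompt: "Do the thing" }))`. Before this fix, a swallowed stream error looked like a successful (but empty) response and never reached the retry loop at all.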

AI SDK errors are routed through the Mastra logger with structured context

LLM errors from generateText, generateObject, streamText, and streamObject are no longer swallowed by the AI SDK default handler. They now show up in your Mastra logger with structured fields (runId, modelId, provider, etc.), and streaming errors are captured via onError callbacks.

If you rely on logs and traces to debug production runs, this change makes failures much easier to triage. (PR #13857)
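The practical payoff is that failures become machine-filterable: you can group them by run or model instead of grepping free-form messages. A rough sketch of the kind of structured record involved (the field names beyond runId, modelId, and provider, and the serializer itself, are assumptions for illustration):

```typescript
// Illustrative shape of a structured LLM error log entry.
interface LlmErrorLog {
  level: "error";
  message: string;
  runId: string;
  modelId: string;
  provider: string;
}

// Structured fields stay machine-readable, so downstream tooling
// can group failures by runId or modelId.
function formatLlmError(entry: LlmErrorLog): string {
  return JSON.stringify(entry);
}

const line = formatLlmError({
  level: "error",
  message: "stream aborted by provider",
  runId: "run_123",
  modelId: "example-model",
  provider: "example-provider",
});
```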

Client-side tools no longer lose history in stateless deployments

If you were using client-side tools without a threadId in a stateless server setup, recursive calls after tool execution could drop conversation context, effectively causing “amnesia”. Now, Mastra includes the full conversation history when no threadId is present, so multi-step tool flows stay coherent.

This is a big improvement for quick deployments where you do not want to persist threads server-side. (PR #11476)

memory.deleteThread() and deleteMessages() now clean up orphaned vector embeddings

When deleting threads or messages, it’s easy to forget that vector embeddings created for retrieval can remain in your vector store. Mastra now automatically cleans up associated embeddings across supported vector backends when you call memory.deleteThread() or memory.deleteMessages().

Cleanup is non-blocking, so deletes stay fast, and you avoid slowly accumulating unused vectors over time. This also fixes cases where updateMessages did not fully clean up old vectors when using a non-default index separator (for example, Pinecone). (PR #12227)

Breaking Changes

Skill tool names are now stable across conversation turns and prompt-cache friendly (PR #13744): Several skill-related tool ids were renamed and consolidated. If you were calling these tools directly (or matching on tool ids in logs/telemetry), update your integrations accordingly:

  • skill-activate is now skill (it returns full skill instructions directly in the tool result)
  • skill-read-reference, skill-read-script, skill-read-asset are now consolidated into skill_read
  • skill-search is now skill_search

If you had code branching on tool names, migrate to the new ids:

// Before
const SKILL_TOOL = "skill-activate";
const SKILL_SEARCH_TOOL = "skill-search";

// After
const SKILL_TOOL = "skill";
const SKILL_SEARCH_TOOL = "skill_search";
const SKILL_READ_TOOL = "skill_read";
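If tool ids show up in your logs or telemetry filters, a small normalization map (illustrative, not shipped with Mastra) can bridge old and new data when you compare historical events against fresh ones:

```typescript
// Map legacy skill tool ids to their new names, per this release.
// Note the three read tools all collapse into skill_read.
const SKILL_TOOL_RENAMES: Record<string, string> = {
  "skill-activate": "skill",
  "skill-read-reference": "skill_read",
  "skill-read-script": "skill_read",
  "skill-read-asset": "skill_read",
  "skill-search": "skill_search",
};

// Normalize a tool id from old or new telemetry to the current name.
function normalizeSkillToolId(id: string): string {
  return SKILL_TOOL_RENAMES[id] ?? id;
}
```

Run this over stored telemetry once, or at query time, so dashboards keyed on tool ids don't split a single tool's history across two names.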

Other Notable Updates

  • execute_command timeout units fixed: The execute_command tool timeout parameter now accepts seconds (not milliseconds), preventing accidentally tiny timeouts. (PR #13799)
  • Cloudflare Workers compatibility: Fixed Workers build failures by lazily loading the local process execution runtime dependency, avoiding bundling Node-only modules. (PR #13813)
  • File attachment routing fix: Corrected a mimeType to mediaType typo so file parts use the V5 adapter properly with AI SDK v5 providers. (PR #13833)
  • onIterationComplete feedback preserved: Feedback returned with { continue: false } is no longer discarded. It is added to the conversation, then the model gets one final turn before stopping. (PR #13759)
  • stopWhen can fully control sub-agent execution: Removed the default maxSteps limit so stopWhen governs execution (paired with safeguards to avoid unbounded loops when no stop condition is set). (PR #13764)
  • Suspended tool runs edge case fixed: Resolved a suspendedToolRunId required error when it should not be required. (PR #13722)
  • Prompt reliability improvement: Assistant messages that only contain sources are removed before model calls to prevent prompt failures. (PR #13790)
  • RequestContext construction hardened: Fixed crashes when constructing RequestContext from a deserialized plain object. (PR #13856)
  • Workspace output truncation fixed: Tool output is no longer prematurely cut off when short lines precede a very long line, improving completeness for minified JSON and similar outputs. (PR #13828)
  • Dependency updates: p-map updated to ^7.0.4 (PR #13209) and p-retry updated to ^7.1.1 (PR #13210).

That's all for @mastra/core@1.10.0!

Happy building! 🚀

Shane Thomas

Shane Thomas is the founder and CPO of Mastra. He co-hosts AI Agents Hour, a weekly show covering news and topics around AI agents. Previously, he was in product and engineering at Netlify and Gatsby. He created the first course as an MCP server and is kind of a musician.
