Mastra Changelog 2026-02-13

We've been busy working on features that make it easier to evaluate agent quality, run larger workflows with better visibility, and safely integrate filesystem-backed workspaces in real apps.

Release: @mastra/core@1.4.0

We prepared automated codemods for most breaking changes. Run all v1 codemods at once:

    npx @mastra/codemod@latest v1

See the migration guide for detailed instructions.

Let's dive in:

Datasets & Experiments (core + server + Studio UI)

Mastra now includes first-class evaluation primitives: versioned Datasets and runnable Experiments, designed to help you measure quality over time instead of relying on ad hoc spot checks.

Datasets are collections of evaluation items (validated with JSON Schema) with SCD-2 style item versioning, so you can update test cases without losing history. Experiments let you run agents against dataset items, score the outputs with configurable scorers, and track results over time.
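
To make the SCD-2 idea concrete, here is a minimal in-memory sketch of that versioning style. It is not Mastra's storage implementation: the `ItemVersion` shape and the `updateItem` and `itemAsOf` helpers are hypothetical names used only for illustration, and the timestamps are plain numbers.

```typescript
// Illustrative only: a tiny in-memory model of SCD-2 style versioning.
// None of these names come from @mastra/core.
interface ItemVersion<T> {
  itemId: string;
  version: number;
  data: T;
  validFrom: number;      // logical timestamp when this version became current
  validTo: number | null; // null means "still current"
}

// Close the current version (if any) and append a new one.
// Older versions are kept, so history is never overwritten.
function updateItem<T>(
  history: ItemVersion<T>[],
  itemId: string,
  data: T,
  now: number,
): ItemVersion<T>[] {
  const current = history.find((v) => v.itemId === itemId && v.validTo === null);
  const closed = history.map((v) => (v === current ? { ...v, validTo: now } : v));
  const next: ItemVersion<T> = {
    itemId,
    version: current ? current.version + 1 : 1,
    data,
    validFrom: now,
    validTo: null,
  };
  return [...closed, next];
}

// Look up the version that was current at a given point in time.
function itemAsOf<T>(
  history: ItemVersion<T>[],
  itemId: string,
  at: number,
): ItemVersion<T> | undefined {
  return history.find(
    (v) => v.itemId === itemId && v.validFrom <= at && (v.validTo === null || at < v.validTo),
  );
}

let history: ItemVersion<{ query: string }>[] = [];
history = updateItem(history, "item-1", { query: "What is 2+2?" }, 100);
history = updateItem(history, "item-1", { query: "What is 2 + 2?" }, 200);

console.log(itemAsOf(history, "item-1", 150)?.version); // 1
console.log(itemAsOf(history, "item-1", 250)?.version); // 2
```

The point of the pattern is that an experiment run at time 150 can still be reproduced against the item as it existed then, even after the item is edited.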

At the core level, you get new exports under @mastra/core/datasets:

DatasetsManager to orchestrate dataset CRUD, item versioning, and experiment execution
Dataset as a single-dataset handle for adding items and running experiments

There are also new storage domains to support persistence and result tracking:

DatasetsStorage for datasets, items, and versions
ExperimentsStorage for experiment lifecycle and results

Here’s what creating a dataset and running an experiment looks like:

    import { Mastra } from "@mastra/core";

    const mastra = new Mastra({
      /* ... */
    });

    const dataset = await mastra.datasets.create({ name: "my-eval-set" });

    await dataset.addItems([
      {
        input: { query: "What is 2+2?" },
        groundTruth: { answer: "4" }
      }
    ]);

    const result = await dataset.runExperiment({
      targetType: "agent",
      targetId: "my-agent",
      scorerIds: ["accuracy"]
    });

If you are using @mastra/server, there are new REST routes under /datasets that cover full CRUD for datasets, items, versions, experiments, and experiment results, including batch operations and experiment comparison. That makes it straightforward to integrate evaluations into CI, internal tooling, or multi-tenant backends without reinventing endpoints.

And for teams that prefer a UI-first workflow, Mastra Studio ships an end-to-end experience for dataset management (including CSV/JSON import and export), browsing and comparing SCD-2 versions, triggering experiments with scorer selection, and comparing results (including score deltas and trace visualization). (PR #12747)

Workspace & Filesystem Lifecycle + Safer Filesystem Introspection

Workspaces often sit at the boundary between your agents and your real environment, so a clear lifecycle and safe introspection matter, especially across deployment modes (contained vs. uncontained filesystems, read-only mounts, remote providers, and so on).

In 1.4.0, lifecycle types were split to match how providers actually behave:

FilesystemLifecycle is a simple two-phase lifecycle: init() then destroy()
SandboxLifecycle is a three-phase lifecycle: start() then stop() then destroy()

The original Lifecycle type is still exported for backward compatibility, but the new interfaces make it easier to implement providers correctly and avoid mixing concepts that do not apply everywhere. (PR #12978)
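
As a rough illustration of the split, the sketch below declares local stand-in interfaces with the phases named above and a minimal fake provider for each. The interface shapes (async methods, no status fields) are assumptions, and `InMemoryFilesystem`, `FakeSandbox`, and `demo` are hypothetical; check the actual `FilesystemLifecycle` and `SandboxLifecycle` types in `@mastra/core` before implementing a provider.

```typescript
// Local stand-ins for illustration. The real types live in @mastra/core
// and may carry extra members (status, options, and so on).
interface FilesystemLifecycleSketch {
  init(): Promise<void>;
  destroy(): Promise<void>;
}

interface SandboxLifecycleSketch {
  start(): Promise<void>;
  stop(): Promise<void>;
  destroy(): Promise<void>;
}

// Two phases: prepared once, torn down once.
class InMemoryFilesystem implements FilesystemLifecycleSketch {
  private log: string[];
  constructor(log: string[]) {
    this.log = log;
  }
  async init(): Promise<void> {
    this.log.push("fs:init");
  }
  async destroy(): Promise<void> {
    this.log.push("fs:destroy");
  }
}

// Three phases: a stop phase sits between start and destroy.
class FakeSandbox implements SandboxLifecycleSketch {
  private log: string[];
  constructor(log: string[]) {
    this.log = log;
  }
  async start(): Promise<void> {
    this.log.push("sandbox:start");
  }
  async stop(): Promise<void> {
    this.log.push("sandbox:stop");
  }
  async destroy(): Promise<void> {
    this.log.push("sandbox:destroy");
  }
}

// Drive both providers through their phases and return the call order.
async function demo(): Promise<string[]> {
  const log: string[] = [];
  const fs = new InMemoryFilesystem(log);
  const sandbox = new FakeSandbox(log);
  await fs.init();
  await sandbox.start();
  await sandbox.stop();
  await sandbox.destroy();
  await fs.destroy();
  return log;
}

demo().then((order) => console.log(order.join(" -> ")));
```

Keeping the two interfaces separate means a filesystem provider never has to stub out `start()` or `stop()` methods that have no meaning for it.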

On top of that, MastraFilesystem now supports onInit and onDestroy callbacks via MastraFilesystemOptions, consistent with the existing MastraSandbox callback pattern. This makes it easier to hook into readiness and teardown without needing custom glue code:

    const fs = new LocalFilesystem({
      basePath: "./data",
      onInit: ({ filesystem }) => {
        console.log("Filesystem ready:", filesystem.status);
      },
      onDestroy: ({ filesystem }) => {
        console.log("Cleaning up...");
      }
    });

Filesystem introspection also got safer and more precise:

LocalFilesystem.resolvePath now correctly handles absolute paths. Previously, leading slashes could be stripped and resolved relative to basePath, which could trigger surprising PermissionErrors for valid paths.
FilesystemInfo is now generic (FilesystemInfo<TMetadata>), so providers can strongly type their metadata.
Provider-specific fields like basePath and contained were moved into provider metadata returned by LocalFilesystem.getInfo().
LocalFilesystem.getInstructions() now warns agents more explicitly when a filesystem is uncontained (for example, to avoid listing /).
Workspace API responses now expose filesystem info in GET /api/workspaces/:id, including provider type, status, readOnly, and provider-specific metadata.

Together, these changes make filesystem-backed skills and tooling more predictable, and they reduce the chance of accidentally letting an agent roam around the host filesystem. (PR #12971)
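
For a sense of how a generic `FilesystemInfo<TMetadata>` helps consumers, here is a hedged sketch. The field names on `FilesystemInfoSketch` (`provider`, `status`, `readOnly`, `metadata`) are inferred from the description above rather than copied from the real type, and `LocalMetadata` and `describeLocal` are hypothetical.

```typescript
// Assumed shape, modeled on the fields this changelog mentions. Not the real type.
interface FilesystemInfoSketch<TMetadata = Record<string, unknown>> {
  provider: string;
  status: string;
  readOnly: boolean;
  metadata: TMetadata;
}

// Provider-specific fields live in metadata, not on the top-level info object.
interface LocalMetadata {
  basePath: string;
  contained: boolean;
}

// Because metadata is typed, `contained` and `basePath` need no casts here.
function describeLocal(info: FilesystemInfoSketch<LocalMetadata>): string {
  const scope = info.metadata.contained
    ? `contained under ${info.metadata.basePath}`
    : "UNCONTAINED (host filesystem reachable)";
  const mode = info.readOnly ? "read-only" : "read-write";
  return `${info.provider} [${info.status}] ${mode}, ${scope}`;
}

const info: FilesystemInfoSketch<LocalMetadata> = {
  provider: "local",
  status: "ready",
  readOnly: false,
  metadata: { basePath: "./data", contained: true },
};

console.log(describeLocal(info));
// local [ready] read-write, contained under ./data
```

A caller that surfaces the uncontained case loudly, as this helper does, mirrors the intent of the stronger warning in `getInstructions()`.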

Workflow foreach Progress Streaming

foreach steps are a great fit for batch operations, fan-out workflows, and dataset-style processing, but they are also one of the easiest places to lose visibility. You might know a workflow is running, but not how far along it is or which iteration failed.

Workflows now emit a workflow-step-progress stream event for foreach steps, including:

completedCount
totalCount
currentIndex
iterationStatus (success | failed | suspended)
optional iterationOutput

Both the default and evented workflow execution engines emit these events, and Studio renders real-time progress bars for foreach nodes.

If you are consuming streams yourself, you can now surface progress in logs or your own UI:

    const run = workflow.createRun();
    const result = await run.start({ inputData });
    const stream = result.stream;

    for await (const chunk of stream) {
      if (chunk.type === "workflow-step-progress") {
        console.log(`${chunk.payload.completedCount}/${chunk.payload.totalCount} - ${chunk.payload.iterationStatus}`);
      }
    }

On the frontend side, @mastra/react watch hooks now accumulate foreachProgress into step state, so your UI can stay in sync without manual bookkeeping. (PR #12838)
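
If you are not using `@mastra/react`, a similar accumulation can be done by hand. The sketch below is an illustration under assumptions: the payload fields match the list above, but the `ForeachProgressPayload` and `ForeachProgressState` type names, the `accumulate` reducer, and the sample events are all hypothetical and not part of any Mastra package.

```typescript
// Payload fields as listed in this changelog; the type names are made up.
type IterationStatus = "success" | "failed" | "suspended";

interface ForeachProgressPayload {
  completedCount: number;
  totalCount: number;
  currentIndex: number;
  iterationStatus: IterationStatus;
  iterationOutput?: unknown;
}

// Hypothetical UI-facing state for a single foreach step.
interface ForeachProgressState {
  completedCount: number;
  totalCount: number;
  failedIndexes: number[];
  percent: number;
}

const initialState: ForeachProgressState = {
  completedCount: 0,
  totalCount: 0,
  failedIndexes: [],
  percent: 0,
};

// Pure reducer: fold one progress event into the previous state.
function accumulate(
  state: ForeachProgressState,
  p: ForeachProgressPayload,
): ForeachProgressState {
  const failedIndexes =
    p.iterationStatus === "failed" && state.failedIndexes.indexOf(p.currentIndex) === -1
      ? [...state.failedIndexes, p.currentIndex]
      : state.failedIndexes;
  const percent = p.totalCount > 0 ? Math.round((p.completedCount / p.totalCount) * 100) : 0;
  return { completedCount: p.completedCount, totalCount: p.totalCount, failedIndexes, percent };
}

// Invented sample events, standing in for chunk.payload values from the stream.
const events: ForeachProgressPayload[] = [
  { completedCount: 1, totalCount: 3, currentIndex: 0, iterationStatus: "success" },
  { completedCount: 2, totalCount: 3, currentIndex: 1, iterationStatus: "failed" },
  { completedCount: 3, totalCount: 3, currentIndex: 2, iterationStatus: "success" },
];

const finalState = events.reduce(accumulate, initialState);
console.log(`${finalState.percent}% done, failed iterations: ${finalState.failedIndexes.join(", ")}`);
// 100% done, failed iterations: 1
```

Keeping the reducer pure makes it easy to reuse in a log formatter, a CLI progress bar, or a UI store.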

Breaking Changes

Observational Memory observe() signature changed (PR #12925): In @mastra/memory, observe() now takes a single object parameter instead of positional arguments. Update calls from observe(threadId, resourceId) to observe({ threadId, resourceId }).

    // Before
    await om.observe(threadId, resourceId);

    // After
    await om.observe({ threadId, resourceId });

As part of the same work, @mastra/memory also introduced a standalone observe() API that can accept external messages directly, plus new lifecycle hooks (ObserveHooks) and additional exports for controlling how the observation context is presented to models.

Other Notable Updates

Cleaner streaming with completion validation: Added completion.suppressFeedback so you can hide internal completion-check messages from the stream and keep conversation history clean (default behavior unchanged) (PR #12764)
MCP client persistence: Added a new mcpClients storage domain for storing MCP client configurations (including multiple servers with independent tool selection), implemented in LibSQL, PostgreSQL, and MongoDB adapters (PR #12838)
Third-party tool catalogs: Added a ToolProvider interface at @mastra/core/tool-provider for integrating external tool catalogs (for example, Composio, Arcade AI), including request-scoped credentials via requestContext in resolveTools() (PR #12838)
Safer agent tool loops: Tool-not-found errors no longer crash the agentic loop. When a model hallucinates a tool name, the error is returned as a tool result so the model can self-correct (includes available tool names) (PR #12961)
Anthropic persistence fixes: Filter out empty assistant text blocks before persistence to prevent Anthropic API rejections during streaming with citations (PR #12711)
Structured output + memory reliability: Fixed Anthropic structured output failures when memory is enabled by ensuring prompts do not end in an assistant role message (PR #12835)
Better failure visibility: Improved error messages for processor workflow failures and model fallback exhaustion by including the last error message and underlying causes (PR #12970)
Server adapter correctness: Fixed custom routes registered via registerApiRoute() being ignored at runtime across Koa, Express, Fastify, and Hono adapters (they appeared in OpenAPI but returned 404 previously) (PR #12960)
Server tools endpoint fix: Fixed /api/tools returning an empty list even when tools are registered (PR #13008)
Tool provider and stored MCP REST APIs: Added server routes for /api/stored-mcp-clients and /api/tool-providers for discovery and browsing (PR #12974)
Agents list completeness: Fixed requestContextSchema missing from the agent list API response (PR #12954)
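
The tool-not-found behavior is a general pattern that also applies to custom agent loops. The sketch below is not Mastra's implementation: `ToolRegistry`, `ToolResult`, and `callTool` are hypothetical names showing one way to turn an unknown tool name into a recoverable tool result instead of a thrown error.

```typescript
// Hypothetical types for illustration; not part of @mastra/core.
type ToolFn = (args: Record<string, unknown>) => unknown;
type ToolRegistry = Record<string, ToolFn>;

type ToolResult =
  | { ok: true; toolName: string; output: unknown }
  | { ok: false; toolName: string; error: string };

// Never throws for an unknown tool: the caller gets an error result that
// lists the valid names, which can be fed back to the model.
function callTool(
  tools: ToolRegistry,
  toolName: string,
  args: Record<string, unknown>,
): ToolResult {
  const tool = Object.prototype.hasOwnProperty.call(tools, toolName)
    ? tools[toolName]
    : undefined;
  if (!tool) {
    const available = Object.keys(tools).join(", ");
    return {
      ok: false,
      toolName,
      error: `Tool "${toolName}" not found. Available tools: ${available}`,
    };
  }
  return { ok: true, toolName, output: tool(args) };
}

const tools: ToolRegistry = {
  add: (args) => Number(args["a"]) + Number(args["b"]),
  echo: (args) => args["text"],
};

console.log(callTool(tools, "add", { a: 2, b: 2 }));
// A hallucinated name yields an error result rather than an exception.
console.log(callTool(tools, "sum", { a: 2, b: 2 }));
```

Including the available names in the error text gives the model what it needs to retry with a valid tool on the next turn.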

That's all for @mastra/core@1.4.0!

Happy building! 🚀