We've been busy working on features that make it easier to evaluate agent quality, run larger workflows with better visibility, and safely integrate filesystem-backed workspaces in real apps.
Release: @mastra/core@1.4.0
We prepared automated codemods for most breaking changes. Run all v1 codemods at once:

    npx @mastra/codemod@latest v1

See the migration guide for detailed instructions.
Let's dive in:
Datasets & Experiments (core + server + Studio UI)
Mastra now includes first-class evaluation primitives: versioned Datasets and runnable Experiments, designed to help you measure quality over time instead of relying on ad hoc spot checks.
Datasets are collections of evaluation items (validated with JSON Schema) with SCD-2 style item versioning, so you can update test cases without losing history. Experiments let you run agents against dataset items, score the outputs with configurable scorers, and track results over time.
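To make "SCD-2 style item versioning" a bit more concrete, here is a minimal, self-contained sketch of the idea: an update closes the current version and appends a new one, so older versions stay queryable. The `ItemVersion` and `VersionedItems` names, the numeric timestamps, and the method names are assumptions made for illustration; this is not how `@mastra/core` stores items internally.

```typescript
// Hypothetical shapes for illustration only; not the @mastra/core types.
interface ItemVersion<T> {
  itemId: string;
  version: number;
  data: T;
  validFrom: number; // logical timestamp at which this version became current
  validTo: number | null; // null marks the current version
}

class VersionedItems<T> {
  private rows: ItemVersion<T>[] = [];

  // Close the current version (if any) and append a new one instead of overwriting.
  upsert(itemId: string, data: T, at: number): ItemVersion<T> {
    const current = this.rows.find((r) => r.itemId === itemId && r.validTo === null);
    if (current) current.validTo = at;
    const next: ItemVersion<T> = {
      itemId,
      version: current ? current.version + 1 : 1,
      data,
      validFrom: at,
      validTo: null,
    };
    this.rows.push(next);
    return next;
  }

  // Return the version that was current at the given point in time.
  asOf(itemId: string, at: number): ItemVersion<T> | undefined {
    return this.rows.find(
      (r) => r.itemId === itemId && r.validFrom <= at && (r.validTo === null || at < r.validTo),
    );
  }

  // Return every stored version of an item, oldest first.
  history(itemId: string): ItemVersion<T>[] {
    return this.rows.filter((r) => r.itemId === itemId);
  }
}

const items = new VersionedItems<{ query: string; answer: string }>();
items.upsert("q1", { query: "What is 2+2?", answer: "four" }, 100);
items.upsert("q1", { query: "What is 2+2?", answer: "4" }, 200);
```

Because the first version is closed rather than deleted, an experiment that ran at time 150 can still be traced back to the exact test case it saw.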
At the core level, you get new exports under @mastra/core/datasets:
- DatasetsManager to orchestrate dataset CRUD, item versioning, and experiment execution
- Dataset as a single-dataset handle for adding items and running experiments
There are also new storage domains to support persistence and result tracking:
- DatasetsStorage for datasets, items, and versions
- ExperimentsStorage for experiment lifecycle and results
Here’s what creating a dataset and running an experiment looks like:

    import { Mastra } from "@mastra/core";

    const mastra = new Mastra({
      /* ... */
    });

    const dataset = await mastra.datasets.create({ name: "my-eval-set" });

    await dataset.addItems([
      {
        input: { query: "What is 2+2?" },
        groundTruth: { answer: "4" }
      }
    ]);

    const result = await dataset.runExperiment({
      targetType: "agent",
      targetId: "my-agent",
      scorerIds: ["accuracy"]
    });

If you are using @mastra/server, there are new REST routes under /datasets that cover full CRUD for datasets, items, versions, experiments, and experiment results, including batch operations and experiment comparison. That makes it straightforward to integrate evaluations into CI, internal tooling, or multi-tenant backends without reinventing endpoints.
And for teams that prefer a UI-first workflow, Mastra Studio ships an end-to-end experience for dataset management (including CSV/JSON import and export), browsing and comparing SCD-2 versions, triggering experiments with scorer selection, and comparing results (including score deltas and trace visualization). (PR #12747)
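If it helps to picture what "score the outputs and compare results" boils down to, the sketch below runs a target over a few items, averages each scorer, and computes per-scorer deltas between two runs. Everything here is an assumption for illustration: `EvalItem`, `Target`, `Scorer`, `runExperimentSketch`, and `scoreDeltas` are hypothetical names, and the target is a synchronous function rather than a real agent.

```typescript
// Hypothetical, simplified shapes; real experiments run agents asynchronously.
type EvalItem = { input: string; groundTruth: string };
type Target = (input: string) => string;
type Scorer = (output: string, item: EvalItem) => number; // score in [0, 1]

// Run the target once per item, then average every scorer over all outputs.
function runExperimentSketch(
  items: EvalItem[],
  target: Target,
  scorers: Record<string, Scorer>,
): Record<string, number> {
  const outputs = items.map((item) => target(item.input));
  const averages: Record<string, number> = {};
  for (const scorerId of Object.keys(scorers)) {
    const total = items.reduce((sum, item, i) => sum + scorers[scorerId](outputs[i], item), 0);
    averages[scorerId] = items.length === 0 ? 0 : total / items.length;
  }
  return averages;
}

// Per-scorer difference (candidate minus baseline) for scorers present in both runs.
function scoreDeltas(
  baseline: Record<string, number>,
  candidate: Record<string, number>,
): Record<string, number> {
  const deltas: Record<string, number> = {};
  for (const scorerId of Object.keys(baseline)) {
    if (scorerId in candidate) deltas[scorerId] = candidate[scorerId] - baseline[scorerId];
  }
  return deltas;
}

const exactMatch: Scorer = (output, item) => (output.trim() === item.groundTruth ? 1 : 0);

const evalItems: EvalItem[] = [
  { input: "2+2", groundTruth: "4" },
  { input: "3+3", groundTruth: "6" },
];

const baselineRun = runExperimentSketch(evalItems, () => "4", { accuracy: exactMatch });
const candidateRun = runExperimentSketch(
  evalItems,
  (input) => (input === "2+2" ? "4" : "6"),
  { accuracy: exactMatch },
);
const deltas = scoreDeltas(baselineRun, candidateRun);
```

A positive delta means the candidate run scored higher than the baseline on that scorer.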
Workspace & Filesystem Lifecycle + Safer Filesystem Introspection
Workspaces often sit at the boundary between your agents and your real environment, so lifecycle clarity and safe introspection matter a lot, especially when you run in different deployment modes (contained vs uncontained filesystems, read-only mounts, remote providers, etc.).
In 1.4.0, lifecycle types were split to match how providers actually behave:
- FilesystemLifecycle is a simple two-phase lifecycle: init() then destroy()
- SandboxLifecycle is a three-phase lifecycle: start() then stop() then destroy()
The original Lifecycle type is still exported for backward compatibility, but the new interfaces make it easier to implement providers correctly and avoid mixing concepts that do not apply everywhere. (PR #12978)
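As a rough picture of the split, the sketch below declares two local interfaces with the phases named above and a helper that guarantees teardown. The `...Sketch` interfaces, the in-memory classes, and `withFilesystem` are hypothetical; the real `@mastra/core` types may use different signatures, and the methods are kept synchronous here purely for brevity.

```typescript
// Hypothetical local interfaces mirroring the phases described above.
interface FilesystemLifecycleSketch {
  init(): void;
  destroy(): void;
}

interface SandboxLifecycleSketch {
  start(): void;
  stop(): void;
  destroy(): void;
}

// Toy providers that only record which phases were called, in order.
class InMemoryFilesystem implements FilesystemLifecycleSketch {
  readonly calls: string[] = [];
  init(): void {
    this.calls.push("init");
  }
  destroy(): void {
    this.calls.push("destroy");
  }
}

class InMemorySandbox implements SandboxLifecycleSketch {
  readonly calls: string[] = [];
  start(): void {
    this.calls.push("start");
  }
  stop(): void {
    this.calls.push("stop");
  }
  destroy(): void {
    this.calls.push("destroy");
  }
}

// Run some work between init() and destroy(), tearing down even if the work throws.
function withFilesystem<T>(fs: FilesystemLifecycleSketch, work: () => T): T {
  fs.init();
  try {
    return work();
  } finally {
    fs.destroy();
  }
}

const memoryFs = new InMemoryFilesystem();
const workResult = withFilesystem(memoryFs, () => "done");
```

Keeping the two shapes separate means a filesystem provider never has to stub out a stop() phase that has no meaning for it.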
On top of that, MastraFilesystem now supports onInit and onDestroy callbacks via MastraFilesystemOptions, consistent with the existing MastraSandbox callback pattern. This makes it easier to hook into readiness and teardown without needing custom glue code:

    const fs = new LocalFilesystem({
      basePath: "./data",
      onInit: ({ filesystem }) => {
        console.log("Filesystem ready:", filesystem.status);
      },
      onDestroy: ({ filesystem }) => {
        console.log("Cleaning up...");
      }
    });

Filesystem introspection also got safer and more precise:
- LocalFilesystem.resolvePath now correctly handles absolute paths. Previously, leading slashes could be stripped and resolved relative to basePath, which could trigger surprising PermissionErrors for valid paths.
- FilesystemInfo is now generic (FilesystemInfo<TMetadata>), so providers can strongly type their metadata.
- Provider-specific fields like basePath and contained were moved into provider metadata returned by LocalFilesystem.getInfo().
- LocalFilesystem.getInstructions() now warns agents more explicitly when a filesystem is uncontained (for example, to avoid listing /).
- Workspace API responses now expose filesystem info in GET /api/workspaces/:id, including provider type, status, readOnly, and provider-specific metadata.
Together, these changes make filesystem-backed skills and tooling more predictable, and they reduce the chance of accidentally letting an agent roam around the host filesystem. (PR #12971)
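To illustrate the absolute-path and containment behavior described in the list above, here is a small standalone sketch for POSIX-style paths. It is not the `LocalFilesystem.resolvePath` implementation: `normalizeAbsolute` and `resolvePathSketch` are hypothetical helpers, and the plain `Error` stands in for whatever permission error the real provider throws.

```typescript
// Normalize a POSIX-style absolute path, resolving "." and ".." segments.
function normalizeAbsolute(p: string): string {
  const out: string[] = [];
  for (const segment of p.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") out.pop();
    else out.push(segment);
  }
  return "/" + out.join("/");
}

// Hypothetical resolver: absolute inputs stay absolute, relative inputs are
// joined onto basePath, and contained mode rejects anything outside basePath.
function resolvePathSketch(basePath: string, input: string, contained: boolean): string {
  const base = normalizeAbsolute(basePath);
  const resolved = input.startsWith("/")
    ? normalizeAbsolute(input)
    : normalizeAbsolute(base + "/" + input);
  const prefix = base === "/" ? "/" : base + "/";
  const inside = resolved === base || resolved.startsWith(prefix);
  if (contained && !inside) {
    throw new Error(`Path escapes basePath: ${input}`);
  }
  return resolved;
}
```

With a base of /srv/data, the absolute input /srv/data/a.txt resolves to itself instead of being re-rooted under the base, while ../etc/passwd is rejected in contained mode and /etc/hosts is only allowed when the filesystem is uncontained.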
Workflow foreach Progress Streaming
foreach steps are a great fit for batch operations, fan-out workflows, and dataset style processing, but they are also one of the easiest places to lose visibility. You might know a workflow is running, but not how far along it is or which iteration failed.
Workflows now emit a workflow-step-progress stream event for foreach steps, including:
- completedCount
- totalCount
- currentIndex
- iterationStatus (success | failed | suspended)
- optional iterationOutput
Both the default and evented workflow execution engines emit these events, and Studio renders real-time progress bars for foreach nodes.
If you are consuming streams yourself, you can now surface progress in logs or your own UI:

    const run = workflow.createRun();
    const result = await run.start({ inputData });
    const stream = result.stream;

    for await (const chunk of stream) {
      if (chunk.type === "workflow-step-progress") {
        console.log(`${chunk.payload.completedCount}/${chunk.payload.totalCount} - ${chunk.payload.iterationStatus}`);
      }
    }

On the frontend side, @mastra/react watch hooks now accumulate foreachProgress into step state, so your UI can stay in sync without manual bookkeeping. (PR #12838)
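If you are building your own progress UI rather than using the React hooks, the accumulation can be pictured as a small reducer over progress payloads. This is a sketch under assumptions: the payload field names come from the list above, but their exact types, plus the `ForeachProgressState`, `accumulateProgress`, and `renderBar` names, are hypothetical and not part of any Mastra package.

```typescript
// Field names follow the event description above; the types are assumptions.
interface ForeachProgressPayload {
  completedCount: number;
  totalCount: number;
  currentIndex: number;
  iterationStatus: "success" | "failed" | "suspended";
  iterationOutput?: unknown;
}

// Hypothetical UI-side state derived from the stream of payloads.
interface ForeachProgressState {
  completed: number;
  total: number;
  failedIndexes: number[];
}

const initialProgress: ForeachProgressState = { completed: 0, total: 0, failedIndexes: [] };

// Fold one payload into the state, remembering which iterations failed.
function accumulateProgress(
  state: ForeachProgressState,
  payload: ForeachProgressPayload,
): ForeachProgressState {
  return {
    completed: payload.completedCount,
    total: payload.totalCount,
    failedIndexes:
      payload.iterationStatus === "failed"
        ? [...state.failedIndexes, payload.currentIndex]
        : state.failedIndexes,
  };
}

// Render a fixed-width text progress bar such as "[#####-----] 2/4".
function renderBar(state: ForeachProgressState, width = 10): string {
  const filled = state.total === 0 ? 0 : Math.round((state.completed / state.total) * width);
  return `[${"#".repeat(filled)}${"-".repeat(width - filled)}] ${state.completed}/${state.total}`;
}

const sampleEvents: ForeachProgressPayload[] = [
  { completedCount: 1, totalCount: 4, currentIndex: 0, iterationStatus: "success" },
  { completedCount: 2, totalCount: 4, currentIndex: 1, iterationStatus: "failed" },
];
const progress = sampleEvents.reduce(accumulateProgress, initialProgress);
```

In a real consumer, the reducer would be called from inside the stream loop each time a workflow-step-progress chunk arrives.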
Breaking Changes
Observational Memory observe() signature changed (PR #12925): In @mastra/memory, observe() now takes a single object parameter instead of positional arguments. Update calls from observe(threadId, resourceId) to observe({ threadId, resourceId }).

    // Before
    await om.observe(threadId, resourceId);

    // After
    await om.observe({ threadId, resourceId });

As part of the same work, @mastra/memory also introduced a standalone observe() API that can accept external messages directly, plus new lifecycle hooks (ObserveHooks) and additional exports for controlling how the observation context is presented to models.
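If you have many call sites and want to migrate them gradually, one option is a thin shim that keeps the positional shape and forwards to the new object form. This is only a sketch: `ObserverLike` and `observePositional` are hypothetical names, and the interface assumes both fields are required strings, which the release notes do not confirm.

```typescript
// Hypothetical minimal view of an object exposing the new observe() signature.
interface ObserverLike {
  observe(args: { threadId: string; resourceId: string }): Promise<void>;
}

// Temporary shim: accepts the old positional arguments and forwards them
// as a single object, so legacy call sites can be migrated one at a time.
function observePositional(
  om: ObserverLike,
  threadId: string,
  resourceId: string,
): Promise<void> {
  return om.observe({ threadId, resourceId });
}
```

Once every caller passes an object directly, the shim can be deleted.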
Other Notable Updates
- Cleaner streaming with completion validation: Added completion.suppressFeedback so you can hide internal completion-check messages from the stream and keep conversation history clean (default behavior unchanged) (PR #12764)
- MCP client persistence: Added a new mcpClients storage domain for storing MCP client configurations (including multiple servers with independent tool selection), implemented in LibSQL, PostgreSQL, and MongoDB adapters (PR #12838)
- Third-party tool catalogs: Added a ToolProvider interface at @mastra/core/tool-provider for integrating external tool catalogs (for example, Composio, Arcade AI), including request-scoped credentials via requestContext in resolveTools() (PR #12838)
- Safer agent tool loops: Tool-not-found errors no longer crash the agentic loop. When a model hallucinates a tool name, the error is returned as a tool result so the model can self-correct (includes available tool names) (PR #12961)
- Anthropic persistence fixes: Filter out empty assistant text blocks before persistence to prevent Anthropic API rejections during streaming with citations (PR #12711)
- Structured output + memory reliability: Fixed Anthropic structured output failures when memory is enabled by ensuring prompts do not end in an assistant role message (PR #12835)
- Better failure visibility: Improved error messages for processor workflow failures and model fallback exhaustion by including the last error message and underlying causes (PR #12970)
- Server adapter correctness: Fixed custom routes registered via registerApiRoute() being ignored at runtime across Koa, Express, Fastify, and Hono adapters (they appeared in OpenAPI but returned 404 previously) (PR #12960)
- Server tools endpoint fix: Fixed /api/tools returning an empty list even when tools are registered (PR #13008)
- Tool provider and stored MCP REST APIs: Added server routes for /api/stored-mcp-clients and /api/tool-providers for discovery and browsing (PR #12974)
- Agents list completeness: Fixed requestContextSchema missing from the agent list API response (PR #12954)
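The "safer agent tool loops" item above describes a pattern that is easy to picture in isolation: instead of throwing on an unknown tool name, hand the model an error result that lists what is available. The sketch below is a generic illustration of that pattern, not Mastra's internal loop; `ToolCall`, `ToolResult`, and `executeToolCall` are hypothetical names.

```typescript
// Hypothetical, simplified shapes for a single tool invocation.
type ToolFn = (args: unknown) => unknown;
interface ToolCall {
  toolName: string;
  args: unknown;
}
interface ToolResult {
  toolName: string;
  isError: boolean;
  result: unknown;
}

// Execute one tool call. An unknown tool name yields an error result that
// lists the available tools, so the caller can feed it back to the model.
function executeToolCall(tools: Record<string, ToolFn>, call: ToolCall): ToolResult {
  const known = Object.prototype.hasOwnProperty.call(tools, call.toolName);
  if (!known) {
    const available = Object.keys(tools).join(", ");
    return {
      toolName: call.toolName,
      isError: true,
      result: `Tool "${call.toolName}" not found. Available tools: ${available}`,
    };
  }
  return { toolName: call.toolName, isError: false, result: tools[call.toolName](call.args) };
}

const demoTools: Record<string, ToolFn> = {
  add: (args) => {
    const { a, b } = args as { a: number; b: number };
    return a + b;
  },
  echo: (args) => args,
};

const missing = executeToolCall(demoTools, { toolName: "ad", args: { a: 1, b: 2 } });
const found = executeToolCall(demoTools, { toolName: "add", args: { a: 1, b: 2 } });
```

Because the failure is just another tool result, the loop can continue and the model gets a chance to retry with a valid name.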
That's all for @mastra/core@1.4.0!
Happy building! 🚀
