We've been busy smoothing out a few sharp edges and adding some genuinely useful building blocks for production apps. If you care about evals, auth, or keeping costs under control while scaling context sizes, there’s a lot to like here.
Release: @mastra/core@1.16.0
Let's dive in:
Smarter Model Selection for Observational Memory
Observational Memory (OM) now supports token-threshold-based routing, so you can automatically choose a fast, cheaper model for small inputs and a stronger model when the context gets large.
That makes it much easier to keep OM inexpensive in the common case, without sacrificing quality for the heavy calls where it actually matters. It also makes behavior more predictable, since you’re configuring explicit thresholds instead of hoping a single model choice works well for every input size.
Here’s what the declarative setup looks like using `ModelByInputTokens`:
```typescript
import { Memory, ModelByInputTokens } from "@mastra/memory";

const memory = new Memory({
  options: {
    observationalMemory: {
      model: new ModelByInputTokens({
        upTo: {
          10_000: "google/gemini-2.5-flash",
          40_000: "openai/gpt-4o",
          1_000_000: "openai/gpt-4.5"
        }
      })
    }
  }
});
```

A couple of details worth knowing:
- The `upTo` keys are inclusive upper bounds.
- Model resolution happens at the observer or reflector call site, using the actual input token count.
- If the input exceeds your largest configured threshold, OM throws an error so you can handle it explicitly.
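To make the routing rule concrete, here is a minimal standalone sketch of inclusive upper-bound selection. This is our own illustration of the behavior described above, not the `@mastra/memory` implementation, and `pickModelByInputTokens` is a hypothetical helper name:

```typescript
// Hypothetical sketch of inclusive upper-bound model routing.
// Illustrative only; not the @mastra/memory implementation.
type UpToThresholds = Record<number, string>;

function pickModelByInputTokens(upTo: UpToThresholds, inputTokens: number): string {
  // Sort thresholds ascending and pick the first bound the input fits under.
  const bounds = Object.keys(upTo)
    .map(Number)
    .sort((a, b) => a - b);
  for (const bound of bounds) {
    if (inputTokens <= bound) return upTo[bound]; // bounds are inclusive
  }
  // Mirrors OM's documented behavior: inputs beyond the largest
  // configured threshold are an explicit error, not a silent fallback.
  throw new Error(`Input of ${inputTokens} tokens exceeds the largest configured threshold`);
}

const thresholds: UpToThresholds = {
  10_000: "google/gemini-2.5-flash",
  40_000: "openai/gpt-4o",
  1_000_000: "openai/gpt-4.5",
};

pickModelByInputTokens(thresholds, 9_000);  // "google/gemini-2.5-flash"
pickModelByInputTokens(thresholds, 10_000); // inclusive bound: still "google/gemini-2.5-flash"
pickModelByInputTokens(thresholds, 25_000); // "openai/gpt-4o"
```

The explicit error on oversized inputs is the part worth planning for: decide up front whether you want to truncate, summarize, or fail the call when context outgrows your largest threshold.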
Tracing also got better here: observer and reflector spans are now clearer, and you can see which model was selected at runtime, along with the rationale. (PR #14614)
MongoDB Support for Datasets and Experiments
If you’re running Mastra on MongoDB, you can now store datasets and experiments directly in @mastra/mongodb, including full dataset item history and versioned "time travel" queries.
This unlocks a much nicer evaluation workflow for teams already standardized on MongoDB. You get versioned datasets (including item history), plus experiment CRUD and per-item experiment results, without needing to stand up a separate storage backend.
And if you’re already using MongoDBStore, this works automatically without additional configuration.
```typescript
import { MongoDBStore } from "@mastra/mongodb";

const store = new MongoDBStore({
  uri: "mongodb://localhost:27017",
  dbName: "my-app"
});

// Datasets
const dataset = await store.getStorage("datasets").createDataset({ name: "my-dataset" });
await store.getStorage("datasets").addItem({
  datasetId: dataset.id,
  input: { prompt: "hello" }
});

// Experiments
const experiment = await store.getStorage("experiments").createExperiment({
  name: "run-1",
  datasetId: dataset.id
});
```

You can create, update, and delete datasets and items with automatic version tracking, plus batch operations and item history tracking, which is especially useful when you need to reproduce an evaluation run later. (PR #14556)
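The "time travel" idea is easiest to see with an append-only history: each update becomes a new revision, and reads can target any past version. Here is a toy in-memory sketch of that concept (our own illustration, not `@mastra/mongodb`'s actual data model):

```typescript
// Toy append-only item history illustrating versioned "time travel" reads.
// Our illustration only; not @mastra/mongodb's actual schema or API.
interface ItemRevision {
  version: number;
  input: unknown;
}

class VersionedItem {
  private history: ItemRevision[] = [];

  // Each update appends a new revision instead of overwriting the item.
  update(input: unknown): number {
    const version = this.history.length + 1;
    this.history.push({ version, input });
    return version;
  }

  // "Time travel": read the item as it looked at a given version.
  atVersion(version: number): ItemRevision | undefined {
    return this.history.find((rev) => rev.version === version);
  }

  latest(): ItemRevision | undefined {
    return this.history[this.history.length - 1];
  }
}

const item = new VersionedItem();
item.update({ prompt: "hello" });        // version 1
item.update({ prompt: "hello, world" }); // version 2

item.atVersion(1)?.input; // { prompt: "hello" }
item.latest()?.version;   // 2
```

Because revisions are never destroyed, an evaluation run recorded against version 1 can be replayed later even after the dataset has moved on.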
Okta Auth and RBAC
There’s a new @mastra/auth-okta package for adding Okta-based SSO and role-based access control to your Mastra deployment.
This is aimed at the most common enterprise needs: central identity, group-based permissioning, and reliable JWT verification against Okta’s JWKS endpoint. You can map Okta groups to Mastra permissions, manage sessions, and even mix and match, for example pairing Okta RBAC with a different authentication provider.
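Conceptually, group-based permissioning boils down to a mapping from Okta groups to Mastra permissions, with a user's effective permissions being the union across their groups. A standalone sketch of that resolution step (group and permission names are hypothetical; see the `@mastra/auth-okta` docs for the real configuration surface):

```typescript
// Hypothetical sketch of resolving Okta groups to permissions.
// Illustrative only; not the @mastra/auth-okta API. All names are made up.
type GroupPermissionMap = Record<string, string[]>;

const groupPermissions: GroupPermissionMap = {
  "okta-admins": ["agents:read", "agents:write", "workflows:execute"],
  "okta-developers": ["agents:read", "workflows:execute"],
  "okta-viewers": ["agents:read"],
};

// Resolve the union of permissions across a user's Okta groups.
function resolvePermissions(groups: string[], map: GroupPermissionMap): Set<string> {
  const permissions = new Set<string>();
  for (const group of groups) {
    for (const permission of map[group] ?? []) {
      permissions.add(permission);
    }
  }
  return permissions;
}

const perms = resolvePermissions(["okta-developers", "okta-viewers"], groupPermissions);
perms.has("workflows:execute"); // true
perms.has("agents:write");      // false
```

Unknown groups simply contribute nothing, which keeps the mapping fail-closed: a user only gains a permission through a group you explicitly configured.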
This release also includes a security hardening pass to improve defaults and address code review feedback, including:
- avoiding cache poisoning by ensuring group-fetch errors propagate and allow retries,
- keeping cookies under browser limits by not storing access/refresh tokens,
- adding `id_token_hint` support for Okta logout,
- better production warnings and documentation for required env vars and options.
Breaking Changes
No breaking changes were called out in this changelog.
Other Notable Updates
- Evaluate workflow upgrades: Added dataset targeting (associate datasets with agents, scorers, or workflows via `targetType` and `targetIds`), experiment result `status` (`needs-review`, `reviewed`, `complete`), new dataset-driven experiment routes, LLM dataset item generation endpoints, and LLM-assisted failure clustering and tag proposals (PR #14470)
- Pin experiments to an agent version: You can now pass `agentVersion` when triggering experiments; it’s stored and returned in responses to make runs reproducible across agent changes (PR #14562)

```typescript
import { MastraClient } from "@mastra/client-js";

const client = new MastraClient();

await client.triggerDatasetExperiment({
  datasetId: "my-dataset",
  targetType: "agent",
  targetId: "my-agent",
  version: 3, // pin to dataset version 3
  agentVersion: "ver_abc123" // pin to a specific agent version
});
```

- Harness tool suspension handling: If a tool calls `suspend()` during execution, the harness emits a `tool_suspended` event, reports `agent_end` with reason `suspended`, and exposes `respondToToolSuspension()` for resuming with user-provided data (PR #14611)

```typescript
harness.subscribe((event) => {
  if (event.type === "tool_suspended") {
    // event.toolName, event.suspendPayload, event.resumeSchema
  }
});

await harness.respondToToolSuspension({
  resumeData: { confirmed: true }
});
```

- Agent tool context now includes `agentId`: Tools can read `context.agent.agentId`, useful for per-agent behavior, shared config lookup, or metadata-driven tool execution (PR #14502)
- Observability storage improvements: Added typed storage fields for correlation context and cost, richer metric aggregations (including estimated cost), and improved filter parity across logs and metrics (PR #14607)
- New observability APIs and client methods: Logs, scores, feedback, metrics (aggregate, breakdown, time series, percentiles), plus discovery endpoints and client helpers (PR #14470)
- Workspace skills disambiguation: Optional `?path=` query parameter for same-named skills, `SkillMetadata.path` is now included, and `list()` now returns all same-named skills to help UIs and agents disambiguate (PR #14430)
- Server adapter auth helpers: Added `createAuthMiddleware({ mastra })` for mounting raw framework routes while still running Mastra auth middleware, with optional `requiresAuth: false` for public endpoints (PR #14458)
- `getAuthenticatedUser()` for middleware: Server middleware can resolve the configured auth user without changing route auth behavior (PR #14458)
- Metrics dashboard storage detection: `/system/packages` now returns `observabilityStorageType`, helping the UI and your app understand whether metrics are supported and whether they persist across restarts (PR #14620)
- Harness teardown fix: `Harness.destroy()` now cleans up heartbeats and workspace properly (PR #14568)
- Tool input validation fix: Null detection now checks actual failing values instead of relying on error message string matching, improving robustness with LLM-provided nulls (PR #14496)
- Tracing fix for streaming: Tool lists are now included in agent traces for streaming runs (useful for exporters like Datadog LLM Observability) (PR #14550)
- Sequential tool-only loop fix: Inserts a `step-start` boundary between consecutive tool-only iterations so models don’t misinterpret them as parallel calls (PR #14652)
- Anthropic tool ordering fix: Correctly splits tool blocks when client tools and provider tools run in parallel to avoid unrecoverable Anthropic message-history errors (PR #14648)
- Zod v3 and v4 compatibility: Public structured-output APIs now work cleanly with either `zod/v3` or `zod/v4`, matching the peer dependency range across packages (PR #14464)
- Schema-compat ESM import fix: Removed `createRequire` usage in the Zod v4 adapter to avoid ESM issues while preserving v3 and v4 support (PR #14617)
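To illustrate the `requiresAuth: false` idea from the server adapter auth helpers above, here is a framework-agnostic toy of a secure-by-default auth gate. This is our own sketch of the pattern, not Mastra's `createAuthMiddleware`, and the request/response shapes are invented for the example:

```typescript
// Toy secure-by-default auth gate with a requiresAuth: false escape hatch.
// Our illustration of the pattern only; not Mastra's createAuthMiddleware.
interface ToyRequest {
  headers: Record<string, string | undefined>;
}

interface RouteOptions {
  requiresAuth?: boolean; // defaults to true: routes are protected unless opted out
}

function authGate(req: ToyRequest, options: RouteOptions = {}): { ok: boolean; reason?: string } {
  const requiresAuth = options.requiresAuth ?? true;
  if (!requiresAuth) return { ok: true }; // public endpoint: skip verification entirely
  const token = req.headers["authorization"];
  if (!token?.startsWith("Bearer ")) {
    return { ok: false, reason: "missing bearer token" };
  }
  // A real middleware would verify the token here (e.g. against a JWKS endpoint).
  return { ok: true };
}

authGate({ headers: {} });                              // rejected: no token
authGate({ headers: {} }, { requiresAuth: false });     // allowed: public route
authGate({ headers: { authorization: "Bearer abc" } }); // allowed: token present
```

The important design property is the default: a route you forget to configure stays protected, and public access is an explicit, per-route opt-out.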
That's all for @mastra/core@1.16.0!
Happy building! 🚀
