ResponseCache

ResponseCache is an input processor that caches LLM responses at the request/response boundary inside the agentic loop. It hooks into `processLLMRequest` (cache lookup; short-circuits on a hit) and `processLLMResponse` (cache write on completion).

The cache key is derived from the resolved `LanguageModelV2Prompt` Mastra is about to send to the model — i.e. after memory has been loaded and earlier input processors have transformed the prompt — so two users with different memory contexts produce different cache keys. Each step in an agentic tool loop is cached independently.

There is no agent-level option for response caching; register `ResponseCache` explicitly on `inputProcessors`. Per-call overrides flow through `RequestContext` via `ResponseCache.context()` and `ResponseCache.applyContext()`.

Usage example

import { Agent } from '@mastra/core/agent'
import { InMemoryServerCache } from '@mastra/core/cache'
import { ResponseCache } from '@mastra/core/processors'

const cache = new InMemoryServerCache()

const agent = new Agent({
  name: 'Search Agent',
  instructions: 'You answer questions concisely.',
  model: 'openai/gpt-5',
  inputProcessors: [new ResponseCache({ cache, ttl: 600 })],
})

// First call hits the LLM and writes to the cache.
await agent.generate('What is the capital of France?')

// Second identical call replays the cached response.
await agent.generate('What is the capital of France?')

// Force a fresh call but still update the cache.
await agent.generate('What is the capital of France?', {
  requestContext: ResponseCache.context({ bust: true }),
})

See Response caching for the conceptual overview, scoping rules, and recommended deployment patterns.

Constructor parameters

cache: MastraServerCache (required)
The cache backend. Pass any `MastraServerCache` implementation: `InMemoryServerCache` for local development, `RedisCache` from `@mastra/redis` for production, or your own subclass for a custom backend.

ttl?: number = 300
Time-to-live, in seconds, for entries written by this processor. The default of 300 seconds (5 minutes) matches OpenRouter's reference implementation.

scope?: string | null
Tenant scope appended to the cache key. `null` opts out of scoping. When omitted, the processor falls back to the resource id resolved from the request context (`MASTRA_RESOURCE_ID_KEY`) for automatic per-user isolation. See the sketch after this list.

key?: string | (inputs: ResponseCacheKeyInputs) => string | Promise<string>
Override the auto-derived cache key. Pass a string to pin a key, or a function that receives `{ agentId, scope, model, prompt, stepNumber }` and returns a key. If the function throws, the processor falls back to the deterministic hash so the call still benefits from caching.

bust?: boolean = false
Force a cache miss on every call: skip the read but still write on completion. Useful for explicit refresh paths.

agentId?: string = 'mastra-response-cache'
Logical id used in the cache key namespace. Set it to the owning agent's id when you want cache entries scoped per agent.
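
For example, a per-tenant setup might look like the sketch below; `tenantId` is a hypothetical value resolved by your own request handling, and `cache` is the backend from the usage example above.

// Sketch: option names come from the table above; the values are illustrative.
const perTenantCache = new ResponseCache({
  cache,
  ttl: 3600,                    // keep entries for an hour instead of the 300s default
  scope: `tenant:${tenantId}`,  // share entries within a tenant rather than per resource id
  agentId: 'search-agent',      // namespace cache keys under the owning agent
  // key: 'search-agent:daily-briefing', // a string key pins every call to one entry
})

Passing `scope: null` instead would share entries across every caller of the agent.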

Static helpers

ResponseCache exposes two static helpers for setting per-call overrides on a `RequestContext`. The helpers treat the underlying context key as an implementation detail; prefer them over reading or writing the raw key directly.

ResponseCache.context(options)

Build a fresh RequestContext preloaded with per-call response cache overrides.

await agent.stream('hello', {
  requestContext: ResponseCache.context({ key: 'custom', bust: true }),
})

ResponseCache.applyContext(requestContext, options)

Merge per-call response cache overrides into an existing RequestContext. Returns the same context for chaining.

const ctx = new RequestContext()
ctx.set('caller-meta', { userId: 'u-123' })
ResponseCache.applyContext(ctx, { bust: true })
await agent.stream('hello', { requestContext: ctx })
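
Because `applyContext` returns the context it was passed, the call can also be chained inline:

// Build the context and apply the override in a single expression.
await agent.stream('hello', {
  requestContext: ResponseCache.applyContext(new RequestContext(), { bust: true }),
})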

ResponseCacheContextOptions

The shape passed to `ResponseCache.context()` / `ResponseCache.applyContext()`.

key?: string | (inputs: ResponseCacheKeyInputs) => string | Promise<string>
Overrides the auto-derived cache key for this request only.

scope?: string | null
Overrides the tenant scope for this request only. `null` opts out of scoping.

bust?: boolean
Skip the cache read but still write on completion.

`cache`, `ttl`, and `agentId` are intentionally not overridable per call — they are instance-level concerns that should not vary per request.
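
For example, a single request can opt out of scoping to read and write a shared entry:

// This one call bypasses per-user scoping and hits the shared, unscoped entry.
await agent.generate('What is the capital of France?', {
  requestContext: ResponseCache.context({ scope: null }),
})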

ResponseCacheKeyInputs

The argument passed to a `key` function (constructor or per-call). All fields contribute to the deterministic hash by default. A per-call example follows the list.

agentId: string
Logical processor id used to namespace the cache key.

scope?: string | null | undefined
Resolved scope for this request, or `null` when scoping is disabled.

model: { provider?: string; modelId?: string; specVersion?: string }
Provider/model identity. Different models produce different responses.

prompt: LanguageModelV2Prompt
The exact prompt the provider would receive, after memory has loaded and any prompt-modifying input processors have run.

stepNumber: number
0-indexed step number within the agentic loop. Greater than zero for tool steps.
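
As a sketch, a per-call key function can combine these inputs by hand; the key layout below is illustrative only.

// Hand-built key from the documented inputs. Note that it ignores `prompt`,
// so it is only safe when this call site always sends the same prompt.
await agent.generate('Summarize the onboarding guide', {
  requestContext: ResponseCache.context({
    key: ({ agentId, scope, model, stepNumber }) =>
      `${agentId}:${scope ?? 'global'}:${model.modelId ?? 'unknown'}:onboarding:step-${stepNumber}`,
  }),
})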

Helper exports

  • buildResponseCacheKey(inputs) — the deterministic hash used by default. Reuse it inside a custom key function to override individual fields while preserving the rest of the standard key shape; see the sketch below.
  • DEFAULT_RESPONSE_CACHE_TTL_SECONDS — the default ttl in seconds (300).
  • RESPONSE_CACHE_CONTEXT_KEY — the RequestContext key the static helpers write to. Exposed for advanced cases (e.g. clearing the override mid-pipeline); prefer the helpers.
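
For example, a key function can override a single field and delegate the rest to the default hash. The import path below is an assumption; check where your Mastra version exports these helpers.

// Sketch: reuse the default deterministic hash but zero out the step number,
// so tool-loop steps are distinguished only by their (differing) prompts.
import { buildResponseCacheKey, ResponseCache } from '@mastra/core/processors'

const processor = new ResponseCache({
  cache, // any MastraServerCache instance, as in the usage example
  key: (inputs) => buildResponseCacheKey({ ...inputs, stepNumber: 0 }),
})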