# Observational Memory

**Added in:** `@mastra/memory@1.1.0`

Observational Memory (OM) is Mastra's memory system for long-context agentic memory. Two background agents — an **Observer** that watches conversations and creates observations, and a **Reflector** that restructures observations by combining related items, reflecting on overarching patterns, and condensing where possible — maintain an observation log that replaces raw message history as it grows.

## Usage

```typescript
import { Memory } from "@mastra/memory";
import { Agent } from "@mastra/core/agent";

export const agent = new Agent({
  name: "my-agent",
  instructions: "You are a helpful assistant.",
  model: "openai/gpt-5-mini",
  memory: new Memory({
    options: {
      observationalMemory: true,
    },
  }),
});
```

## Configuration

The `observationalMemory` option accepts `true`, `false`, or a configuration object. Setting `observationalMemory: true` enables it with all defaults. Setting `observationalMemory: false` or omitting it disables it.

**enabled?:** (`boolean`): Enable or disable Observational Memory. When omitted from a config object, defaults to `true`. Only `enabled: false` explicitly disables it. (Default: `true`)

**model?:** (`string | LanguageModel | DynamicModel | ModelWithRetries[]`): Model for both the Observer and Reflector agents. Sets the model for both at once. Cannot be used together with `observation.model` or `reflection.model` — an error will be thrown if both are set. (Default: `'google/gemini-2.5-flash'`)

**scope?:** (`'resource' | 'thread'`): Memory scope for observations. `'thread'` keeps observations per-thread. `'resource'` shares observations across all threads for a resource, enabling cross-conversation memory. (Default: `'thread'`)

**shareTokenBudget?:** (`boolean`): Share the token budget between messages and observations. When enabled, the total budget is `observation.messageTokens + reflection.observationTokens`. Messages can use more space when observations are small, and vice versa. This maximizes context usage through flexible allocation. (Default: `false`)

**observation?:** (`ObservationalMemoryObservationConfig`): Configuration for the observation step. Controls when the Observer agent runs and how it behaves.

**reflection?:** (`ObservationalMemoryReflectionConfig`): Configuration for the reflection step. Controls when the Reflector agent runs and how it behaves.

### Observation config

**model?:** (`string | LanguageModel | DynamicModel | ModelWithRetries[]`): Model for the Observer agent. Cannot be set if a top-level `model` is also provided. (Default: `'google/gemini-2.5-flash'`)

**messageTokens?:** (`number`): Token count of unobserved messages that triggers observation. When unobserved message tokens exceed this threshold, the Observer agent is called. (Default: `30000`)

**maxTokensPerBatch?:** (`number`): Maximum tokens per batch when observing multiple threads in resource scope. Threads are chunked into batches of this size and processed in parallel. Lower values mean more parallelism but more API calls. (Default: `10000`)

**modelSettings?:** (`ObservationalMemoryModelSettings`): Model settings for the Observer agent. (Default: `{ temperature: 0.3, maxOutputTokens: 100_000 }`)
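For instance, a minimal sketch of tuning just the observation step using the options above (the threshold and batch sizes here are illustrative, not recommendations):

```typescript
import { Memory } from "@mastra/memory";

// Illustrative values: trigger the Observer once ~15k tokens of unobserved
// messages accumulate, and in resource scope chunk threads into ~5k-token
// batches that are observed in parallel.
const memory = new Memory({
  options: {
    observationalMemory: {
      scope: "resource",
      observation: {
        messageTokens: 15_000,
        maxTokensPerBatch: 5_000,
      },
    },
  },
});
```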
### Reflection config

**model?:** (`string | LanguageModel | DynamicModel | ModelWithRetries[]`): Model for the Reflector agent. Cannot be set if a top-level `model` is also provided. (Default: `'google/gemini-2.5-flash'`)

**observationTokens?:** (`number`): Token count of observations that triggers reflection. When observation tokens exceed this threshold, the Reflector agent is called to condense them. (Default: `40000`)

**modelSettings?:** (`ObservationalMemoryModelSettings`): Model settings for the Reflector agent. (Default: `{ temperature: 0, maxOutputTokens: 100_000 }`)

### Model settings

**temperature?:** (`number`): Temperature for generation. Lower values produce more consistent output. (Default: `0.3`)

**maxOutputTokens?:** (`number`): Maximum output tokens. Set high to prevent truncation of observations. (Default: `100000`)

## Examples

### Resource scope with custom thresholds

```typescript
import { Memory } from "@mastra/memory";
import { Agent } from "@mastra/core/agent";

export const agent = new Agent({
  name: "my-agent",
  instructions: "You are a helpful assistant.",
  model: "openai/gpt-5-mini",
  memory: new Memory({
    options: {
      observationalMemory: {
        scope: "resource",
        observation: {
          messageTokens: 20_000,
        },
        reflection: {
          observationTokens: 60_000,
        },
      },
    },
  }),
});
```

### Shared token budget

```typescript
import { Memory } from "@mastra/memory";
import { Agent } from "@mastra/core/agent";

export const agent = new Agent({
  name: "my-agent",
  instructions: "You are a helpful assistant.",
  model: "openai/gpt-5-mini",
  memory: new Memory({
    options: {
      observationalMemory: {
        shareTokenBudget: true,
        observation: {
          messageTokens: 20_000,
        },
        reflection: {
          observationTokens: 80_000,
        },
      },
    },
  }),
});
```

When `shareTokenBudget` is enabled, the total budget is `observation.messageTokens + reflection.observationTokens` (100k in this example). If observations only use 30k tokens, messages can expand to use up to 70k. If messages are short, observations have more room before triggering reflection.

### Custom model

```typescript
import { Memory } from "@mastra/memory";
import { Agent } from "@mastra/core/agent";

export const agent = new Agent({
  name: "my-agent",
  instructions: "You are a helpful assistant.",
  model: "openai/gpt-5-mini",
  memory: new Memory({
    options: {
      observationalMemory: {
        model: "openai/gpt-4o-mini",
      },
    },
  }),
});
```

### Different models per agent

```typescript
import { Memory } from "@mastra/memory";
import { Agent } from "@mastra/core/agent";

export const agent = new Agent({
  name: "my-agent",
  instructions: "You are a helpful assistant.",
  model: "openai/gpt-5-mini",
  memory: new Memory({
    options: {
      observationalMemory: {
        observation: {
          model: "google/gemini-2.5-flash",
        },
        reflection: {
          model: "openai/gpt-4o-mini",
        },
      },
    },
  }),
});
```
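### Custom model settings

A minimal sketch of overriding `modelSettings` for each background agent. The temperature values here are illustrative; when omitted, the defaults listed under Configuration apply.

```typescript
import { Memory } from "@mastra/memory";
import { Agent } from "@mastra/core/agent";

export const agent = new Agent({
  name: "my-agent",
  instructions: "You are a helpful assistant.",
  model: "openai/gpt-5-mini",
  memory: new Memory({
    options: {
      observationalMemory: {
        observation: {
          // Slightly higher temperature for the Observer; keep output
          // headroom high so long observation batches are not truncated.
          modelSettings: { temperature: 0.5, maxOutputTokens: 100_000 },
        },
        reflection: {
          // Deterministic output when the Reflector condenses the log.
          modelSettings: { temperature: 0, maxOutputTokens: 100_000 },
        },
      },
    },
  }),
});
```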
## Standalone usage

Most users should use the `Memory` class above. Using `ObservationalMemory` directly is mainly useful for benchmarking, experimentation, or when you need to control processor ordering with other processors (like [guardrails](https://mastra.ai/docs/agents/guardrails)).

```typescript
import { ObservationalMemory } from "@mastra/memory/processors";
import { Agent } from "@mastra/core/agent";
import { LibSQLStore } from "@mastra/libsql";

const storage = new LibSQLStore({
  id: "my-storage",
  url: "file:./memory.db",
});

const om = new ObservationalMemory({
  storage: storage.stores.memory,
  model: "google/gemini-2.5-flash",
  scope: "resource",
  observation: {
    messageTokens: 20_000,
  },
  reflection: {
    observationTokens: 60_000,
  },
});

export const agent = new Agent({
  name: "my-agent",
  instructions: "You are a helpful assistant.",
  model: "openai/gpt-5-mini",
  inputProcessors: [om],
  outputProcessors: [om],
});
```

### Standalone config

The standalone `ObservationalMemory` class accepts all the same options as the `observationalMemory` config object above, plus the following:

**storage:** (`MemoryStorage`): Storage adapter for persisting observations. Must be a MemoryStorage instance (from `MastraStorage.stores.memory`).

**onDebugEvent?:** (`(event: ObservationDebugEvent) => void`): Debug callback for observation events. Called whenever observation-related events occur. Useful for debugging and understanding the observation flow.

**obscureThreadIds?:** (`boolean`): When enabled, thread IDs are hashed before being included in observation context. This prevents the LLM from recognizing patterns in thread identifiers. Automatically enabled when using resource scope through the Memory class. (Default: `false`)

### Related

- [Observational Memory](https://mastra.ai/docs/memory/observational-memory)
- [Memory Overview](https://mastra.ai/docs/memory/overview)
- [Memory Class](https://mastra.ai/reference/memory/memory-class)
- [Memory Processors](https://mastra.ai/docs/memory/memory-processors)
- [Processors](https://mastra.ai/docs/agents/processors)