SemanticRecall

SemanticRecall is a hybrid processor that enables semantic search over conversation history using vector embeddings. On input, it performs a semantic search to find relevant historical messages; on output, it creates embeddings for new messages so they can be recalled later.

Usage example

import { SemanticRecall } from "@mastra/core/processors";
import { openai } from "@ai-sdk/openai";

const processor = new SemanticRecall({
  storage: memoryStorage,
  vector: vectorStore,
  embedder: openai.embedding("text-embedding-3-small"),
  topK: 5,
  messageRange: 2,
  scope: "resource",
});
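
Here memoryStorage and vectorStore stand for pre-configured MemoryStorage and MastraVector instances; see the extended example below for concrete Postgres-backed implementations.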

Constructor parameters

options: SemanticRecallOptions
Configuration options for the semantic recall processor.

Options

storage: MemoryStorage
Storage instance for retrieving messages.

vector: MastraVector
Vector store for semantic search.

embedder: MastraEmbeddingModel<string>
Embedder for generating query embeddings.

topK?: number
Number of most similar messages to retrieve.

messageRange?: number | { before: number; after: number }
Number of context messages to include before and after each match. A single number applies the same window to both sides; an object sets them separately.

scope?: 'thread' | 'resource'
Scope of semantic search: 'thread' searches within the current thread only; 'resource' searches across all threads for the resource.

threshold?: number
Minimum similarity score threshold (0-1). Messages scoring below this threshold are filtered out.

indexName?: string
Index name for the vector store. If not provided, it is auto-generated from the embedder model.

logger?: IMastraLogger
Optional logger instance for structured logging.
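
As a quick illustration of the two messageRange forms, the sketch below constructs two processors; baseOptions is a hypothetical object holding the required storage, vector, and embedder fields, not a Mastra export.

import { SemanticRecall } from "@mastra/core/processors";

// A single number applies the same context window on both sides of each match...
const symmetric = new SemanticRecall({ ...baseOptions, messageRange: 2 });

// ...while an object sets the before/after windows independently.
const asymmetric = new SemanticRecall({
  ...baseOptions,
  messageRange: { before: 2, after: 1 },
});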

Returns

id: string
Processor identifier, set to 'semantic-recall'.

name: string
Processor display name, set to 'SemanticRecall'.

processInput: (args: { messages: MastraDBMessage[]; messageList: MessageList; abort: (reason?: string) => never; tracingContext?: TracingContext; requestContext?: RequestContext }) => Promise<MessageList | MastraDBMessage[]>
Performs semantic search over historical messages and adds relevant context to the message list.

processOutputResult: (args: { messages: MastraDBMessage[]; messageList?: MessageList; abort: (reason?: string) => never; tracingContext?: TracingContext; requestContext?: RequestContext }) => Promise<MessageList | MastraDBMessage[]>
Creates embeddings for new messages to enable future semantic search.
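
A minimal sketch of calling processInput directly, following the documented argument shape; the messages and messageList values are assumed to come from your own pipeline, and abort simply throws:

const result = await processor.processInput({
  messages,      // MastraDBMessage[] gathered by your pipeline (assumed)
  messageList,   // the current MessageList (assumed)
  abort: (reason?: string): never => {
    throw new Error(reason ?? "semantic recall aborted");
  },
});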

Extended usage example

src/mastra/agents/semantic-memory-agent.ts
import { Agent } from "@mastra/core/agent";
import { SemanticRecall, MessageHistory } from "@mastra/core/processors";
import { PostgresStorage, PgVector } from "@mastra/pg";
import { openai } from "@ai-sdk/openai";

const storage = new PostgresStorage({
  id: 'pg-storage',
  connectionString: process.env.DATABASE_URL,
});

const vector = new PgVector({
  id: 'pg-vector',
  connectionString: process.env.DATABASE_URL,
});

const semanticRecall = new SemanticRecall({
  storage,
  vector,
  embedder: openai.embedding("text-embedding-3-small"),
  topK: 5,
  messageRange: { before: 2, after: 1 },
  scope: "resource",
  threshold: 0.7,
});

export const agent = new Agent({
  name: "semantic-memory-agent",
  instructions: "You are a helpful assistant with semantic memory recall",
  model: "openai:gpt-4o",
  inputProcessors: [
    semanticRecall,
    new MessageHistory({ storage, lastMessages: 50 }),
  ],
  outputProcessors: [
    semanticRecall,
    new MessageHistory({ storage }),
  ],
});

Behavior

Input processing

  1. Extracts the user query from the last user message
  2. Generates embeddings for the query
  3. Performs vector search to find semantically similar messages
  4. Retrieves matched messages along with surrounding context (based on messageRange)
  5. For scope: 'resource', formats cross-thread messages as a system message with timestamps
  6. Adds recalled messages with source: 'memory' tag
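
A conceptual sketch of steps 1-4 in TypeScript; embed and search here are hypothetical stand-ins for the embedder and vector store, not Mastra internals:

type Recalled = { id: string; text: string; score: number };

async function recallForQuery(
  query: string,
  embed: (text: string) => Promise<number[]>,
  search: (vector: number[], topK: number) => Promise<Recalled[]>,
  topK = 5,
  threshold = 0.7,
): Promise<Recalled[]> {
  const queryVector = await embed(query);          // steps 1-2: embed the user query
  const matches = await search(queryVector, topK); // step 3: vector search for topK matches
  return matches.filter((m) => m.score >= threshold); // drop hits below the similarity threshold
}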

Output processing

  1. Extracts text content from new user and assistant messages
  2. Generates embeddings for each message
  3. Stores embeddings in the vector store with metadata (message ID, thread ID, resource ID, role, content, timestamp)
  4. Uses LRU caching for embeddings to avoid redundant API calls
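
The caching step can be pictured as memoization keyed by message ID; the sketch below is an illustrative stand-in for Mastra's LRU cache, not its actual implementation:

// Simplified memo cache: embed each message at most once.
const embeddingCache = new Map<string, number[]>();

async function embedOnce(
  messageId: string,
  text: string,
  embed: (t: string) => Promise<number[]>,
): Promise<number[]> {
  const cached = embeddingCache.get(messageId);
  if (cached) return cached;        // cache hit: skip the embedding API call
  const vector = await embed(text); // cache miss: embed and remember
  embeddingCache.set(messageId, vector);
  return vector;
}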

Cross-thread recall

When scope is set to 'resource', the processor can recall messages from other threads. These cross-thread messages are formatted as a system message with timestamps and conversation labels to provide context about when and where the conversation occurred.
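
A hypothetical illustration of the resulting system message shape; the exact labels and formatting Mastra produces may differ:

// Illustrative only: timestamps and thread labels frame each recalled line.
const crossThreadRecall = {
  role: "system",
  content:
    "Recalled from earlier conversations:\n" +
    '[2024-01-03T14:22:00Z] (thread: onboarding) user: "We deploy to Vercel."',
};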