SemanticRecall
SemanticRecall is a hybrid processor that enables semantic search over conversation history using vector embeddings. On input, it performs a semantic search to find relevant historical messages. On output, it creates embeddings for new messages to enable future semantic retrieval.
Usage example
import { SemanticRecall } from "@mastra/core/processors";
import { openai } from "@ai-sdk/openai";
const processor = new SemanticRecall({
storage: memoryStorage,
vector: vectorStore,
embedder: openai.embedding("text-embedding-3-small"),
topK: 5,
messageRange: 2,
scope: "resource",
});
Constructor parameters
options:
SemanticRecallOptions
Configuration options for the semantic recall processor
Options
storage:
MemoryStorage
Storage instance for retrieving messages
vector:
MastraVector
Vector store for semantic search
embedder:
MastraEmbeddingModel<string>
Embedder for generating query embeddings
topK?:
number
Number of most similar messages to retrieve
messageRange?:
number | { before: number; after: number }
Number of context messages to include before/after each match. Can be a single number (same for both) or an object with separate values
scope?:
'thread' | 'resource'
Scope of semantic search. 'thread' searches within the current thread only. 'resource' searches across all threads for the resource
threshold?:
number
Minimum similarity score threshold (0-1). Messages below this threshold are filtered out
indexName?:
string
Index name for the vector store. If not provided, auto-generated based on embedder model
logger?:
IMastraLogger
Optional logger instance for structured logging
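Since messageRange accepts either a single number or a { before, after } object, it helps to see how the two forms relate. The sketch below shows one way such a value could be normalized; the helper name is illustrative, not part of the Mastra API.

```typescript
// Sketch: normalizing a `number | { before; after }` messageRange value.
// This mirrors the option's documented type; the helper is ours.
type MessageRange = number | { before: number; after: number };

function normalizeMessageRange(range: MessageRange): { before: number; after: number } {
  // A bare number means the same context window on both sides of a match.
  return typeof range === "number" ? { before: range, after: range } : range;
}

console.log(normalizeMessageRange(2));                       // { before: 2, after: 2 }
console.log(normalizeMessageRange({ before: 2, after: 1 })); // { before: 2, after: 1 }
```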
Returns
id:
string
Processor identifier set to 'semantic-recall'
name:
string
Processor display name set to 'SemanticRecall'
processInput:
(args: { messages: MastraDBMessage[]; messageList: MessageList; abort: (reason?: string) => never; tracingContext?: TracingContext; requestContext?: RequestContext }) => Promise<MessageList | MastraDBMessage[]>
Performs semantic search on historical messages and adds relevant context to the message list
processOutputResult:
(args: { messages: MastraDBMessage[]; messageList?: MessageList; abort: (reason?: string) => never; tracingContext?: TracingContext; requestContext?: RequestContext }) => Promise<MessageList | MastraDBMessage[]>
Creates embeddings for new messages to enable future semantic search
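For orientation, the two hooks above can be pictured with the following pass-through sketch. The types here are simplified stand-ins (the real MastraDBMessage and MessageList types carry more fields), so treat this as an illustration of the hook shape, not a drop-in implementation.

```typescript
// Sketch of the processor hook shape with simplified stand-in types;
// the real MastraDBMessage and MessageList types carry more fields.
type Message = { role: "user" | "assistant" | "system"; content: string };

interface ProcessorHooks {
  id: string;
  name: string;
  processInput(args: { messages: Message[]; abort: (reason?: string) => never }): Promise<Message[]>;
  processOutputResult(args: { messages: Message[]; abort: (reason?: string) => never }): Promise<Message[]>;
}

// A pass-through processor illustrating where the two hooks run.
const passthrough: ProcessorHooks = {
  id: "passthrough",
  name: "Passthrough",
  async processInput({ messages }) {
    // SemanticRecall would prepend recalled context here.
    return messages;
  },
  async processOutputResult({ messages }) {
    // SemanticRecall would embed and persist new messages here.
    return messages;
  },
};
```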
Extended usage example
src/mastra/agents/semantic-memory-agent.ts
import { Agent } from "@mastra/core/agent";
import { SemanticRecall, MessageHistory } from "@mastra/core/processors";
import { PostgresStorage } from "@mastra/pg";
import { PgVector } from "@mastra/pg";
import { openai } from "@ai-sdk/openai";
const storage = new PostgresStorage({
id: 'pg-storage',
connectionString: process.env.DATABASE_URL,
});
const vector = new PgVector({
id: 'pg-vector',
connectionString: process.env.DATABASE_URL,
});
const semanticRecall = new SemanticRecall({
storage,
vector,
embedder: openai.embedding("text-embedding-3-small"),
topK: 5,
messageRange: { before: 2, after: 1 },
scope: "resource",
threshold: 0.7,
});
export const agent = new Agent({
name: "semantic-memory-agent",
instructions: "You are a helpful assistant with semantic memory recall",
model: "openai:gpt-4o",
inputProcessors: [
semanticRecall,
new MessageHistory({ storage, lastMessages: 50 }),
],
outputProcessors: [
semanticRecall,
new MessageHistory({ storage }),
],
});
Behavior
Input processing
- Extracts the user query from the last user message
- Generates embeddings for the query
- Performs vector search to find semantically similar messages
- Retrieves matched messages along with surrounding context (based on messageRange)
- For scope: 'resource', formats cross-thread messages as a system message with timestamps
- Adds recalled messages with a source: 'memory' tag
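The context-expansion step can be sketched as follows: given the positions of vector-search hits within a thread's message array and a normalized messageRange, collect each hit plus its neighbors, deduplicated and in conversation order. This is an illustrative reconstruction, not Mastra's actual implementation.

```typescript
// Sketch: expand vector-search hits into context windows.
// `matchIndices` are positions of semantically similar messages in the
// thread's message array; `range` is a normalized messageRange.
function expandWithContext<T>(
  messages: T[],
  matchIndices: number[],
  range: { before: number; after: number },
): T[] {
  const keep = new Set<number>();
  for (const i of matchIndices) {
    const start = Math.max(0, i - range.before);
    const end = Math.min(messages.length - 1, i + range.after);
    for (let j = start; j <= end; j++) keep.add(j);
  }
  // Emit in original conversation order, each message at most once.
  return [...keep].sort((a, b) => a - b).map((i) => messages[i]);
}

const msgs = ["m0", "m1", "m2", "m3", "m4", "m5"];
console.log(expandWithContext(msgs, [3], { before: 2, after: 1 }));
// → ["m1", "m2", "m3", "m4"]
```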
Output processing
- Extracts text content from new user and assistant messages
- Generates embeddings for each message
- Stores embeddings in the vector store with metadata (message ID, thread ID, resource ID, role, content, timestamp)
- Uses LRU caching for embeddings to avoid redundant API calls
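The embedding cache mentioned above can be approximated with a Map-based LRU keyed by message text, so re-embedding identical content skips the API call. This is a minimal sketch; Mastra's internal cache may differ in keying and eviction details.

```typescript
// Minimal LRU cache sketch for embeddings, keyed by message text.
// Map iteration follows insertion order, so the first key is always
// the least recently used entry.
class LruCache<V> {
  private map = new Map<string, V>();
  constructor(private capacity: number) {}

  get(key: string): V | undefined {
    const value = this.map.get(key);
    if (value !== undefined) {
      // Refresh recency by re-inserting the entry at the end.
      this.map.delete(key);
      this.map.set(key, value);
    }
    return value;
  }

  set(key: string, value: V): void {
    if (this.map.has(key)) this.map.delete(key);
    else if (this.map.size >= this.capacity) {
      // Evict the least recently used entry.
      this.map.delete(this.map.keys().next().value!);
    }
    this.map.set(key, value);
  }
}

const cache = new LruCache<number[]>(2);
cache.set("hello", [0.1, 0.2]);
cache.set("world", [0.3, 0.4]);
cache.get("hello");          // refresh "hello"
cache.set("again", [0.5]);   // evicts "world", the least recently used
console.log(cache.get("world")); // undefined
```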
Cross-thread recall
When scope is set to 'resource', the processor can recall messages from other threads. These cross-thread messages are formatted as a system message with timestamps and conversation labels to provide context about when and where the conversation occurred.
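As an illustration of that formatting, cross-thread matches could be rendered into a single system message as below. The field names and labels are illustrative assumptions, not Mastra's exact schema or output.

```typescript
// Sketch: render cross-thread recalled messages as one system message
// body with timestamps and thread labels. Field names are illustrative.
type RecalledMessage = {
  threadId: string;
  role: "user" | "assistant";
  content: string;
  createdAt: Date;
};

function formatCrossThreadRecall(recalled: RecalledMessage[]): string {
  const lines = recalled.map(
    (m) => `[${m.createdAt.toISOString()}] (thread ${m.threadId}) ${m.role}: ${m.content}`,
  );
  return ["Recalled from earlier conversations:", ...lines].join("\n");
}

console.log(
  formatCrossThreadRecall([
    {
      threadId: "t-1",
      role: "user",
      content: "My favorite color is green.",
      createdAt: new Date("2024-05-01T10:00:00Z"),
    },
  ]),
);
// Recalled from earlier conversations:
// [2024-05-01T10:00:00.000Z] (thread t-1) user: My favorite color is green.
```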