# createVectorQueryTool()

The `createVectorQueryTool()` function creates a tool for semantic search over vector stores. It supports filtering, reranking, and database-specific configuration, and integrates with a variety of vector store backends.
## Basic Usage
```typescript
import { createVectorQueryTool } from '@mastra/rag'
import { ModelRouterEmbeddingModel } from '@mastra/core/llm'

const queryTool = createVectorQueryTool({
  vectorStoreName: 'pinecone',
  indexName: 'docs',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
})
```
## Parameters
> **Parameter Requirements:** Most fields can be set at creation as defaults. Some fields can be overridden at runtime via the request context or input. If a required field is missing from both creation and runtime, an error will be thrown. Note that `model`, `id`, and `description` can only be set at creation time.
- `id?`: Custom identifier for the tool. Defaults to `VectorQuery {vectorStoreName} {indexName} Tool`. Can only be set at creation time.
- `description?`: Custom description of the tool's purpose. Can only be set at creation time.
- `model`: Embedding model used to embed the query text. Can only be set at creation time.
- `vectorStoreName`: Name of the vector store to query. Optional when a `vectorStore` instance is passed directly.
- `indexName`: Name of the index within the vector store.
- `enableFilter?`: Enables agent-driven metadata filtering based on the user's query.
- `includeVectors?`: Whether to include embedding vectors in the results.
- `includeSources?`: Whether to include the full `sources` array of `QueryResult` objects in the output.
- `reranker?`: Reranking configuration. See RerankConfig below.
- `databaseConfig?`: Database-specific query options. See DatabaseConfig below.
- `providerOptions?`: Provider-specific options passed to the embedding model.
- `vectorStore?`: A vector store instance, or a resolver function that returns one at runtime, used in place of a store registered by name.
### DatabaseConfig
The DatabaseConfig type allows you to specify database-specific configurations that are automatically applied to query operations. This enables you to take advantage of unique features and optimizations offered by different vector stores.
- `pinecone?`: Pinecone-specific options.
  - `namespace?`: Namespace to query, useful for isolating data sets within an index.
  - `sparseVector?`: Sparse vector (`indices` and `values`) for hybrid dense/sparse search.
- `pgvector?`: PostgreSQL/pgvector-specific options.
  - `minScore?`: Minimum similarity score; results below this threshold are filtered out.
  - `ef?`: HNSW search parameter; higher values improve accuracy at the cost of speed.
  - `probes?`: IVFFlat probe count; more probes improve recall at the cost of speed.
- `chroma?`: Chroma-specific options.
  - `where?`: Metadata filter applied to results.
  - `whereDocument?`: Filter applied to document content (e.g. `$contains`).
### RerankConfig
- `model`: Language model used to score and rerank results.
- `options?`: Reranking options.
  - `weights?`: Relative weights for the `semantic`, `vector`, and `position` scoring components.
  - `topK?`: Number of results to return after reranking.
## Returns
The tool returns an object with:
- `relevantContext`: Combined text from the most relevant retrieved chunks.
- `sources`: Array of the full `QueryResult` objects from the vector search.
### QueryResult object structure
```typescript
{
  id: string;       // Unique chunk/document identifier
  metadata: any;    // All metadata fields (document ID, etc.)
  vector: number[]; // Embedding vector (if available)
  score: number;    // Similarity score for this retrieval
  document: string; // Full chunk/document text (if available)
}
```
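For orientation, here is a minimal sketch, not the library's actual implementation, of how a `relevantContext` string could be assembled from `QueryResult` objects like those above (optionality of `vector` and `document` is an assumption for illustration):

```typescript
// Hypothetical shape mirroring the QueryResult structure above.
interface QueryResult {
  id: string
  metadata: any
  vector?: number[] // may be absent unless includeVectors is set
  score: number
  document?: string // may be absent for some stores
}

// Illustrative only: concatenate available chunk texts into one context string.
// The real assembly logic is internal to @mastra/rag and may differ.
function buildRelevantContext(results: QueryResult[]): string {
  return results
    .map((r) => r.document)
    .filter((d): d is string => typeof d === 'string')
    .join('\n\n')
}
```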
## Default Tool Description
The default description focuses on:
- Finding relevant information in stored knowledge
- Answering user questions
- Retrieving factual content
## Result Handling
The tool determines the number of results to return based on the user's query, with a default of 10 results. This can be adjusted based on the query requirements.
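The fallback behavior can be pictured with a small sketch; this is an illustration of the defaulting rule only, not the tool's internal code:

```typescript
// Illustrative only: fall back to the default of 10 results when no
// explicit topK is supplied (e.g. via input or request context).
function resolveTopK(requested?: number): number {
  const DEFAULT_TOP_K = 10
  if (requested === undefined || requested <= 0) return DEFAULT_TOP_K
  return Math.floor(requested)
}
```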
## Example with Filters
```typescript
const queryTool = createVectorQueryTool({
  vectorStoreName: 'pinecone',
  indexName: 'docs',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
  enableFilter: true,
})
```
With filtering enabled, the tool processes queries to construct metadata filters that combine with semantic search. The process works as follows:
- A user makes a query with specific filter requirements, like "Find content where the 'version' field is greater than 2.0"
- The agent analyzes the query and constructs the appropriate filter:

```json
{
  "version": { "$gt": 2.0 }
}
```
This agent-driven approach:
- Processes natural language queries into filter specifications
- Implements vector store-specific filter syntax
- Translates query terms to filter operators
For detailed filter syntax and store-specific capabilities, see the Metadata Filters documentation.
For an example of how agent-driven filtering works, see the Agent-Driven Metadata Filtering example.
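To make the semantics of such a filter concrete, here is a minimal, hypothetical evaluator for the `$gt` operator. Real filter evaluation happens inside the vector store, not in client code, so this is purely illustrative:

```typescript
// A narrow, illustrative filter type covering only the $gt operator
// from the example above.
type GtFilter = Record<string, { $gt?: number }>

// Returns true when the metadata satisfies every $gt condition.
function matchesFilter(metadata: Record<string, number>, filter: GtFilter): boolean {
  return Object.entries(filter).every(([field, cond]) => {
    if (cond.$gt !== undefined && !(metadata[field] > cond.$gt)) return false
    return true
  })
}
```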
## Example with Reranking
```typescript
const queryTool = createVectorQueryTool({
  vectorStoreName: 'milvus',
  indexName: 'documentation',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
  reranker: {
    model: 'openai/gpt-5.1',
    options: {
      weights: {
        semantic: 0.5, // Semantic relevance weight
        vector: 0.3, // Vector similarity weight
        position: 0.2, // Original position weight
      },
      topK: 5,
    },
  },
})
```
Reranking improves result quality by combining:
- Semantic relevance: Using LLM-based scoring of text similarity
- Vector similarity: Original vector distance scores
- Position bias: Consideration of original result ordering
- Query analysis: Adjustments based on query characteristics
The reranker processes the initial vector search results and returns a reordered list optimized for relevance.
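Conceptually, the weighted combination can be sketched as below, using the `semantic`/`vector`/`position` weights from the example above. This is an assumption about how such scores might be blended, not the library's actual scoring code:

```typescript
// Per-result component scores, each assumed normalized to 0..1.
interface RerankInputs {
  semantic: number // LLM-based relevance score
  vector: number // original vector similarity
  position: number // position-derived score (earlier result = higher)
}

// Illustrative weighted sum matching the weights in the example config.
function combinedScore(
  s: RerankInputs,
  weights = { semantic: 0.5, vector: 0.3, position: 0.2 },
): number {
  return (
    s.semantic * weights.semantic +
    s.vector * weights.vector +
    s.position * weights.position
  )
}
```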
## Example with Custom Description
```typescript
const queryTool = createVectorQueryTool({
  vectorStoreName: 'pinecone',
  indexName: 'docs',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
  description:
    'Search through document archives to find relevant information for answering questions about company policies and procedures',
})
```
This example shows how to customize the tool description for a specific use case while maintaining its core purpose of information retrieval.
## Database-Specific Configuration Examples
The databaseConfig parameter allows you to leverage unique features and optimizations specific to each vector database. These configurations are automatically applied during query execution.
### Pinecone Configuration
```typescript
const pineconeQueryTool = createVectorQueryTool({
  vectorStoreName: 'pinecone',
  indexName: 'docs',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
  databaseConfig: {
    pinecone: {
      namespace: 'production', // Organize vectors by environment
      // Enable hybrid search
      sparseVector: {
        indices: [0, 1, 2, 3],
        values: [0.1, 0.2, 0.15, 0.05],
      },
    },
  },
})
```
Pinecone Features:
- Namespace: Isolate different data sets within the same index
- Sparse Vector: Combine dense and sparse embeddings for improved search quality
- Use Cases: Multi-tenant applications, hybrid semantic search
### pgVector Configuration
```typescript
const pgVectorQueryTool = createVectorQueryTool({
  vectorStoreName: 'postgres',
  indexName: 'embeddings',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
  databaseConfig: {
    pgvector: {
      minScore: 0.7, // Only return results above 70% similarity
      ef: 200, // Higher value = better accuracy, slower search
      probes: 10, // For IVFFlat: more probes = better recall
    },
  },
})
```
pgVector Features:
- minScore: Filter out low-quality matches
- ef (HNSW): Control accuracy vs speed for HNSW indexes
- probes (IVFFlat): Control recall vs speed for IVFFlat indexes
- Use Cases: Performance tuning, quality filtering
### Chroma Configuration
```typescript
const chromaQueryTool = createVectorQueryTool({
  vectorStoreName: 'chroma',
  indexName: 'documents',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
  databaseConfig: {
    chroma: {
      // Metadata filtering
      where: {
        category: 'technical',
        status: 'published',
      },
      // Document content filtering
      whereDocument: {
        $contains: 'API',
      },
    },
  },
})
```
Chroma Features:
- where: Filter by metadata fields
- whereDocument: Filter by document content
- Use Cases: Advanced filtering, content-based search
### Multiple Database Configurations
```typescript
// Configure for multiple databases (useful for dynamic stores)
const multiDbQueryTool = createVectorQueryTool({
  vectorStoreName: 'dynamic-store', // Will be set at runtime
  indexName: 'docs',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
  databaseConfig: {
    pinecone: {
      namespace: 'default',
    },
    pgvector: {
      minScore: 0.8,
      ef: 150,
    },
    chroma: {
      where: { type: 'documentation' },
    },
  },
})
```
Multi-Config Benefits:
- Support multiple vector stores with one tool
- Database-specific optimizations are automatically applied
- Flexible deployment scenarios
## Runtime Configuration Override
You can override database configurations at runtime to adapt to different scenarios:
```typescript
import { RequestContext } from '@mastra/core/request-context'

const queryTool = createVectorQueryTool({
  vectorStoreName: 'pinecone',
  indexName: 'docs',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
  databaseConfig: {
    pinecone: {
      namespace: 'development',
    },
  },
})

// Override at runtime
const requestContext = new RequestContext()
requestContext.set('databaseConfig', {
  pinecone: {
    namespace: 'production', // Switch to production namespace
  },
})

const response = await agent.generate('Find information about deployment', {
  requestContext,
})
```
This approach allows you to:
- Switch between environments (dev/staging/prod)
- Adjust performance parameters based on load
- Apply different filtering strategies per request
## Example: Using Request Context
```typescript
const queryTool = createVectorQueryTool({
  vectorStoreName: 'pinecone',
  indexName: 'docs',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
})
```
When using request context, provide the required parameters at execution time:
```typescript
import { RequestContext } from '@mastra/core/request-context'

const requestContext = new RequestContext<{
  vectorStoreName: string
  indexName: string
  topK: number
  filter: VectorFilter
  databaseConfig: DatabaseConfig
}>()

requestContext.set('vectorStoreName', 'my-store')
requestContext.set('indexName', 'my-index')
requestContext.set('topK', 5)
requestContext.set('filter', { category: 'docs' })
requestContext.set('databaseConfig', {
  pinecone: { namespace: 'runtime-namespace' },
})

const response = await agent.generate('Find documentation from the knowledge base.', {
  requestContext,
})
```

Note that `model` cannot be overridden this way; as described under Parameters, it can only be set at creation time.
For more information, see the Request Context documentation.
## Usage Without a Mastra Server
The tool can be used by itself to retrieve documents matching a query:
```typescript
import { RequestContext } from '@mastra/core/request-context'
import { ModelRouterEmbeddingModel } from '@mastra/core/llm'
import { createVectorQueryTool } from '@mastra/rag'
import { PgVector } from '@mastra/pg'

const pgVector = new PgVector({
  id: 'pg-vector',
  connectionString: process.env.POSTGRES_CONNECTION_STRING!,
})

const vectorQueryTool = createVectorQueryTool({
  vectorStoreName: 'pgVector', // optional since we're passing in a store
  vectorStore: pgVector,
  indexName: 'embeddings',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
})

const requestContext = new RequestContext()
const queryResult = await vectorQueryTool.execute(
  { queryText: 'foo', topK: 1 },
  { requestContext },
)

console.log(queryResult.sources)
```
## Dynamic Vector Store for Multi-Tenant Applications
For multi-tenant applications where each tenant has isolated data (e.g., separate PostgreSQL schemas), you can pass a resolver function instead of a static vector store instance. The function receives the request context and can return the appropriate vector store for the current tenant:
```typescript
import { createVectorQueryTool, VectorStoreResolver } from '@mastra/rag'
import { ModelRouterEmbeddingModel } from '@mastra/core/llm'
import { RequestContext } from '@mastra/core/request-context'
import { PgVector } from '@mastra/pg'

// Cache for tenant-specific vector stores
const vectorStoreCache = new Map<string, PgVector>()

// Resolver function that returns the correct vector store based on tenant
const vectorStoreResolver: VectorStoreResolver = async ({ requestContext }) => {
  const tenantId = requestContext?.get('tenantId')
  if (!tenantId) {
    throw new Error('tenantId is required in request context')
  }

  // Return cached instance or create a new one
  if (!vectorStoreCache.has(tenantId)) {
    vectorStoreCache.set(
      tenantId,
      new PgVector({
        id: `pg-vector-${tenantId}`,
        connectionString: process.env.POSTGRES_CONNECTION_STRING!,
        schemaName: `tenant_${tenantId}`, // Each tenant has their own schema
      }),
    )
  }

  return vectorStoreCache.get(tenantId)!
}

const vectorQueryTool = createVectorQueryTool({
  indexName: 'embeddings',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
  vectorStore: vectorStoreResolver, // Dynamic resolution!
})

// Usage with tenant context
const requestContext = new RequestContext()
requestContext.set('tenantId', 'acme-corp')

const result = await vectorQueryTool.execute(
  { queryText: 'company policies', topK: 5 },
  { requestContext },
)
```
This pattern is similar to how Agent.memory supports dynamic configuration and enables:
- Schema isolation: Each tenant's data in separate PostgreSQL schemas
- Database isolation: Route to different database instances per tenant
- Dynamic configuration: Adjust vector store settings based on request context
## Tool Details
The tool is created with:
- ID: `VectorQuery {vectorStoreName} {indexName} Tool`
- Input Schema: Requires `queryText` and `filter` objects
- Output Schema: Returns a `relevantContext` string