createVectorQueryTool()
The createVectorQueryTool() function creates a tool for semantic search over vector stores. It supports filtering, reranking, and database-specific configurations, and it integrates with a range of vector store backends.
Basic Usage
```typescript
import { openai } from "@ai-sdk/openai";
import { createVectorQueryTool } from "@mastra/rag";

const queryTool = createVectorQueryTool({
  vectorStoreName: "pinecone",
  indexName: "docs",
  model: openai.embedding("text-embedding-3-small"),
});
```
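Once created, the tool can be registered on an agent like any other Mastra tool. A minimal sketch (the agent's name, instructions, and model are placeholders, not part of this API):

```typescript
import { Agent } from "@mastra/core/agent";

// Hypothetical agent setup; the only requirement for using the tool
// is passing queryTool in the agent's `tools` map.
const ragAgent = new Agent({
  name: "rag-agent",
  instructions: "Use the vector query tool to answer questions from stored docs.",
  model: openai("gpt-4o-mini"),
  tools: { queryTool },
});
```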
Parameters
Parameter Requirements: Most fields can be set at creation as defaults.
Some fields can be overridden at runtime via the runtime context or input. If
a required field is missing from both creation and runtime, an error will be
thrown. Note that model, id, and description can only be set at creation
time.
id?: Custom tool ID. Defaults to VectorQuery {vectorStoreName} {indexName} Tool (see Tool Details below). Creation-time only.
description?: Custom tool description shown to the agent. Creation-time only.
model: Embedding model used to embed the query text. Creation-time only.
vectorStoreName: Name of the vector store to query. Can be set at creation or via runtime context; optional when a vectorStore instance is provided directly.
vectorStore?: Vector store instance to query directly, without going through a Mastra server (see Usage Without a Mastra Server below).
indexName: Name of the index within the vector store. Can be set at creation or via runtime context.
enableFilter?: Enables agent-constructed metadata filters (see Example with Filters below).
includeVectors?: Include each result's embedding vector in the output.
includeSources?: Include the full QueryResult objects in the output's sources field.
reranker?: Reranking configuration; see RerankConfig below.
databaseConfig?: Database-specific query options; see DatabaseConfig below.
providerOptions?: Provider-specific options forwarded to the embedding model.
DatabaseConfig
The DatabaseConfig type allows you to specify database-specific configurations that are automatically applied to query operations. This enables you to take advantage of unique features and optimizations offered by different vector stores.
pinecone?: Pinecone-specific options.
  namespace?: Namespace to query; isolates data sets within the same index.
  sparseVector?: Sparse vector ({ indices, values }) for hybrid dense/sparse search.
pgvector?: pgVector-specific options.
  minScore?: Minimum similarity score; results below this threshold are filtered out.
  ef?: HNSW search parameter; higher values improve accuracy at the cost of speed.
  probes?: IVFFlat search parameter; more probes improve recall at the cost of speed.
chroma?: Chroma-specific options.
  where?: Metadata filter applied to the query.
  whereDocument?: Filter on document content (for example, { "$contains": "API" }).
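Put together, the shape is roughly the following. This is a sketch reconstructed from the fields documented on this page; the library's actual type may be broader:

```typescript
type DatabaseConfig = {
  pinecone?: {
    namespace?: string;
    sparseVector?: { indices: number[]; values: number[] };
  };
  pgvector?: {
    minScore?: number; // similarity threshold
    ef?: number;       // HNSW accuracy/speed trade-off
    probes?: number;   // IVFFlat recall/speed trade-off
  };
  chroma?: {
    where?: Record<string, unknown>;
    whereDocument?: Record<string, unknown>;
  };
};
```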
RerankConfig
model: Language model used to score and reorder the initial results.
options?: Reranking options.
  weights?: Relative weights for semantic relevance, vector similarity, and original position (see the reranking example below).
  topK?: Number of results to return after reranking.
Returns
The tool returns an object with:
relevantContext: Combined text from the most relevant retrieved chunks.
sources: Array of full QueryResult objects with the structure below (controlled by the includeSources option).
QueryResult object structure
```typescript
{
  id: string;        // Unique chunk/document identifier
  metadata: any;     // All metadata fields (document ID, etc.)
  vector: number[];  // Embedding vector (if available)
  score: number;     // Similarity score for this retrieval
  document: string;  // Full chunk/document text (if available)
}
```
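A minimal sketch of consuming these fields, assuming a queryResult returned by the tool as in the direct-execution example at the end of this page:

```typescript
for (const source of queryResult.sources) {
  // vector and document may be absent depending on tool configuration
  console.log(source.id, source.score, source.document?.slice(0, 80));
}
```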
Default Tool Description
The default description focuses on:
- Finding relevant information in stored knowledge
- Answering user questions
- Retrieving factual content
Result Handling
The tool determines how many results to return based on the user's query, defaulting to 10 results. This can be adjusted to fit the query's requirements, as in the sketch below.
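For example, you could pin the count when executing the tool directly. This sketch mirrors the direct-execution example at the end of this page; RuntimeContext comes from @mastra/core/runtime-context:

```typescript
const result = await queryTool.execute({
  // topK here overrides the default of 10 results
  context: { queryText: "deployment checklist", topK: 3 },
  runtimeContext: new RuntimeContext(),
});
```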
Example with Filters
```typescript
const queryTool = createVectorQueryTool({
  vectorStoreName: "pinecone",
  indexName: "docs",
  model: openai.embedding("text-embedding-3-small"),
  enableFilter: true,
});
```
With filtering enabled, the tool processes queries to construct metadata filters that combine with semantic search. The process works as follows:
- A user makes a query with specific filter requirements like "Find content where the 'version' field is greater than 2.0"
- The agent analyzes the query and constructs the appropriate filters:
```json
{
  "version": { "$gt": 2.0 }
}
```
This agent-driven approach:
- Processes natural language queries into filter specifications
- Implements vector store-specific filter syntax
- Translates query terms to filter operators
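As a further hypothetical illustration, a compound query such as "Find release notes for version 2.0 or later" might be translated into a combined filter (operator support varies by store; the field names here are invented for the example):

```typescript
const filter = {
  $and: [
    { category: { $eq: "release-notes" } },
    { version: { $gte: 2.0 } },
  ],
};
```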
For detailed filter syntax and store-specific capabilities, see the Metadata Filters documentation.
For an example of how agent-driven filtering works, see the Agent-Driven Metadata Filtering example.
Example with Reranking
```typescript
const queryTool = createVectorQueryTool({
  vectorStoreName: "milvus",
  indexName: "documentation",
  model: openai.embedding("text-embedding-3-small"),
  reranker: {
    model: openai("gpt-4o-mini"),
    options: {
      weights: {
        semantic: 0.5, // Semantic relevance weight
        vector: 0.3, // Vector similarity weight
        position: 0.2, // Original position weight
      },
      topK: 5,
    },
  },
});
```
Reranking improves result quality by combining:
- Semantic relevance: Using LLM-based scoring of text similarity
- Vector similarity: Original vector distance scores
- Position bias: Consideration of original result ordering
- Query analysis: Adjustments based on query characteristics
The reranker processes the initial vector search results and returns a reordered list optimized for relevance.
Example with Custom Description
```typescript
const queryTool = createVectorQueryTool({
  vectorStoreName: "pinecone",
  indexName: "docs",
  model: openai.embedding("text-embedding-3-small"),
  description:
    "Search through document archives to find relevant information for answering questions about company policies and procedures",
});
```
This example shows how to customize the tool description for a specific use case while maintaining its core purpose of information retrieval.
Database-Specific Configuration Examples
The databaseConfig parameter allows you to leverage unique features and optimizations specific to each vector database. These configurations are automatically applied during query execution.
Pinecone Configuration
```typescript
const pineconeQueryTool = createVectorQueryTool({
  vectorStoreName: "pinecone",
  indexName: "docs",
  model: openai.embedding("text-embedding-3-small"),
  databaseConfig: {
    pinecone: {
      namespace: "production", // Organize vectors by environment
      sparseVector: { // Enable hybrid search
        indices: [0, 1, 2, 3],
        values: [0.1, 0.2, 0.15, 0.05],
      },
    },
  },
});
```
Pinecone Features:
- Namespace: Isolate different data sets within the same index
- Sparse Vector: Combine dense and sparse embeddings for improved search quality
- Use Cases: Multi-tenant applications, hybrid semantic search
pgVector Configuration
```typescript
const pgVectorQueryTool = createVectorQueryTool({
  vectorStoreName: "postgres",
  indexName: "embeddings",
  model: openai.embedding("text-embedding-3-small"),
  databaseConfig: {
    pgvector: {
      minScore: 0.7, // Only return results above 70% similarity
      ef: 200, // Higher value = better accuracy, slower search
      probes: 10, // For IVFFlat: more probes = better recall
    },
  },
});
```
pgVector Features:
- minScore: Filter out low-quality matches
- ef (HNSW): Control accuracy vs speed for HNSW indexes
- probes (IVFFlat): Control recall vs speed for IVFFlat indexes
- Use Cases: Performance tuning, quality filtering
Chroma Configuration
```typescript
const chromaQueryTool = createVectorQueryTool({
  vectorStoreName: "chroma",
  indexName: "documents",
  model: openai.embedding("text-embedding-3-small"),
  databaseConfig: {
    chroma: {
      where: { // Metadata filtering
        "category": "technical",
        "status": "published",
      },
      whereDocument: { // Document content filtering
        "$contains": "API",
      },
    },
  },
});
```
Chroma Features:
- where: Filter by metadata fields
- whereDocument: Filter by document content
- Use Cases: Advanced filtering, content-based search
Multiple Database Configurations
```typescript
// Configure for multiple databases (useful for dynamic stores)
const multiDbQueryTool = createVectorQueryTool({
  vectorStoreName: "dynamic-store", // Will be set at runtime
  indexName: "docs",
  model: openai.embedding("text-embedding-3-small"),
  databaseConfig: {
    pinecone: {
      namespace: "default",
    },
    pgvector: {
      minScore: 0.8,
      ef: 150,
    },
    chroma: {
      where: { "type": "documentation" },
    },
  },
});
```
Multi-Config Benefits:
- Support multiple vector stores with one tool
- Database-specific optimizations are automatically applied
- Flexible deployment scenarios
Runtime Configuration Override
You can override database configurations at runtime to adapt to different scenarios:
```typescript
import { RuntimeContext } from "@mastra/core/runtime-context";

const queryTool = createVectorQueryTool({
  vectorStoreName: "pinecone",
  indexName: "docs",
  model: openai.embedding("text-embedding-3-small"),
  databaseConfig: {
    pinecone: {
      namespace: "development",
    },
  },
});

// Override at runtime
const runtimeContext = new RuntimeContext();
runtimeContext.set("databaseConfig", {
  pinecone: {
    namespace: "production", // Switch to production namespace
  },
});

const response = await agent.generate("Find information about deployment", {
  runtimeContext,
});
```
This approach allows you to:
- Switch between environments (dev/staging/prod)
- Adjust performance parameters based on load
- Apply different filtering strategies per request
Example: Using Runtime Context
```typescript
const queryTool = createVectorQueryTool({
  vectorStoreName: "pinecone",
  indexName: "docs",
  model: openai.embedding("text-embedding-3-small"),
});
```
When using runtime context, provide the required parameters at execution time:
```typescript
const runtimeContext = new RuntimeContext<{
  vectorStoreName: string;
  indexName: string;
  topK: number;
  filter: VectorFilter;
  databaseConfig: DatabaseConfig;
}>();

runtimeContext.set("vectorStoreName", "my-store");
runtimeContext.set("indexName", "my-index");
runtimeContext.set("topK", 5);
runtimeContext.set("filter", { category: "docs" });
runtimeContext.set("databaseConfig", {
  pinecone: { namespace: "runtime-namespace" },
});

const response = await agent.generate(
  "Find documentation from the knowledge base.",
  {
    runtimeContext,
  },
);
```
For more information on runtime context, see the Runtime Context documentation.
Usage Without a Mastra Server
The tool can be used by itself to retrieve documents matching a query:
```typescript
import { openai } from "@ai-sdk/openai";
import { RuntimeContext } from "@mastra/core/runtime-context";
import { createVectorQueryTool } from "@mastra/rag";
import { PgVector } from "@mastra/pg";

const pgVector = new PgVector({
  connectionString: process.env.POSTGRES_CONNECTION_STRING!,
});

const vectorQueryTool = createVectorQueryTool({
  vectorStoreName: "pgVector", // optional since we're passing in a store
  vectorStore: pgVector,
  indexName: "embeddings",
  model: openai.embedding("text-embedding-3-small"),
});

const runtimeContext = new RuntimeContext();

const queryResult = await vectorQueryTool.execute({
  context: { queryText: "foo", topK: 1 },
  runtimeContext,
});

console.log(queryResult.sources);
```
Tool Details
The tool is created with:
- ID: VectorQuery {vectorStoreName} {indexName} Tool
- Input Schema: Requires queryText and filter objects
- Output Schema: Returns relevantContext string