createVectorQueryTool()
The createVectorQueryTool() function creates a tool for semantic search over vector stores. It supports filtering, reranking, and database-specific configurations, and it integrates with a variety of vector store backends.
Basic Usage
import { openai } from "@ai-sdk/openai";
import { createVectorQueryTool } from "@mastra/rag";
const queryTool = createVectorQueryTool({
vectorStoreName: "pinecone",
indexName: "docs",
model: openai.embedding("text-embedding-3-small"),
});
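Once created, the tool is typically registered on an agent so the model can call it during generation. A minimal sketch (the agent name, instructions, and chat model below are illustrative, not part of the tool's API):
import { openai } from "@ai-sdk/openai";
import { Agent } from "@mastra/core/agent";
import { createVectorQueryTool } from "@mastra/rag";
const queryTool = createVectorQueryTool({
  vectorStoreName: "pinecone",
  indexName: "docs",
  model: openai.embedding("text-embedding-3-small"),
});
// Register the tool so the agent can invoke it for retrieval
const ragAgent = new Agent({
  name: "rag-agent", // illustrative name
  instructions:
    "Use the provided vector query tool to find relevant information before answering.",
  model: openai("gpt-4o-mini"),
  tools: { queryTool },
});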
Parameters
Parameter Requirements: Most fields can be set at creation as defaults. Some fields can be overridden at runtime via the runtime context or input. If a required field is missing from both creation and runtime, an error will be thrown. Note that model, id, and description can only be set at creation time.
id?: Custom identifier for the tool; can only be set at creation time.
description?: Custom description for the tool; can only be set at creation time.
model: Embedding model used to convert the query text into a vector; can only be set at creation time.
vectorStoreName: Name of the vector store to query; can be set at creation or overridden at runtime.
indexName: Name of the index within the vector store; can be set at creation or overridden at runtime.
enableFilter?: Enables agent-constructed metadata filtering (see Example with Filters below).
includeVectors?: When true, includes the embedding vectors in the results.
includeSources?: When true, includes the full QueryResult objects in the sources output.
reranker?: Reranking configuration; see RerankConfig below.
databaseConfig?: Database-specific query options; see DatabaseConfig below.
DatabaseConfig
The DatabaseConfig type allows you to specify database-specific configurations that are automatically applied to query operations. This enables you to take advantage of unique features and optimizations offered by different vector stores.
pinecone?: Pinecone-specific options:
- namespace?: Namespace to scope queries to within the index.
- sparseVector?: Sparse vector (indices and values) for hybrid dense/sparse search.
pgvector?: PostgreSQL pgvector-specific options:
- minScore?: Minimum similarity score; results below this threshold are dropped.
- ef?: HNSW search parameter; higher values trade speed for accuracy.
- probes?: Number of IVFFlat lists probed per query.
chroma?: Chroma-specific options:
- where?: Metadata filter conditions.
- whereDocument?: Filter conditions applied to document content.
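Taken together, the fields above imply a shape roughly like the following (a sketch for orientation, not the library's exact type definition):
type DatabaseConfig = {
  pinecone?: {
    namespace?: string;
    sparseVector?: { indices: number[]; values: number[] };
  };
  pgvector?: {
    minScore?: number;
    ef?: number;
    probes?: number;
  };
  chroma?: {
    where?: Record<string, any>;
    whereDocument?: Record<string, any>;
  };
};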
RerankConfig
model: Language model used to rerank the initial results.
options?: Reranking options:
- weights?: Relative weights for the semantic, vector, and position scores.
- topK?: Number of results to return after reranking.
Returns
The tool returns an object with:
relevantContext: Combined text from the most relevant document chunks.
sources: Array of full QueryResult objects, one per retrieved chunk.
QueryResult object structure
{
id: string; // Unique chunk/document identifier
metadata: any; // All metadata fields (document ID, etc.)
vector: number[]; // Embedding vector (if available)
score: number; // Similarity score for this retrieval
document: string; // Full chunk/document text (if available)
}
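If you consume the tool output directly, the sources array carries everything needed for citations or debugging. A small illustrative helper (the formatCitations name is ours, not part of the library):
interface QueryResult {
  id: string;
  metadata: any;
  vector: number[];
  score: number;
  document: string;
}
// Produce a one-line citation for each retrieved chunk
function formatCitations(sources: QueryResult[]): string[] {
  return sources.map(
    (s) => `[${s.id}] score=${s.score.toFixed(3)}: ${s.document.slice(0, 80)}`
  );
}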
Default Tool Description
The default description focuses on:
- Finding relevant information in stored knowledge
- Answering user questions
- Retrieving factual content
Result Handling
The tool determines the number of results to return based on the user’s query, with a default of 10 results. For example, a query like “show me the top 5 policies on remote work” leads the agent to request 5 results. The count can also be set explicitly via the topK runtime parameter.
Example with Filters
const queryTool = createVectorQueryTool({
vectorStoreName: "pinecone",
indexName: "docs",
model: openai.embedding("text-embedding-3-small"),
enableFilter: true,
});
With filtering enabled, the tool processes queries to construct metadata filters that combine with semantic search. The process works as follows:
- A user makes a query with specific filter requirements like “Find content where the ‘version’ field is greater than 2.0”
- The agent analyzes the query and constructs the appropriate filters:
{ "version": { "$gt": 2.0 } }
This agent-driven approach:
- Processes natural language queries into filter specifications
- Implements vector store-specific filter syntax
- Translates query terms to filter operators
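For instance, a request such as “find API docs for version 2.0 or later” might be translated into a compound filter like the one below (operator support varies by store; see the links that follow):
{
  "$and": [
    { "version": { "$gte": 2.0 } },
    { "category": "api" }
  ]
}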
For detailed filter syntax and store-specific capabilities, see the Metadata Filters documentation.
For an example of how agent-driven filtering works, see the Agent-Driven Metadata Filtering example.
Example with Reranking
const queryTool = createVectorQueryTool({
vectorStoreName: "milvus",
indexName: "documentation",
model: openai.embedding("text-embedding-3-small"),
reranker: {
model: openai("gpt-4o-mini"),
options: {
weights: {
semantic: 0.5, // Semantic relevance weight
vector: 0.3, // Vector similarity weight
position: 0.2, // Original position weight
},
topK: 5,
},
},
});
Reranking improves result quality by combining:
- Semantic relevance: Using LLM-based scoring of text similarity
- Vector similarity: Original vector distance scores
- Position bias: Consideration of original result ordering
- Query analysis: Adjustments based on query characteristics
The reranker processes the initial vector search results and returns a reordered list optimized for relevance.
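Conceptually, the final ordering comes from a weighted blend of these signals. A simplified illustration of the weighting (not the library's exact scoring code):
// Blend the reranking signals using the configured weights
function blendScore(
  semantic: number, // LLM-based relevance score, normalized to 0-1
  vector: number, // original vector similarity, normalized to 0-1
  position: number, // position-based score (earlier results score higher)
  weights = { semantic: 0.5, vector: 0.3, position: 0.2 },
): number {
  return (
    weights.semantic * semantic +
    weights.vector * vector +
    weights.position * position
  );
}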
Example with Custom Description
const queryTool = createVectorQueryTool({
vectorStoreName: "pinecone",
indexName: "docs",
model: openai.embedding("text-embedding-3-small"),
description:
"Search through document archives to find relevant information for answering questions about company policies and procedures",
});
This example shows how to customize the tool description for a specific use case while maintaining its core purpose of information retrieval.
Database-Specific Configuration Examples
The databaseConfig parameter allows you to leverage unique features and optimizations specific to each vector database. These configurations are automatically applied during query execution.
Pinecone
const pineconeQueryTool = createVectorQueryTool({
vectorStoreName: "pinecone",
indexName: "docs",
model: openai.embedding("text-embedding-3-small"),
databaseConfig: {
pinecone: {
namespace: "production", // Organize vectors by environment
sparseVector: { // Enable hybrid search
indices: [0, 1, 2, 3],
values: [0.1, 0.2, 0.15, 0.05]
}
}
}
});
Pinecone Features:
- Namespace: Isolate different data sets within the same index
- Sparse Vector: Combine dense and sparse embeddings for improved search quality
- Use Cases: Multi-tenant applications, hybrid semantic search
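The other backends follow the same pattern. For example, a pgvector configuration might drop weak matches and tune index search parameters (the store name and parameter values here are illustrative):
const pgVectorQueryTool = createVectorQueryTool({
  vectorStoreName: "postgres",
  indexName: "embeddings",
  model: openai.embedding("text-embedding-3-small"),
  databaseConfig: {
    pgvector: {
      minScore: 0.7, // Drop results scoring below 0.7
      ef: 200, // HNSW: higher = more accurate, slower
      probes: 10, // IVFFlat: number of lists probed per query
    },
  },
});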
Runtime Configuration Override
You can override database configurations at runtime to adapt to different scenarios:
import { RuntimeContext } from '@mastra/core/runtime-context';
const queryTool = createVectorQueryTool({
vectorStoreName: "pinecone",
indexName: "docs",
model: openai.embedding("text-embedding-3-small"),
databaseConfig: {
pinecone: {
namespace: "development"
}
}
});
// Override at runtime
const runtimeContext = new RuntimeContext();
runtimeContext.set('databaseConfig', {
pinecone: {
namespace: 'production' // Switch to production namespace
}
});
const response = await agent.generate(
"Find information about deployment",
{ runtimeContext }
);
This approach allows you to:
- Switch between environments (dev/staging/prod)
- Adjust performance parameters based on load
- Apply different filtering strategies per request
Example: Using Runtime Context
const queryTool = createVectorQueryTool({
vectorStoreName: "pinecone",
indexName: "docs",
model: openai.embedding("text-embedding-3-small"),
});
When using runtime context, provide or override the required parameters at execution time:
const runtimeContext = new RuntimeContext<{
vectorStoreName: string;
indexName: string;
topK: number;
filter: VectorFilter;
databaseConfig: DatabaseConfig;
}>();
runtimeContext.set("vectorStoreName", "my-store");
runtimeContext.set("indexName", "my-index");
runtimeContext.set("topK", 5);
runtimeContext.set("filter", { category: "docs" });
runtimeContext.set("databaseConfig", {
pinecone: { namespace: "runtime-namespace" }
});
const response = await agent.generate(
"Find documentation from the knowledge base.",
{
runtimeContext,
},
);
For more information on runtime context, see the Runtime Context documentation.
Tool Details
The tool is created with:
- ID: VectorQuery {vectorStoreName} {indexName} Tool
- Input Schema: Requires a queryText string and a filter object
- Output Schema: Returns a relevantContext string and a sources array
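In terms of those schemas, the tool's input and output look roughly like this (a sketch of the shapes, not the library's exact zod schemas):
// Input: what the agent supplies when invoking the tool
type VectorQueryInput = {
  queryText: string; // Natural-language query to embed and search
  filter: Record<string, any>; // Metadata filter (used when enableFilter is true)
};
// Output: what the tool returns to the agent
type VectorQueryOutput = {
  relevantContext: string; // Combined text from the most relevant chunks
  sources: QueryResult[]; // Full retrieval details for each result
};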