
createVectorQueryTool()

The createVectorQueryTool() function creates a tool for semantic search over vector stores. It supports filtering, reranking, and database-specific configuration, and integrates with a range of vector store backends.

Basic Usage

import { openai } from "@ai-sdk/openai";
import { createVectorQueryTool } from "@mastra/rag";

const queryTool = createVectorQueryTool({
  vectorStoreName: "pinecone",
  indexName: "docs",
  model: openai.embedding("text-embedding-3-small"),
});
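
The returned tool is typically registered on an agent. A minimal sketch, assuming the Agent class from @mastra/core/agent and the openai provider imported above (the agent name and instructions are illustrative):

import { Agent } from "@mastra/core/agent";

// Give the agent access to the query tool so it can search the knowledge base.
const ragAgent = new Agent({
  name: "rag-agent",
  instructions: "Answer questions using the vector query tool.",
  model: openai("gpt-4o-mini"),
  tools: { queryTool },
});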

Parameters

💡 Parameter Requirements: Most fields can be set at creation as defaults. Some fields can be overridden at runtime via the runtime context or input. If a required field is missing from both creation and runtime, an error will be thrown. Note that model, id, and description can only be set at creation time.

id?: string
Custom ID for the tool. Defaults to 'VectorQuery {vectorStoreName} {indexName} Tool'. (Set at creation only.)

description?: string
Custom description for the tool. Defaults to 'Access the knowledge base to find information needed to answer user questions'. (Set at creation only.)

model: EmbeddingModel
Embedding model to use for vector search. (Set at creation only.)

vectorStoreName: string
Name of the vector store to query. (Can be set at creation or overridden at runtime.)

indexName: string
Name of the index within the vector store. (Can be set at creation or overridden at runtime.)

enableFilter?: boolean = false
Enable filtering of results based on metadata. (Set at creation only, but automatically enabled if a filter is provided in the runtime context.)

includeVectors?: boolean = false
Include the embedding vectors in the results. (Can be set at creation or overridden at runtime.)

includeSources?: boolean = true
Include the full retrieval objects in the results. (Can be set at creation or overridden at runtime.)

reranker?: RerankConfig
Options for reranking results. (Can be set at creation or overridden at runtime.)

databaseConfig?: DatabaseConfig
Database-specific configuration options for optimizing queries. (Can be set at creation or overridden at runtime.)

DatabaseConfig

The DatabaseConfig type allows you to specify database-specific configurations that are automatically applied to query operations. This enables you to take advantage of unique features and optimizations offered by different vector stores.

pinecone?: PineconeConfig
Configuration specific to the Pinecone vector store.

  namespace?: string
  Pinecone namespace for organizing vectors.

  sparseVector?: { indices: number[]; values: number[] }
  Sparse vector for hybrid search.

pgvector?: PgVectorConfig
Configuration specific to PostgreSQL with the pgvector extension.

  minScore?: number
  Minimum similarity score threshold for results.

  ef?: number
  HNSW search parameter; controls the accuracy vs. speed tradeoff.

  probes?: number
  IVFFlat probe parameter; the number of cells to visit during search.

chroma?: ChromaConfig
Configuration specific to the Chroma vector store.

  where?: Record<string, any>
  Metadata filtering conditions.

  whereDocument?: Record<string, any>
  Document content filtering conditions.
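
Taken together, these fields describe a shape along the lines of the following sketch (reconstructed from this reference; the canonical DatabaseConfig type is exported by @mastra/rag):

// Sketch of the DatabaseConfig shape described above, not the canonical definition.
type DatabaseConfig = {
  pinecone?: {
    namespace?: string;
    sparseVector?: { indices: number[]; values: number[] };
  };
  pgvector?: {
    minScore?: number;
    ef?: number;
    probes?: number;
  };
  chroma?: {
    where?: Record<string, any>;
    whereDocument?: Record<string, any>;
  };
};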

RerankConfig

model: MastraLanguageModel
Language model to use for reranking.

options?: RerankerOptions
Options for the reranking process.

  weights?: WeightConfig
  Weights for scoring components (defaults: semantic 0.4, vector 0.4, position 0.2).

  topK?: number
  Number of top results to return.

Returns

The tool returns an object with:

relevantContext: string
Combined text from the most relevant document chunks.

sources: QueryResult[]
Array of full retrieval result objects. Each object contains all the information needed to reference the original document, chunk, and similarity score.

QueryResult object structure

{
  id: string;       // Unique chunk/document identifier
  metadata: any;    // All metadata fields (document ID, etc.)
  vector: number[]; // Embedding vector (if available)
  score: number;    // Similarity score for this retrieval
  document: string; // Full chunk/document text (if available)
}
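
A sketch of consuming these results, where result stands in for the object returned by the tool (a hypothetical variable with the shape described above):

// Log each retrieved chunk with its similarity score.
for (const source of result.sources) {
  const preview = source.document ? source.document.slice(0, 80) : "<no text>";
  console.log(`${source.id} (score ${source.score.toFixed(3)}): ${preview}`);
}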

Default Tool Description

The default description focuses on:

  • Finding relevant information in stored knowledge
  • Answering user questions
  • Retrieving factual content

Result Handling

The tool determines the number of results to return based on the user’s query, with a default of 10 results. This can be adjusted based on the query requirements.
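
As a sketch, the per-request result count can also be pinned via the topK runtime-context override covered later on this page:

import { RuntimeContext } from "@mastra/core/runtime-context";

// Cap this request at three chunks instead of the default 10.
const runtimeContext = new RuntimeContext();
runtimeContext.set("topK", 3);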

Example with Filters

const queryTool = createVectorQueryTool({
  vectorStoreName: "pinecone",
  indexName: "docs",
  model: openai.embedding("text-embedding-3-small"),
  enableFilter: true,
});

With filtering enabled, the tool processes queries to construct metadata filters that combine with semantic search. The process works as follows:

  1. A user makes a query with specific filter requirements like “Find content where the ‘version’ field is greater than 2.0”
  2. The agent analyzes the query and constructs the appropriate filters:
    { "version": { "$gt": 2.0 } }

This agent-driven approach:

  • Processes natural language queries into filter specifications
  • Implements vector store-specific filter syntax
  • Translates query terms to filter operators

For detailed filter syntax and store-specific capabilities, see the Metadata Filters documentation.

For an example of how agent-driven filtering works, see the Agent-Driven Metadata Filtering example.

Example with Reranking

const queryTool = createVectorQueryTool({
  vectorStoreName: "milvus",
  indexName: "documentation",
  model: openai.embedding("text-embedding-3-small"),
  reranker: {
    model: openai("gpt-4o-mini"),
    options: {
      weights: {
        semantic: 0.5, // Semantic relevance weight
        vector: 0.3,   // Vector similarity weight
        position: 0.2, // Original position weight
      },
      topK: 5,
    },
  },
});

Reranking improves result quality by combining:

  • Semantic relevance: Using LLM-based scoring of text similarity
  • Vector similarity: Original vector distance scores
  • Position bias: Consideration of original result ordering
  • Query analysis: Adjustments based on query characteristics

The reranker processes the initial vector search results and returns a reordered list optimized for relevance.

Example with Custom Description

const queryTool = createVectorQueryTool({
  vectorStoreName: "pinecone",
  indexName: "docs",
  model: openai.embedding("text-embedding-3-small"),
  description:
    "Search through document archives to find relevant information for answering questions about company policies and procedures",
});

This example shows how to customize the tool description for a specific use case while maintaining its core purpose of information retrieval.

Database-Specific Configuration Examples

The databaseConfig parameter allows you to leverage unique features and optimizations specific to each vector database. These configurations are automatically applied during query execution.

Pinecone Configuration

const pineconeQueryTool = createVectorQueryTool({
  vectorStoreName: "pinecone",
  indexName: "docs",
  model: openai.embedding("text-embedding-3-small"),
  databaseConfig: {
    pinecone: {
      namespace: "production", // Organize vectors by environment
      sparseVector: {          // Enable hybrid search
        indices: [0, 1, 2, 3],
        values: [0.1, 0.2, 0.15, 0.05],
      },
    },
  },
});

Pinecone Features:

  • Namespace: Isolate different data sets within the same index
  • Sparse Vector: Combine dense and sparse embeddings for improved search quality
  • Use Cases: Multi-tenant applications, hybrid semantic search
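
pgvector Configuration

The PgVectorConfig fields documented above can be applied the same way. A minimal sketch, assuming a store registered as "pgvector" (store, index, and parameter values are illustrative):

const pgVectorQueryTool = createVectorQueryTool({
  vectorStoreName: "pgvector",
  indexName: "docs",
  model: openai.embedding("text-embedding-3-small"),
  databaseConfig: {
    pgvector: {
      minScore: 0.7, // Drop results below this similarity score
      ef: 200,       // HNSW: higher values favor accuracy over speed
      probes: 10,    // IVFFlat: number of cells to visit during search
    },
  },
});

Chroma Configuration

Likewise for ChromaConfig; a sketch assuming a store registered as "chroma". The whereDocument value uses Chroma's $contains operator:

const chromaQueryTool = createVectorQueryTool({
  vectorStoreName: "chroma",
  indexName: "docs",
  model: openai.embedding("text-embedding-3-small"),
  databaseConfig: {
    chroma: {
      where: { category: "api" },                 // Metadata filtering conditions
      whereDocument: { $contains: "deployment" }, // Document content filtering
    },
  },
});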

Runtime Configuration Override

You can override database configurations at runtime to adapt to different scenarios:

import { RuntimeContext } from "@mastra/core/runtime-context";

const queryTool = createVectorQueryTool({
  vectorStoreName: "pinecone",
  indexName: "docs",
  model: openai.embedding("text-embedding-3-small"),
  databaseConfig: {
    pinecone: {
      namespace: "development",
    },
  },
});

// Override at runtime
const runtimeContext = new RuntimeContext();
runtimeContext.set("databaseConfig", {
  pinecone: {
    namespace: "production", // Switch to production namespace
  },
});

const response = await agent.generate(
  "Find information about deployment",
  { runtimeContext },
);

This approach allows you to:

  • Switch between environments (dev/staging/prod)
  • Adjust performance parameters based on load
  • Apply different filtering strategies per request
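
For instance, a sketch of a per-request performance adjustment using the pgvector ef parameter documented above (assumes the tool queries a pgvector-backed store):

import { RuntimeContext } from "@mastra/core/runtime-context";

// Raise HNSW search effort for a single high-accuracy request.
const highAccuracyContext = new RuntimeContext();
highAccuracyContext.set("databaseConfig", {
  pgvector: { ef: 400 }, // Higher ef = more accurate, slower
});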

Example: Using Runtime Context

const queryTool = createVectorQueryTool({
  vectorStoreName: "pinecone",
  indexName: "docs",
  model: openai.embedding("text-embedding-3-small"),
});

When using runtime context, provide required parameters at execution time via the runtime context:

import { RuntimeContext } from "@mastra/core/runtime-context";
// Type-only imports for the generic below; adjust paths to your Mastra version.
import type { VectorFilter } from "@mastra/core/vector/filter";
import type { DatabaseConfig } from "@mastra/rag";

const runtimeContext = new RuntimeContext<{
  vectorStoreName: string;
  indexName: string;
  topK: number;
  filter: VectorFilter;
  databaseConfig: DatabaseConfig;
}>();

runtimeContext.set("vectorStoreName", "my-store");
runtimeContext.set("indexName", "my-index");
runtimeContext.set("topK", 5);
runtimeContext.set("filter", { category: "docs" });
runtimeContext.set("databaseConfig", {
  pinecone: { namespace: "runtime-namespace" },
});

const response = await agent.generate(
  "Find documentation from the knowledge base.",
  { runtimeContext },
);

For more information on runtime context, see the Runtime Context documentation.

Tool Details

The tool is created with:

  • ID: VectorQuery {vectorStoreName} {indexName} Tool
  • Input Schema: Requires queryText and filter objects
  • Output Schema: Returns the relevantContext string (sources are also returned, as described under Returns above)
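
Put together, the tool's contract corresponds roughly to the following zod-style sketch. This is a hypothetical reconstruction from this reference page, not the library's actual definition:

import { z } from "zod";

// Hypothetical: shapes inferred from this page, not from @mastra/rag source.
const inputSchema = z.object({
  queryText: z.string(),     // Natural-language search query
  filter: z.record(z.any()), // Metadata filter object (used when filtering is enabled)
});

const outputSchema = z.object({
  relevantContext: z.string(), // Combined text from the top chunks
  sources: z.array(z.any()),   // QueryResult[] as described under Returns
});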