# createVectorQueryTool()

The `createVectorQueryTool()` function creates a tool for semantic search over vector stores. It supports filtering, reranking, and database-specific configuration, and integrates with a variety of vector store backends.
## Basic Usage
```typescript
import { createVectorQueryTool } from '@mastra/rag'
import { ModelRouterEmbeddingModel } from '@mastra/core/llm'

const queryTool = createVectorQueryTool({
  vectorStoreName: 'pinecone',
  indexName: 'docs',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
})
```
## Parameters
> **Parameter Requirements:** Most fields can be set at creation as defaults. Some fields can be overridden at runtime via the request context or input. If a required field is missing from both creation and runtime, an error will be thrown. Note that `model`, `id`, and `description` can only be set at creation time.
- `id?`: Custom identifier for the tool. Defaults to `VectorQuery {vectorStoreName} {indexName} Tool`. Can only be set at creation time.
- `description?`: Custom description of the tool's purpose. Can only be set at creation time.
- `model`: Embedding model used to embed the query text. Can only be set at creation time.
- `vectorStoreName`: Name of the vector store to query. Optional when a `vectorStore` instance is passed directly.
- `indexName`: Name of the index within the vector store.
- `enableFilter?`: Enables agent-driven metadata filtering based on the user's query.
- `includeVectors?`: Whether to include embedding vectors in the results.
- `includeSources?`: Whether to include the full `sources` array of `QueryResult` objects in the output.
- `reranker?`: Reranking configuration. See RerankConfig below.
- `databaseConfig?`: Database-specific query options. See DatabaseConfig below.
- `providerOptions?`: Provider-specific options passed to the embedding model.
- `vectorStore?`: A vector store instance, or a resolver function that returns one at runtime, used in place of a store registered by name.
### DatabaseConfig
The DatabaseConfig type allows you to specify database-specific configurations that are automatically applied to query operations. This enables you to take advantage of unique features and optimizations offered by different vector stores.
- `pinecone?`: Pinecone-specific options.
  - `namespace?`: Namespace to query, useful for isolating data sets within an index.
  - `sparseVector?`: Sparse vector (`indices` and `values`) for hybrid dense/sparse search.
- `pgvector?`: PostgreSQL/pgvector-specific options.
  - `minScore?`: Minimum similarity score; results below this threshold are filtered out.
  - `ef?`: HNSW search parameter; higher values improve accuracy at the cost of speed.
  - `probes?`: IVFFlat probe count; more probes improve recall at the cost of speed.
- `chroma?`: Chroma-specific options.
  - `where?`: Metadata filter applied to results.
  - `whereDocument?`: Filter applied to document content (e.g. `$contains`).
### RerankConfig
- `model`: Language model used to score and rerank results.
- `options?`: Reranking options.
  - `weights?`: Relative weights for the `semantic`, `vector`, and `position` scoring components.
  - `topK?`: Number of results to return after reranking.
## Returns
The tool returns an object with:
- `relevantContext`: Combined text from the most relevant retrieved chunks.
- `sources`: Array of the full `QueryResult` objects from the vector search.
### QueryResult object structure
```typescript
{
  id: string;       // Unique chunk/document identifier
  metadata: any;    // All metadata fields (document ID, etc.)
  vector: number[]; // Embedding vector (if available)
  score: number;    // Similarity score for this retrieval
  document: string; // Full chunk/document text (if available)
}
```
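For orientation, here is a minimal sketch, not the library's actual implementation, of how a `relevantContext` string could be assembled from `QueryResult` objects like those above (optionality of `vector` and `document` is an assumption for illustration):

```typescript
// Hypothetical shape mirroring the QueryResult structure above.
interface QueryResult {
  id: string
  metadata: any
  vector?: number[] // may be absent unless includeVectors is set
  score: number
  document?: string // may be absent for some stores
}

// Illustrative only: concatenate available chunk texts into one context string.
// The real assembly logic is internal to @mastra/rag and may differ.
function buildRelevantContext(results: QueryResult[]): string {
  return results
    .map((r) => r.document)
    .filter((d): d is string => typeof d === 'string')
    .join('\n\n')
}
```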
## Default Tool Description
The default description focuses on:
- Finding relevant information in stored knowledge
- Answering user questions
- Retrieving factual content
## Result Handling
The tool determines the number of results to return based on the user's query, with a default of 10 results. This can be adjusted based on the query requirements.
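The fallback behavior can be pictured with a small sketch; this is an illustration of the defaulting rule only, not the tool's internal code:

```typescript
// Illustrative only: fall back to the default of 10 results when no
// explicit topK is supplied (e.g. via input or request context).
function resolveTopK(requested?: number): number {
  const DEFAULT_TOP_K = 10
  if (requested === undefined || requested <= 0) return DEFAULT_TOP_K
  return Math.floor(requested)
}
```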
## Example with Filters
```typescript
const queryTool = createVectorQueryTool({
  vectorStoreName: 'pinecone',
  indexName: 'docs',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
  enableFilter: true,
})
```
With filtering enabled, the tool processes queries to construct metadata filters that combine with semantic search. The process works as follows:
- A user makes a query with specific filter requirements, like "Find content where the 'version' field is greater than 2.0"
- The agent analyzes the query and constructs the appropriate filter:

```json
{
  "version": { "$gt": 2.0 }
}
```
This agent-driven approach:
- Processes natural language queries into filter specifications
- Implements vector store-specific filter syntax
- Translates query terms to filter operators
For detailed filter syntax and store-specific capabilities, see the Metadata Filters documentation.
For an example of how agent-driven filtering works, see the Agent-Driven Metadata Filtering example.
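To make the semantics of such a filter concrete, here is a minimal, hypothetical evaluator for the `$gt` operator. Real filter evaluation happens inside the vector store, not in client code, so this is purely illustrative:

```typescript
// A narrow, illustrative filter type covering only the $gt operator
// from the example above.
type GtFilter = Record<string, { $gt?: number }>

// Returns true when the metadata satisfies every $gt condition.
function matchesFilter(metadata: Record<string, number>, filter: GtFilter): boolean {
  return Object.entries(filter).every(([field, cond]) => {
    if (cond.$gt !== undefined && !(metadata[field] > cond.$gt)) return false
    return true
  })
}
```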
## Example with Reranking
```typescript
const queryTool = createVectorQueryTool({
  vectorStoreName: 'milvus',
  indexName: 'documentation',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
  reranker: {
    model: 'openai/gpt-5.1',
    options: {
      weights: {
        semantic: 0.5, // Semantic relevance weight
        vector: 0.3, // Vector similarity weight
        position: 0.2, // Original position weight
      },
      topK: 5,
    },
  },
})
```
Reranking improves result quality by combining:
- Semantic relevance: Using LLM-based scoring of text similarity
- Vector similarity: Original vector distance scores
- Position bias: Consideration of original result ordering
- Query analysis: Adjustments based on query characteristics
The reranker processes the initial vector search results and returns a reordered list optimized for relevance.
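Conceptually, the weighted combination can be sketched as below, using the `semantic`/`vector`/`position` weights from the example above. This is an assumption about how such scores might be blended, not the library's actual scoring code:

```typescript
// Per-result component scores, each assumed normalized to 0..1.
interface RerankInputs {
  semantic: number // LLM-based relevance score
  vector: number // original vector similarity
  position: number // position-derived score (earlier result = higher)
}

// Illustrative weighted sum matching the weights in the example config.
function combinedScore(
  s: RerankInputs,
  weights = { semantic: 0.5, vector: 0.3, position: 0.2 },
): number {
  return (
    s.semantic * weights.semantic +
    s.vector * weights.vector +
    s.position * weights.position
  )
}
```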
## Example with Custom Description
```typescript
const queryTool = createVectorQueryTool({
  vectorStoreName: 'pinecone',
  indexName: 'docs',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
  description:
    'Search through document archives to find relevant information for answering questions about company policies and procedures',
})
```
This example shows how to customize the tool description for a specific use case while maintaining its core purpose of information retrieval.
## Database-Specific Configuration Examples
The databaseConfig parameter allows you to leverage unique features and optimizations specific to each vector database. These configurations are automatically applied during query execution.
### Pinecone Configuration
```typescript
const pineconeQueryTool = createVectorQueryTool({
  vectorStoreName: 'pinecone',
  indexName: 'docs',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
  databaseConfig: {
    pinecone: {
      namespace: 'production', // Organize vectors by environment
      // Enable hybrid search
      sparseVector: {
        indices: [0, 1, 2, 3],
        values: [0.1, 0.2, 0.15, 0.05],
      },
    },
  },
})
```
Pinecone Features:
- Namespace: Isolate different data sets within the same index
- Sparse Vector: Combine dense and sparse embeddings for improved search quality
- Use Cases: Multi-tenant applications, hybrid semantic search
### pgVector Configuration
```typescript
const pgVectorQueryTool = createVectorQueryTool({
  vectorStoreName: 'postgres',
  indexName: 'embeddings',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
  databaseConfig: {
    pgvector: {
      minScore: 0.7, // Only return results above 70% similarity
      ef: 200, // Higher value = better accuracy, slower search
      probes: 10, // For IVFFlat: more probes = better recall
    },
  },
})
```
pgVector Features:
- minScore: Filter out low-quality matches
- ef (HNSW): Control accuracy vs speed for HNSW indexes
- probes (IVFFlat): Control recall vs speed for IVFFlat indexes
- Use Cases: Performance tuning, quality filtering
### Chroma Configuration
```typescript
const chromaQueryTool = createVectorQueryTool({
  vectorStoreName: 'chroma',
  indexName: 'documents',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
  databaseConfig: {
    chroma: {
      // Metadata filtering
      where: {
        category: 'technical',
        status: 'published',
      },
      // Document content filtering
      whereDocument: {
        $contains: 'API',
      },
    },
  },
})
```
Chroma Features:
- where: Filter by metadata fields
- whereDocument: Filter by document content
- Use Cases: Advanced filtering, content-based search
### Multiple Database Configurations
```typescript
// Configure for multiple databases (useful for dynamic stores)
const multiDbQueryTool = createVectorQueryTool({
  vectorStoreName: 'dynamic-store', // Will be set at runtime
  indexName: 'docs',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
  databaseConfig: {
    pinecone: {
      namespace: 'default',
    },
    pgvector: {
      minScore: 0.8,
      ef: 150,
    },
    chroma: {
      where: { type: 'documentation' },
    },
  },
})
```
Multi-Config Benefits:
- Support multiple vector stores with one tool
- Database-specific optimizations are automatically applied
- Flexible deployment scenarios
## Runtime Configuration Override
You can override database configurations at runtime to adapt to different scenarios:
```typescript
import { RequestContext } from '@mastra/core/request-context'

const queryTool = createVectorQueryTool({
  vectorStoreName: 'pinecone',
  indexName: 'docs',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
  databaseConfig: {
    pinecone: {
      namespace: 'development',
    },
  },
})

// Override at runtime
const requestContext = new RequestContext()
requestContext.set('databaseConfig', {
  pinecone: {
    namespace: 'production', // Switch to production namespace
  },
})

const response = await agent.generate('Find information about deployment', {
  requestContext,
})
```
This approach allows you to:
- Switch between environments (dev/staging/prod)
- Adjust performance parameters based on load
- Apply different filtering strategies per request
## Example: Using Request Context
```typescript
const queryTool = createVectorQueryTool({
  vectorStoreName: 'pinecone',
  indexName: 'docs',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
})
```
When using request context, provide the required parameters at execution time:
```typescript
import { RequestContext } from '@mastra/core/request-context'

const requestContext = new RequestContext<{
  vectorStoreName: string
  indexName: string
  topK: number
  filter: VectorFilter
  databaseConfig: DatabaseConfig
}>()

requestContext.set('vectorStoreName', 'my-store')
requestContext.set('indexName', 'my-index')
requestContext.set('topK', 5)
requestContext.set('filter', { category: 'docs' })
requestContext.set('databaseConfig', {
  pinecone: { namespace: 'runtime-namespace' },
})

const response = await agent.generate('Find documentation from the knowledge base.', {
  requestContext,
})
```

Note that `model` cannot be overridden this way; as described under Parameters, it can only be set at creation time.
For more information, see the Request Context documentation.
## Usage Without a Mastra Server
The tool can be used by itself to retrieve documents matching a query:
```typescript
import { RequestContext } from '@mastra/core/request-context'
import { ModelRouterEmbeddingModel } from '@mastra/core/llm'
import { createVectorQueryTool } from '@mastra/rag'
import { PgVector } from '@mastra/pg'

const pgVector = new PgVector({
  id: 'pg-vector',
  connectionString: process.env.POSTGRES_CONNECTION_STRING!,
})

const vectorQueryTool = createVectorQueryTool({
  vectorStoreName: 'pgVector', // optional since we're passing in a store
  vectorStore: pgVector,
  indexName: 'embeddings',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
})

const requestContext = new RequestContext()
const queryResult = await vectorQueryTool.execute(
  { queryText: 'foo', topK: 1 },
  { requestContext },
)

console.log(queryResult.sources)
```
## Dynamic Vector Store for Multi-Tenant Applications
For multi-tenant applications where each tenant has isolated data (e.g., separate PostgreSQL schemas), you can pass a resolver function instead of a static vector store instance. The function receives the request context and can return the appropriate vector store for the current tenant:
```typescript
import { createVectorQueryTool, VectorStoreResolver } from '@mastra/rag'
import { ModelRouterEmbeddingModel } from '@mastra/core/llm'
import { RequestContext } from '@mastra/core/request-context'
import { PgVector } from '@mastra/pg'

// Cache for tenant-specific vector stores
const vectorStoreCache = new Map<string, PgVector>()

// Resolver function that returns the correct vector store based on tenant
const vectorStoreResolver: VectorStoreResolver = async ({ requestContext }) => {
  const tenantId = requestContext?.get('tenantId')
  if (!tenantId) {
    throw new Error('tenantId is required in request context')
  }

  // Return cached instance or create a new one
  if (!vectorStoreCache.has(tenantId)) {
    vectorStoreCache.set(
      tenantId,
      new PgVector({
        id: `pg-vector-${tenantId}`,
        connectionString: process.env.POSTGRES_CONNECTION_STRING!,
        schemaName: `tenant_${tenantId}`, // Each tenant has their own schema
      }),
    )
  }

  return vectorStoreCache.get(tenantId)!
}

const vectorQueryTool = createVectorQueryTool({
  indexName: 'embeddings',
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
  vectorStore: vectorStoreResolver, // Dynamic resolution!
})

// Usage with tenant context
const requestContext = new RequestContext()
requestContext.set('tenantId', 'acme-corp')

const result = await vectorQueryTool.execute(
  { queryText: 'company policies', topK: 5 },
  { requestContext },
)
```
This pattern is similar to how Agent.memory supports dynamic configuration and enables:
- Schema isolation: Each tenant's data in separate PostgreSQL schemas
- Database isolation: Route to different database instances per tenant
- Dynamic configuration: Adjust vector store settings based on request context
## Tool Details
The tool is created with:
- ID: `VectorQuery {vectorStoreName} {indexName} Tool`
- Input Schema: Requires `queryText` and `filter` objects
- Output Schema: Returns a `relevantContext` string