
DuckDBVector Store

The DuckDB storage implementation provides embedded vector search on top of DuckDB, an in-process analytical database. It uses DuckDB's VSS extension for vector similarity search with HNSW indexing, so it requires no external server.

It is part of the @mastra/duckdb package and supports metadata filtering on similarity queries.

Installation

npm install @mastra/duckdb@beta

Usage

import { DuckDBVector } from "@mastra/duckdb";

// Create a new vector store instance
const store = new DuckDBVector({
  id: "duckdb-vector",
  path: ":memory:", // or './vectors.duckdb' for file persistence
});

// Create an index
await store.createIndex({
  indexName: "myCollection",
  dimension: 1536,
  metric: "cosine",
});

// Add vectors with metadata
const vectors = [[0.1, 0.2, ...], [0.3, 0.4, ...]];
const metadata = [
  { text: "first document", category: "A" },
  { text: "second document", category: "B" },
];
await store.upsert({
  indexName: "myCollection",
  vectors,
  metadata,
});

// Query similar vectors
const queryVector = [0.1, 0.2, ...];
const results = await store.query({
  indexName: "myCollection",
  queryVector,
  topK: 10,
  filter: { category: "A" },
});

// Clean up
await store.close();

Constructor Options

id:

string
Unique identifier for the vector store instance

path?:

string
= ':memory:'
Database file path. Use ':memory:' for in-memory database, or a file path like './vectors.duckdb' for persistence.

dimensions?:

number
= 1536
Default dimension for vector embeddings

metric?:

'cosine' | 'euclidean' | 'dotproduct'
= 'cosine'
Default distance metric for similarity search

Methods

createIndex()

Creates a new vector collection with an optional HNSW index for fast approximate nearest neighbor search.

indexName:

string
Name of the index to create

dimension:

number
Vector dimension size (must match your embedding model)

metric?:

'cosine' | 'euclidean' | 'dotproduct'
= 'cosine'
Distance metric for similarity search

upsert()

Adds or updates vectors and their metadata in the index.

indexName:

string
Name of the index to insert into

vectors:

number[][]
Array of embedding vectors

metadata?:

Record<string, any>[]
Metadata for each vector

ids?:

string[]
Optional vector IDs (auto-generated UUIDs if not provided)
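
Because every vector must match the dimension of the target index, it can help to check a batch before calling upsert(). The following is a minimal sketch: `validateBatch` and `UpsertBatch` are hypothetical names, not part of @mastra/duckdb, and the rule that `metadata` and `ids` must have one entry per vector is an assumption based on the parameter descriptions above.

```typescript
// Hypothetical shape mirroring the upsert() parameters documented above.
interface UpsertBatch {
  vectors: number[][];
  metadata?: Record<string, unknown>[];
  ids?: string[];
}

// Hypothetical pre-flight check; throws on the first problem it finds.
function validateBatch(batch: UpsertBatch, dimension: number): void {
  batch.vectors.forEach((vector, i) => {
    if (vector.length !== dimension) {
      throw new Error(
        `Vector ${i} has ${vector.length} dimensions, expected ${dimension}`,
      );
    }
    if (!vector.every((value) => Number.isFinite(value))) {
      throw new Error(`Vector ${i} contains a non-finite value`);
    }
  });

  // Assumption: metadata and ids, when given, align one-to-one with vectors.
  if (batch.metadata && batch.metadata.length !== batch.vectors.length) {
    throw new Error("metadata length must match vectors length");
  }
  if (batch.ids && batch.ids.length !== batch.vectors.length) {
    throw new Error("ids length must match vectors length");
  }
}
```

Calling `validateBatch({ vectors, metadata }, 1536)` before `store.upsert(...)` surfaces a mismatch with a message that names the offending vector.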

query()

Searches for similar vectors with optional metadata filtering.

indexName:

string
Name of the index to search in

queryVector:

number[]
Query vector to find similar vectors for

topK?:

number
= 10
Number of results to return

filter?:

Filter
Metadata filters using MongoDB-like query syntax

includeVector?:

boolean
= false
Whether to include vector data in results

describeIndex()

Gets information about an index.

indexName:

string
Name of the index to describe

Returns:

interface IndexStats {
  dimension: number;
  count: number;
  metric: "cosine" | "euclidean" | "dotproduct";
}

deleteIndex()

Deletes an index and all its data.

indexName:

string
Name of the index to delete

listIndexes()

Lists all vector indexes in the database.

Returns: Promise<string[]>
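
One common use of listIndexes() is to create an index only when it does not exist yet. The sketch below is one possible approach: `ensureIndex` and the `IndexAdmin` interface are hypothetical, `IndexAdmin` only mirrors the two method shapes documented on this page, and the `Promise<void>` return type of createIndex() is an assumption.

```typescript
type Metric = "cosine" | "euclidean" | "dotproduct";

// Structural subset of the store, based on the method signatures on this page.
// Assumption: createIndex() resolves without a value.
interface IndexAdmin {
  listIndexes(): Promise<string[]>;
  createIndex(args: {
    indexName: string;
    dimension: number;
    metric?: Metric;
  }): Promise<void>;
}

// Hypothetical helper: creates the index if it is missing.
// Resolves to true when an index was created, false when it already existed.
async function ensureIndex(
  store: IndexAdmin,
  indexName: string,
  dimension: number,
  metric: Metric = "cosine",
): Promise<boolean> {
  const existing = await store.listIndexes();
  if (existing.includes(indexName)) {
    return false;
  }
  await store.createIndex({ indexName, dimension, metric });
  return true;
}
```

Because the helper depends only on a structural interface, it can be exercised in tests with a small in-memory fake instead of a real database.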

updateVector()

Updates a vector selected by ID, or the vector(s) matching a metadata filter. Provide exactly one of id or filter.

indexName:

string
Name of the index containing the vector

id?:

string
ID of the vector entry to update (mutually exclusive with filter)

filter?:

Record<string, any>
Metadata filter to identify vector(s) to update (mutually exclusive with id)

update:

object
Update data containing vector and/or metadata

update.vector?:

number[]
New vector data to update

update.metadata?:

Record<string, any>
New metadata to update

deleteVector()

Deletes a specific vector entry from an index by its ID.

indexName:

string
Name of the index containing the vector

id:

string
ID of the vector entry to delete

deleteVectors()

Deletes multiple vectors by IDs or by metadata filter. Provide exactly one of ids or filter.

indexName:

string
Name of the index containing the vectors to delete

ids?:

string[]
Array of vector IDs to delete (mutually exclusive with filter)

filter?:

Record<string, any>
Metadata filter to identify vectors to delete (mutually exclusive with ids)
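
The mutual-exclusivity rule can also be checked on the caller's side before the request reaches the store. This is a rough sketch only: `resolveDeleteTarget` and `DeleteVectorsArgs` are hypothetical names, and the error messages are illustrative rather than the ones the store emits.

```typescript
// Hypothetical argument shape mirroring the deleteVectors() parameters above.
interface DeleteVectorsArgs {
  indexName: string;
  ids?: string[];
  filter?: Record<string, unknown>;
}

// Returns which selector is in use, or throws for an invalid combination.
function resolveDeleteTarget(args: DeleteVectorsArgs): "ids" | "filter" {
  const hasIds = args.ids !== undefined;
  const hasFilter = args.filter !== undefined;

  if (hasIds && hasFilter) {
    throw new Error("Provide either ids or filter, not both");
  }
  if (!hasIds && !hasFilter) {
    throw new Error("Provide either ids or filter");
  }

  if (args.ids !== undefined) {
    if (args.ids.length === 0) {
      throw new Error("ids must not be empty");
    }
    return "ids";
  }

  // Only the filter branch remains; reject an empty filter object.
  if (Object.keys(args.filter ?? {}).length === 0) {
    throw new Error("filter must not be empty");
  }
  return "filter";
}
```

The same pattern applies to updateVector(), with `id` in place of `ids`.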

close()

Closes the database connection and releases resources.

await store.close();
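
To make sure close() runs even when an operation throws, the call can be placed in a finally block. The wrapper below is a minimal sketch: `withStore` and `Closable` are hypothetical names, and the only thing assumed about the store is the close() method documented above.

```typescript
// Anything with an async close() method, such as the store on this page.
interface Closable {
  close(): Promise<void>;
}

// Hypothetical helper: runs `work` and always closes the store afterwards,
// whether `work` resolves or rejects.
async function withStore<S extends Closable, T>(
  store: S,
  work: (store: S) => Promise<T>,
): Promise<T> {
  try {
    return await work(store);
  } finally {
    await store.close();
  }
}
```

A call such as `await withStore(store, (s) => s.query({ indexName, queryVector }))` then releases the connection on both the success and the failure path.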

Response Types

Query results are returned in this format:

interface QueryResult {
  id: string;
  score: number;
  metadata: Record<string, any>;
  vector?: number[]; // Only included if includeVector is true
}
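
As an illustration of consuming this shape, the sketch below drops low-scoring matches and orders the rest from best to worst. `keepConfident` is a hypothetical helper, and it assumes that a higher score means a closer match, which fits the cosine and dotproduct metrics but not raw euclidean distances.

```typescript
// Same shape as the QueryResult interface shown above.
interface QueryResult {
  id: string;
  score: number;
  metadata: Record<string, any>;
  vector?: number[];
}

// Hypothetical helper: keep results at or above `minScore`, best first.
// Assumption: higher score = more similar (cosine or dotproduct).
function keepConfident(
  results: QueryResult[],
  minScore: number,
): QueryResult[] {
  return results
    .filter((result) => result.score >= minScore)
    .sort((a, b) => b.score - a.score);
}
```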

Filter Operators

DuckDB vector store supports MongoDB-like filter operators:

| Category | Operators |
| --- | --- |
| Comparison | $eq, $ne, $gt, $gte, $lt, $lte |
| Logical | $and, $or, $not, $nor |
| Array | $in, $nin |
| Element | $exists |
| Text | $contains |

Filter Examples

// Logical and comparison operators
const results = await store.query({
  indexName: "docs",
  queryVector: [...],
  filter: {
    $and: [
      { category: "electronics" },
      { price: { $gte: 100, $lte: 500 } },
    ],
  },
});

// Nested field access
const nestedResults = await store.query({
  indexName: "docs",
  queryVector: [...],
  filter: { "user.profile.tier": "premium" },
});
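
To make the filter semantics concrete, here is a small in-memory matcher that mirrors a subset of the operators in plain TypeScript. It is an illustrative sketch only and not how the store evaluates filters: `matchesFilter`, `getPath`, and `compare` are hypothetical names, ordering operators are limited to numbers, and `$not`, `$nor`, and `$contains` are left out.

```typescript
type Metadata = Record<string, unknown>;
type Filter = Record<string, unknown>;

// Resolve a dotted path such as "user.profile.tier" against nested metadata.
function getPath(doc: unknown, path: string): unknown {
  let current: unknown = doc;
  for (const key of path.split(".")) {
    if (typeof current !== "object" || current === null) return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// True for objects such as { $gte: 100, $lte: 500 } whose keys are all operators.
function isOperatorObject(value: unknown): value is Record<string, unknown> {
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return false;
  }
  const keys = Object.keys(value);
  return keys.length > 0 && keys.every((key) => key.startsWith("$"));
}

function compare(actual: unknown, op: string, expected: unknown): boolean {
  // Ordering operators are only defined for numbers in this sketch;
  // any other operand type yields NaN, which fails every comparison.
  const diff =
    typeof actual === "number" && typeof expected === "number"
      ? actual - expected
      : NaN;
  switch (op) {
    case "$eq": return actual === expected;
    case "$ne": return actual !== expected;
    case "$gt": return diff > 0;
    case "$gte": return diff >= 0;
    case "$lt": return diff < 0;
    case "$lte": return diff <= 0;
    case "$in": return Array.isArray(expected) && expected.includes(actual);
    case "$nin": return Array.isArray(expected) && !expected.includes(actual);
    case "$exists": return (actual !== undefined) === Boolean(expected);
    default: throw new Error(`Operator not covered by this sketch: ${op}`);
  }
}

function matchesFilter(doc: Metadata, filter: Filter): boolean {
  // Every top-level entry must hold (implicit AND).
  return Object.entries(filter).every(([key, condition]) => {
    if (key === "$and") {
      return (
        Array.isArray(condition) &&
        condition.every((sub) => matchesFilter(doc, sub as Filter))
      );
    }
    if (key === "$or") {
      return (
        Array.isArray(condition) &&
        condition.some((sub) => matchesFilter(doc, sub as Filter))
      );
    }
    const actual = getPath(doc, key);
    if (isOperatorObject(condition)) {
      return Object.entries(condition).every(([op, expected]) =>
        compare(actual, op, expected),
      );
    }
    return actual === condition; // a bare value means equality
  });
}
```

With metadata `{ category: "electronics", price: 250 }`, the `$and` filter from the first example above matches, because both the equality check and the price range hold.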

Distance Metrics

| Metric | Description | Score Interpretation | Best For |
| --- | --- | --- | --- |
| cosine | Cosine similarity | 0-1 (1 = most similar) | Text embeddings, normalized vectors |
| euclidean | L2 distance | 0-∞ (0 = most similar) | Image embeddings, spatial data |
| dotproduct | Inner product | Higher = more similar | When vector magnitude matters |
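
For reference, the three metrics correspond to the textbook formulas sketched below in plain TypeScript. The function names are illustrative, and this is not the store's implementation. Raw cosine similarity ranges from -1 to 1, and how the store maps raw values to the scores listed above is not shown here.

```typescript
// Inner product: sum of element-wise products.
function dotProduct(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same dimension");
  }
  return a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
}

// Cosine similarity: inner product divided by the product of the L2 norms.
function cosineSimilarity(a: number[], b: number[]): number {
  const norms = Math.sqrt(dotProduct(a, a)) * Math.sqrt(dotProduct(b, b));
  if (norms === 0) {
    throw new Error("Cosine similarity is undefined for a zero vector");
  }
  return dotProduct(a, b) / norms;
}

// Euclidean (L2) distance: square root of the summed squared differences.
function euclideanDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same dimension");
  }
  return Math.sqrt(a.reduce((sum, x, i) => sum + (x - (b[i] ?? 0)) ** 2, 0));
}
```

For unit-length vectors, cosine similarity and the inner product give the same ranking, which is one reason cosine is a common choice for normalized text embeddings.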

Error Handling

The store throws specific errors for different failure cases:

try {
  await store.query({
    indexName: "my-collection",
    queryVector: queryVector,
  });
} catch (error) {
  // A caught value is not guaranteed to be an Error, so narrow it before reading .message
  const message = error instanceof Error ? error.message : String(error);
  if (message.includes("not found")) {
    console.error("The specified index does not exist");
  } else if (message.includes("Invalid identifier")) {
    console.error("Index name contains invalid characters");
  } else {
    console.error("Vector store error:", message);
  }
}

Common error cases include:

  • Invalid index name format
  • Index/table not found
  • Dimension mismatch between query vector and index
  • Empty filter or ids array in delete/update operations
  • Mutual exclusivity violations (providing both id and filter)
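
If several call sites need the same branching, the message checks can be collected in one place. The sketch below is an assumption-laden example: `classifyVectorStoreError` and the `VectorStoreErrorKind` labels are hypothetical, and the matched substrings are simply the ones used in the try/catch example above, so they may not cover every message the store produces.

```typescript
// Hypothetical labels for the cases handled in the example above.
type VectorStoreErrorKind = "index-not-found" | "invalid-identifier" | "unknown";

// Hypothetical helper: maps a caught value to one of the labels
// by inspecting its message text.
function classifyVectorStoreError(error: unknown): VectorStoreErrorKind {
  const message = error instanceof Error ? error.message : String(error);
  if (message.includes("not found")) return "index-not-found";
  if (message.includes("Invalid identifier")) return "invalid-identifier";
  return "unknown";
}
```

Matching on message text is brittle, so a check like this is best kept in a single helper where it can be updated if the wording changes.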

Use Cases

Build offline-capable AI applications with semantic search that runs entirely in-process:

const store = new DuckDBVector({
  id: "offline-search",
  path: "./search.duckdb",
});

Local RAG Pipelines

Process sensitive documents locally without sending data to cloud vector databases:

const store = new DuckDBVector({
  id: "private-rag",
  path: "./confidential.duckdb",
  dimensions: 1536,
});

Development and Testing

Rapidly prototype vector search features with zero infrastructure:

const store = new DuckDBVector({
  id: "dev-store",
  path: ":memory:", // Fast in-memory for tests
});