# PG Vector Store

The PgVector class provides vector search using [PostgreSQL](https://www.postgresql.org/) with the [pgvector](https://github.com/pgvector/pgvector) extension. It provides robust vector similarity search capabilities within your existing PostgreSQL database.

## Constructor Options

**connectionString?:** (`string`): PostgreSQL connection URL

**host?:** (`string`): PostgreSQL server host

**port?:** (`number`): PostgreSQL server port

**database?:** (`string`): PostgreSQL database name

**user?:** (`string`): PostgreSQL user

**password?:** (`string`): PostgreSQL password

**ssl?:** (`boolean | ConnectionOptions`): Enable SSL or provide custom SSL configuration

**schemaName?:** (`string`): The name of the schema you want the vector store to use. Will use the default schema if not provided.

**max?:** (`number`): Maximum number of pool connections (default: 20)

**idleTimeoutMillis?:** (`number`): Idle connection timeout in milliseconds (default: 30000)

**pgPoolOptions?:** (`PoolConfig`): Additional pg pool configuration options

## Constructor Examples

### Connection String

```ts
import { PgVector } from "@mastra/pg";

const vectorStore = new PgVector({
  id: "pg-vector",
  connectionString: "postgresql://user:password@localhost:5432/mydb",
});
```

### Host/Port/Database Configuration

```ts
const vectorStore = new PgVector({
  id: "pg-vector",
  host: "localhost",
  port: 5432,
  database: "mydb",
  user: "postgres",
  password: "password",
});
```

### Advanced Configuration

```ts
const vectorStore = new PgVector({
  id: "pg-vector",
  connectionString: "postgresql://user:password@localhost:5432/mydb",
  schemaName: "custom_schema",
  max: 30,
  idleTimeoutMillis: 60000,
  pgPoolOptions: {
    connectionTimeoutMillis: 5000,
    allowExitOnIdle: true,
  },
});
```

## Methods

### createIndex()

**indexName:** (`string`): Name of the index to create

**dimension:** (`number`): Vector dimension (must match your embedding model)

**metric?:** (`'cosine' | 'euclidean' | 'dotproduct'`): Distance metric for similarity search (Default: `cosine`)

**indexConfig?:** (`IndexConfig`): Index configuration (Default: `{ type: 'ivfflat' }`)

**buildIndex?:** (`boolean`): Whether to build the index (Default: `true`)

#### IndexConfig

**type:** (`'flat' | 'hnsw' | 'ivfflat'`): Index type (Default: `ivfflat`)

- `flat`: Sequential scan (no index) that performs exhaustive search.
- `ivfflat`: Clusters vectors into lists for approximate search.
- `hnsw`: Graph-based index offering fast search times and high recall.

**ivf?:** (`IVFConfig`): IVF configuration

- `lists?` (`number`): Number of lists. If not specified, automatically calculated based on dataset size. (Minimum 100, Maximum 4000)

**hnsw?:** (`HNSWConfig`): HNSW configuration

- `m?` (`number`): Maximum number of connections per node (default: 8)
- `efConstruction?` (`number`): Build-time complexity (default: 32)

#### Memory Requirements

HNSW indexes require significant shared memory during construction. For 100K vectors:

- Small dimensions (64d): \~60MB with default settings
- Medium dimensions (256d): \~180MB with default settings
- Large dimensions (384d+): \~250MB+ with default settings

Higher `m` or `efConstruction` values will increase memory requirements significantly. Adjust your system's shared memory limits if needed.

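To illustrate the `ivf.lists` option from the IndexConfig section above, here is a minimal sketch that picks an explicit list count. It combines the `sqrt(n) * 2` rule of thumb from this page's IVFFlat tuning guidance with the documented 100 to 4000 bounds. The helper names `suggestLists` and `ivfflatConfig` are hypothetical, the types are local stand-ins rather than imports from `@mastra/pg`, and this is not a description of how the library calculates `lists` when you omit it.

```typescript
// Local types mirroring the IndexConfig fields documented above.
// They are stand-ins for illustration, not imports from @mastra/pg.
type IndexType = "flat" | "hnsw" | "ivfflat";

interface IndexConfig {
  type: IndexType;
  ivf?: { lists?: number };
  hnsw?: { m?: number; efConstruction?: number };
}

// Hypothetical helper: apply the sqrt(n) * 2 rule of thumb,
// then clamp the result to the documented 100..4000 range.
function suggestLists(vectorCount: number): number {
  const ruleOfThumb = Math.round(Math.sqrt(vectorCount) * 2);
  return Math.min(4000, Math.max(100, ruleOfThumb));
}

// Hypothetical helper: build an IVFFlat config for a dataset of the given size.
function ivfflatConfig(vectorCount: number): IndexConfig {
  return { type: "ivfflat", ivf: { lists: suggestLists(vectorCount) } };
}

const config = ivfflatConfig(1000000);
console.log(config.ivf?.lists); // 2000
```

The resulting object could then be supplied as the `indexConfig` option of `createIndex()`. Small datasets are raised to the minimum of 100 lists and very large ones are capped at 4000.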
### upsert()

**indexName:** (`string`): Name of the index to upsert vectors into

**vectors:** (`number[][]`): Array of embedding vectors

**metadata?:** (`Record<string, any>[]`): Metadata for each vector

**ids?:** (`string[]`): Optional vector IDs (auto-generated if not provided)

### query()

**indexName:** (`string`): Name of the index to query

**vector:** (`number[]`): Query vector

**topK?:** (`number`): Number of results to return (Default: `10`)

**filter?:** (`Record<string, any>`): Metadata filters

**includeVector?:** (`boolean`): Whether to include the vector in the result (Default: `false`)

**minScore?:** (`number`): Minimum similarity score threshold (Default: `0`)

**options?:** (`{ ef?: number; probes?: number }`): Index-specific search options

- `ef?` (`number`): HNSW search parameter
- `probes?` (`number`): IVF search parameter

### listIndexes()

Returns an array of index names as strings.

### describeIndex()

**indexName:** (`string`): Name of the index to describe

Returns:

```typescript
interface PGIndexStats {
  dimension: number;
  count: number;
  metric: "cosine" | "euclidean" | "dotproduct";
  type: "flat" | "hnsw" | "ivfflat";
  config: {
    m?: number;
    efConstruction?: number;
    lists?: number;
    probes?: number;
  };
}
```

### deleteIndex()

**indexName:** (`string`): Name of the index to delete

### updateVector()

Update a single vector by ID or by metadata filter. Either `id` or `filter` must be provided, but not both.

**indexName:** (`string`): Name of the index containing the vector

**id?:** (`string`): ID of the vector to update (mutually exclusive with filter)

**filter?:** (`Record<string, any>`): Metadata filter to identify vector(s) to update (mutually exclusive with id)

**update:** (`{ vector?: number[]; metadata?: Record<string, any>; }`): Object containing the vector and/or metadata to update

At least one of `vector` or `metadata` must be provided in the `update` object.

```typescript
// Update by ID
await pgVector.updateVector({
  indexName: "my_vectors",
  id: "vector123",
  update: {
    vector: [0.1, 0.2, 0.3],
    metadata: { label: "updated" },
  },
});

// Update by filter
await pgVector.updateVector({
  indexName: "my_vectors",
  filter: { category: "product" },
  update: {
    metadata: { status: "reviewed" },
  },
});
```

### deleteVector()

**indexName:** (`string`): Name of the index containing the vector

**id:** (`string`): ID of the vector to delete

Deletes a single vector by ID from the specified index.

```typescript
await pgVector.deleteVector({ indexName: "my_vectors", id: "vector123" });
```

### deleteVectors()

Delete multiple vectors by IDs or by metadata filter. Either `ids` or `filter` must be provided, but not both.

**indexName:** (`string`): Name of the index containing the vectors to delete

**ids?:** (`string[]`): Array of vector IDs to delete (mutually exclusive with filter)

**filter?:** (`Record<string, any>`): Metadata filter to identify vectors to delete (mutually exclusive with ids)

### disconnect()

Closes the database connection pool. Should be called when done using the store.

### buildIndex()

**indexName:** (`string`): Name of the index to define

**metric?:** (`'cosine' | 'euclidean' | 'dotproduct'`): Distance metric for similarity search (Default: `cosine`)

**indexConfig:** (`IndexConfig`): Configuration for the index type and parameters

Builds or rebuilds an index with specified metric and configuration. Will drop any existing index before creating the new one.

```typescript
// Define HNSW index
await pgVector.buildIndex("my_vectors", "cosine", {
  type: "hnsw",
  hnsw: {
    m: 8,
    efConstruction: 32,
  },
});

// Define IVF index
await pgVector.buildIndex("my_vectors", "cosine", {
  type: "ivfflat",
  ivf: {
    lists: 100,
  },
});

// Define flat index
await pgVector.buildIndex("my_vectors", "cosine", {
  type: "flat",
});
```

## Response Types

Query results are returned in this format:

```typescript
interface QueryResult {
  id: string;
  score: number;
  metadata: Record<string, any>;
  vector?: number[]; // Only included if includeVector is true
}
```

## Error Handling

The store throws typed errors that can be caught:

```typescript
try {
  await store.query({
    indexName: "index_name",
    queryVector: queryVector,
  });
} catch (error) {
  if (error instanceof VectorStoreError) {
    console.log(error.code); // 'connection_failed' | 'invalid_dimension' | etc
    console.log(error.details); // Additional error context
  }
}
```

## Index Configuration Guide

### Performance Optimization

#### IVFFlat Tuning

- **lists parameter**: Set to `sqrt(n) * 2` where n is the number of vectors
- More lists = better accuracy but slower build time
- Fewer lists = faster build but potentially lower accuracy

#### HNSW Tuning

- **m parameter**:
  - 8-16: Moderate accuracy, lower memory
  - 16-32: High accuracy, moderate memory
  - 32-64: Very high accuracy, high memory
- **efConstruction**:
  - 32-64: Fast build, good quality
  - 64-128: Slower build, better quality
  - 128-256: Slowest build, best quality

### Index Recreation Behavior

The system automatically detects configuration changes and only rebuilds indexes when necessary:

- Same configuration: Index is kept (no recreation)
- Changed configuration: Index is dropped and rebuilt

This avoids the performance cost of unnecessary index recreation.

## Best Practices

- Regularly evaluate your index configuration to ensure optimal performance.
- Adjust parameters like `lists` and `m` based on dataset size and query requirements.
- **Monitor index performance** using `describeIndex()` to track usage.
- Rebuild indexes periodically to maintain efficiency, especially after significant data changes.

## Direct Pool Access

The `PgVector` class exposes its underlying PostgreSQL connection pool as a public field:

```typescript
pgVector.pool; // instance of pg.Pool
```

This enables advanced usage such as running direct SQL queries, managing transactions, or monitoring pool state. When using the pool directly:

- You are responsible for releasing clients (`client.release()`) after use.
- The pool remains accessible after calling `disconnect()`, but new queries will fail.
- Direct access bypasses any validation or transaction logic provided by PgVector methods.

This design supports advanced use cases but requires careful resource management by the user.

## Usage Example

### Local embeddings with fastembed

Embeddings are numeric vectors used by memory's `semanticRecall` to retrieve related messages by meaning (not keywords). This setup uses `@mastra/fastembed` to generate vector embeddings.

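To make "retrieving by meaning" concrete, the sketch below computes cosine similarity, the quantity behind the `cosine` metric, between small made-up vectors. The function name `cosineSimilarity` and the toy vectors are illustrative assumptions. Real embeddings have far more dimensions, and this is not a description of how PgVector computes scores internally.

```typescript
// Cosine similarity between two equal-length vectors:
// dot(a, b) / (|a| * |b|), ranging from -1 to 1.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Convention chosen for this sketch: a zero vector is similar to nothing.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Made-up 3-dimensional "embeddings" for illustration only.
const query = [0.9, 0.1, 0.0];
const similar = [0.8, 0.2, 0.1];
const unrelated = [0.0, 0.1, 0.9];

// The vector pointing in nearly the same direction scores higher.
console.log(cosineSimilarity(query, similar) > cosineSimilarity(query, unrelated)); // true
```

Vectors that point in similar directions score close to 1, which is why a `topK` search over embeddings can surface related messages even when they share no keywords.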
Install `fastembed` to get started:

**npm**:

```bash
npm install @mastra/fastembed@latest
```

**pnpm**:

```bash
pnpm add @mastra/fastembed@latest
```

**Yarn**:

```bash
yarn add @mastra/fastembed@latest
```

**Bun**:

```bash
bun add @mastra/fastembed@latest
```

Add the following to your agent:

```typescript
import { Memory } from "@mastra/memory";
import { Agent } from "@mastra/core/agent";
import { PostgresStore, PgVector } from "@mastra/pg";
import { fastembed } from "@mastra/fastembed";

export const pgAgent = new Agent({
  id: "pg-agent",
  name: "PG Agent",
  instructions:
    "You are an AI agent with the ability to automatically recall memories from previous interactions.",
  model: "openai/gpt-5.1",
  memory: new Memory({
    storage: new PostgresStore({
      id: "pg-agent-storage",
      connectionString: process.env.DATABASE_URL!,
    }),
    vector: new PgVector({
      id: "pg-agent-vector",
      connectionString: process.env.DATABASE_URL!,
    }),
    embedder: fastembed,
    options: {
      lastMessages: 10,
      semanticRecall: {
        topK: 3,
        messageRange: 2,
      },
    },
  }),
});
```

## Related

- [Metadata Filters](https://mastra.ai/reference/rag/metadata-filters)