# PG Vector Store

The `PgVector` class provides vector search using [PostgreSQL](https://www.postgresql.org/) with the [pgvector](https://github.com/pgvector/pgvector) extension, so you can run vector similarity search inside your existing PostgreSQL database.

## Constructor Options

**connectionString?:** (`string`): PostgreSQL connection URL

**host?:** (`string`): PostgreSQL server host

**port?:** (`number`): PostgreSQL server port

**database?:** (`string`): PostgreSQL database name

**user?:** (`string`): PostgreSQL user

**password?:** (`string`): PostgreSQL password

**ssl?:** (`boolean | ConnectionOptions`): Enable SSL or provide custom SSL configuration

**schemaName?:** (`string`): Name of the schema the vector store should use. Uses the default schema if not provided.

**max?:** (`number`): Maximum number of pool connections (default: 20)

**idleTimeoutMillis?:** (`number`): Idle connection timeout in milliseconds (default: 30000)

**pgPoolOptions?:** (`PoolConfig`): Additional pg pool configuration options

## Constructor Examples

### Connection String

```ts
import { PgVector } from "@mastra/pg";

const vectorStore = new PgVector({
  id: 'pg-vector',
  connectionString: "postgresql://user:password@localhost:5432/mydb",
});
```

### Host/Port/Database Configuration

```ts
const vectorStore = new PgVector({
  id: 'pg-vector',
  host: "localhost",
  port: 5432,
  database: "mydb",
  user: "postgres",
  password: "password",
});
```

### Advanced Configuration

```ts
const vectorStore = new PgVector({
  id: 'pg-vector',
  connectionString: "postgresql://user:password@localhost:5432/mydb",
  schemaName: "custom_schema",
  max: 30,
  idleTimeoutMillis: 60000,
  pgPoolOptions: {
    connectionTimeoutMillis: 5000,
    allowExitOnIdle: true,
  },
});
```

## Methods

### createIndex()

**indexName:** (`string`): Name of the index to create

**dimension:** (`number`): Vector dimension (must match your embedding model)

**metric?:** (`'cosine' | 'euclidean' | 'dotproduct'`): Distance metric for similarity search (Default: `cosine`)

**indexConfig?:** (`IndexConfig`): Index configuration (Default: `{ type: 'ivfflat' }`)

**buildIndex?:** (`boolean`): Whether to build the index (Default: `true`)

#### IndexConfig

**type:** (`'flat' | 'hnsw' | 'ivfflat'`): Index type (Default: `ivfflat`)

- `flat`: Sequential scan (no index) that performs exhaustive search.
- `ivfflat`: Clusters vectors into lists for approximate search.
- `hnsw`: Graph-based index offering fast search times and high recall.

**ivf?:** (`IVFConfig`): IVF configuration

- `lists?` (`number`): Number of lists. If not specified, automatically calculated based on dataset size. (Minimum 100, Maximum 4000)

**hnsw?:** (`HNSWConfig`): HNSW configuration

- `m?` (`number`): Maximum number of connections per node (Default: `8`)
- `efConstruction?` (`number`): Build-time complexity (Default: `32`)
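
As a rough illustration of how these options fit together, the sketch below assembles two `createIndex()` argument objects. The local type declarations are assumptions that mirror the parameter descriptions above so the snippet stands alone; in a real project the types come from `@mastra/pg`. The index names and the dimension of 384 are placeholders.

```typescript
// Local stand-ins for the documented parameter shapes (assumption: they
// mirror the descriptions above; the real types ship with @mastra/pg).
type Metric = "cosine" | "euclidean" | "dotproduct";

interface IndexConfig {
  type: "flat" | "hnsw" | "ivfflat";
  ivf?: { lists?: number };
  hnsw?: { m?: number; efConstruction?: number };
}

interface CreateIndexParams {
  indexName: string;
  dimension: number;
  metric?: Metric;
  indexConfig?: IndexConfig;
  buildIndex?: boolean;
}

// HNSW index using the documented defaults for m and efConstruction.
const hnswParams: CreateIndexParams = {
  indexName: "docs_hnsw", // placeholder name
  dimension: 384, // must match your embedding model
  metric: "cosine",
  indexConfig: { type: "hnsw", hnsw: { m: 8, efConstruction: 32 } },
};

// IVFFlat index with an explicit list count; omit `ivf` to have it calculated.
const ivfParams: CreateIndexParams = {
  indexName: "docs_ivf", // placeholder name
  dimension: 384,
  indexConfig: { type: "ivfflat", ivf: { lists: 100 } },
};

// With a PgVector instance in scope:
//   await vectorStore.createIndex(hnswParams);
```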

#### Memory Requirements

HNSW indexes require significant shared memory during construction. For 100K vectors:

- Small dimensions (64d): \~60MB with default settings
- Medium dimensions (256d): \~180MB with default settings
- Large dimensions (384d+): \~250MB+ with default settings

Higher `m` or `efConstruction` values increase memory requirements significantly. Adjust your system's shared memory limits if needed.

### upsert()

**indexName:** (`string`): Name of the index to upsert vectors into

**vectors:** (`number[][]`): Array of embedding vectors

**metadata?:** (`Record<string, any>[]`): Metadata for each vector

**ids?:** (`string[]`): Optional vector IDs (auto-generated if not provided)
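
The sketch below shows one way to sanity-check an `upsert()` payload before sending it. `validateUpsert` is a hypothetical helper, not part of `@mastra/pg`, and it assumes that `metadata` and `ids` are expected to line up one-to-one with `vectors`.

```typescript
// Local stand-in for the documented upsert() parameters.
interface UpsertParams {
  indexName: string;
  vectors: number[][];
  metadata?: Record<string, any>[];
  ids?: string[];
}

// Hypothetical helper: returns a list of problems found in the payload.
// An empty list means every vector has the expected dimension and that
// metadata/ids, when given, have one entry per vector.
function validateUpsert(params: UpsertParams, dimension: number): string[] {
  const problems: string[] = [];
  params.vectors.forEach((vector, i) => {
    if (vector.length !== dimension) {
      problems.push(`vector ${i} has ${vector.length} dimensions, expected ${dimension}`);
    }
  });
  if (params.metadata && params.metadata.length !== params.vectors.length) {
    problems.push("metadata length does not match vectors length");
  }
  if (params.ids && params.ids.length !== params.vectors.length) {
    problems.push("ids length does not match vectors length");
  }
  return problems;
}

const upsertParams: UpsertParams = {
  indexName: "my_vectors",
  vectors: [
    [0.1, 0.2, 0.3],
    [0.4, 0.5, 0.6],
  ],
  metadata: [{ label: "first" }, { label: "second" }],
  ids: ["vec-1", "vec-2"],
};

const upsertProblems = validateUpsert(upsertParams, 3);

// With a PgVector instance in scope and no problems reported:
//   await pgVector.upsert(upsertParams);
```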

### query()

**indexName:** (`string`): Name of the index to query

**vector:** (`number[]`): Query vector

**topK?:** (`number`): Number of results to return (Default: `10`)

**filter?:** (`Record<string, any>`): Metadata filters

**includeVector?:** (`boolean`): Whether to include the vector in the result (Default: `false`)

**minScore?:** (`number`): Minimum similarity score threshold (Default: `0`)

**options?:** (`{ ef?: number; probes?: number }`): Search-time tuning parameters

- `ef?` (`number`): HNSW search parameter
- `probes?` (`number`): IVF search parameter
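
A minimal sketch of a filtered query follows. `QueryableStore` is a local structural interface (an assumption) that describes only the `query()` call documented here, which lets the function run against a stub in tests; a `PgVector` instance is expected to fit it. The array return type, the `topK`, `minScore`, and `ef` values, and the `findProducts` name are illustrative assumptions.

```typescript
// Local stand-ins for the documented query() parameters and result shape.
interface QueryParams {
  indexName: string;
  vector: number[];
  topK?: number;
  filter?: Record<string, any>;
  includeVector?: boolean;
  minScore?: number;
  options?: { ef?: number; probes?: number };
}

interface QueryResult {
  id: string;
  score: number;
  metadata: Record<string, any>;
  vector?: number[];
}

// Assumption: query() resolves to an array of results.
interface QueryableStore {
  query(params: QueryParams): Promise<QueryResult[]>;
}

// Hypothetical helper: returns the IDs of the closest "product" vectors.
async function findProducts(store: QueryableStore, embedding: number[]): Promise<string[]> {
  const results = await store.query({
    indexName: "my_vectors",
    vector: embedding,
    topK: 5,
    filter: { category: "product" },
    minScore: 0.5, // drop weak matches; illustrative threshold
    options: { ef: 100 }, // HNSW search parameter; illustrative value
  });
  return results.map((result) => result.id);
}

// With a PgVector instance in scope:
//   const ids = await findProducts(pgVector, queryEmbedding);
```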

### listIndexes()

Returns an array of index names as strings.

### describeIndex()

**indexName:** (`string`): Name of the index to describe

Returns:

```typescript
interface PGIndexStats {
  dimension: number;
  count: number;
  metric: "cosine" | "euclidean" | "dotproduct";
  type: "flat" | "hnsw" | "ivfflat";
  config: {
    m?: number;
    efConstruction?: number;
    lists?: number;
    probes?: number;
  };
}
```
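
For logging or dashboards, the returned stats can be condensed into a single line. The helper below is a hypothetical sketch that works on the `PGIndexStats` shape shown above; the sample values are made up.

```typescript
// Same shape as the PGIndexStats interface documented above.
interface PGIndexStats {
  dimension: number;
  count: number;
  metric: "cosine" | "euclidean" | "dotproduct";
  type: "flat" | "hnsw" | "ivfflat";
  config: {
    m?: number;
    efConstruction?: number;
    lists?: number;
    probes?: number;
  };
}

// Hypothetical helper: formats index stats as a one-line summary,
// listing only the config fields that are actually set.
function summarizeIndex(name: string, stats: PGIndexStats): string {
  const params = Object.entries(stats.config)
    .filter(([, value]) => value !== undefined)
    .map(([key, value]) => `${key}=${value}`)
    .join(", ");
  const base = `${name}: ${stats.count} vectors, ${stats.dimension}d, ${stats.metric}, ${stats.type}`;
  return params ? `${base} (${params})` : base;
}

const sampleStats: PGIndexStats = {
  dimension: 384,
  count: 1000,
  metric: "cosine",
  type: "hnsw",
  config: { m: 8, efConstruction: 32 },
};

console.log(summarizeIndex("my_vectors", sampleStats));
// prints "my_vectors: 1000 vectors, 384d, cosine, hnsw (m=8, efConstruction=32)"

// With a PgVector instance in scope:
//   const stats = await pgVector.describeIndex({ indexName: "my_vectors" });
```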

### deleteIndex()

**indexName:** (`string`): Name of the index to delete

### updateVector()

Update a single vector by ID or by metadata filter. Either `id` or `filter` must be provided, but not both.

**indexName:** (`string`): Name of the index containing the vector

**id?:** (`string`): ID of the vector to update (mutually exclusive with filter)

**filter?:** (`Record<string, any>`): Metadata filter to identify vector(s) to update (mutually exclusive with id)

**update:** (`{ vector?: number[]; metadata?: Record<string, any>; }`): Object containing the vector and/or metadata to update

At least one of `vector` or `metadata` must be provided in the `update` object.

```typescript
// Update by ID
await pgVector.updateVector({
  indexName: "my_vectors",
  id: "vector123",
  update: {
    vector: [0.1, 0.2, 0.3],
    metadata: { label: "updated" },
  },
});

// Update by filter
await pgVector.updateVector({
  indexName: "my_vectors",
  filter: { category: "product" },
  update: {
    metadata: { status: "reviewed" },
  },
});
```

### deleteVector()

**indexName:** (`string`): Name of the index containing the vector

**id:** (`string`): ID of the vector to delete

Deletes a single vector by ID from the specified index.

```typescript
await pgVector.deleteVector({ indexName: "my_vectors", id: "vector123" });
```

### deleteVectors()

Delete multiple vectors by IDs or by metadata filter. Either `ids` or `filter` must be provided, but not both.

**indexName:** (`string`): Name of the index containing the vectors to delete

**ids?:** (`string[]`): Array of vector IDs to delete (mutually exclusive with filter)

**filter?:** (`Record<string, any>`): Metadata filter to identify vectors to delete (mutually exclusive with ids)
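
The "either `ids` or `filter`, but not both" rule can be checked before the call is made. The guard below is a hypothetical sketch, not library code; how `deleteVectors()` itself reports a violation is not covered here.

```typescript
// Local stand-in for the documented deleteVectors() parameters.
interface DeleteVectorsParams {
  indexName: string;
  ids?: string[];
  filter?: Record<string, any>;
}

// Hypothetical guard: throws unless exactly one of `ids` or `filter` is set,
// otherwise returns the params unchanged.
function assertValidDelete(params: DeleteVectorsParams): DeleteVectorsParams {
  const hasIds = params.ids !== undefined;
  const hasFilter = params.filter !== undefined;
  if (hasIds === hasFilter) {
    throw new Error("Provide exactly one of `ids` or `filter`");
  }
  return params;
}

// Delete specific vectors by ID.
const deleteByIds = assertValidDelete({
  indexName: "my_vectors",
  ids: ["vector123", "vector456"],
});

// Delete every vector whose metadata matches the filter.
const deleteByFilter = assertValidDelete({
  indexName: "my_vectors",
  filter: { category: "product" },
});

// With a PgVector instance in scope:
//   await pgVector.deleteVectors(deleteByIds);
//   await pgVector.deleteVectors(deleteByFilter);
```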

### disconnect()

Closes the database connection pool. Should be called when done using the store.
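
One way to make sure `disconnect()` always runs, even when an operation throws, is a `try`/`finally` wrapper. `withStore` below is a hypothetical helper; the `Disconnectable` interface is a local assumption describing only the method used.

```typescript
// Minimal structural type: anything with a disconnect() method.
// Assumption: the return value of disconnect() can simply be awaited.
interface Disconnectable {
  disconnect(): unknown;
}

// Hypothetical helper: runs `work` with the store, then closes the
// connection pool whether `work` succeeded or threw.
async function withStore<S extends Disconnectable, T>(
  store: S,
  work: (store: S) => Promise<T>,
): Promise<T> {
  try {
    return await work(store);
  } finally {
    await store.disconnect();
  }
}

// Usage with PgVector (connectionString supplied by the caller):
//   const indexes = await withStore(
//     new PgVector({ id: "pg-vector", connectionString }),
//     (store) => store.listIndexes(),
//   );
```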

### buildIndex()

**indexName:** (`string`): Name of the index to build

**metric?:** (`'cosine' | 'euclidean' | 'dotproduct'`): Distance metric for similarity search (Default: `cosine`)

**indexConfig:** (`IndexConfig`): Configuration for the index type and parameters

Builds or rebuilds an index with specified metric and configuration. Will drop any existing index before creating the new one.

```typescript
// Build an HNSW index
await pgVector.buildIndex({
  indexName: "my_vectors",
  metric: "cosine",
  indexConfig: {
    type: "hnsw",
    hnsw: {
      m: 8,
      efConstruction: 32,
    },
  },
});

// Build an IVFFlat index
await pgVector.buildIndex({
  indexName: "my_vectors",
  metric: "cosine",
  indexConfig: {
    type: "ivfflat",
    ivf: {
      lists: 100,
    },
  },
});

// Use a flat (sequential scan) configuration
await pgVector.buildIndex({
  indexName: "my_vectors",
  metric: "cosine",
  indexConfig: {
    type: "flat",
  },
});
```

## Response Types

Query results are returned in this format:

```typescript
interface QueryResult {
  id: string;
  score: number;
  metadata: Record<string, any>;
  vector?: number[]; // Only included if includeVector is true
}
```

## Error Handling

The store throws typed errors that can be caught:

```typescript
try {
  await store.query({
    indexName: "index_name",
    vector: queryVector,
  });
} catch (error) {
  if (error instanceof VectorStoreError) {
    console.log(error.code); // 'connection_failed' | 'invalid_dimension' | etc
    console.log(error.details); // Additional error context
  }
}
```

## Index Configuration Guide

### Performance Optimization

#### IVFFlat Tuning

- **lists parameter**: Set to `sqrt(n) * 2` where n is the number of vectors
- More lists = better accuracy but slower build time
- Fewer lists = faster build but potentially lower accuracy
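
The rule of thumb above can be written as a small function. This is a sketch: `suggestLists` is a hypothetical helper, the rounding is a choice made here, and the clamp reuses the 100 to 4000 range documented for `lists`. Whether the library's automatic calculation follows the same formula is not stated, so treat the result as a starting point.

```typescript
// Hypothetical helper: sqrt(n) * 2, rounded, then clamped to the
// documented range for `lists` (minimum 100, maximum 4000).
function suggestLists(vectorCount: number): number {
  const raw = Math.round(Math.sqrt(vectorCount) * 2);
  return Math.min(4000, Math.max(100, raw));
}

console.log(suggestLists(1000)); // prints 100 (63 is raised to the minimum)
console.log(suggestLists(10000)); // prints 200
console.log(suggestLists(1000000)); // prints 2000
console.log(suggestLists(10000000)); // prints 4000 (6325 is capped at the maximum)
```

The result can be passed as `indexConfig.ivf.lists` when creating or building an IVFFlat index.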

#### HNSW Tuning

- **m parameter**:

  - 8-16: Moderate accuracy, lower memory
  - 16-32: High accuracy, moderate memory
  - 32-64: Very high accuracy, high memory

- **efConstruction**:

  - 32-64: Fast build, good quality
  - 64-128: Slower build, better quality
  - 128-256: Slowest build, best quality
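
The tiers above could be captured as named presets. Everything in this sketch is an assumption made for illustration: the profile names are hypothetical, each preset uses the lower bound of a tier, and pairing an `m` tier with the `efConstruction` tier at the same position is a simplification rather than a documented recommendation.

```typescript
type HnswProfile = "balanced" | "accurate" | "max-recall";

// Hypothetical presets built from the lower bound of each tier listed above.
const HNSW_PRESETS: Record<HnswProfile, { m: number; efConstruction: number }> = {
  balanced: { m: 8, efConstruction: 32 }, // same as the documented defaults
  accurate: { m: 16, efConstruction: 64 },
  "max-recall": { m: 32, efConstruction: 128 },
};

// Builds an indexConfig object for the chosen profile.
function hnswIndexConfig(profile: HnswProfile) {
  return { type: "hnsw" as const, hnsw: { ...HNSW_PRESETS[profile] } };
}

const accurateConfig = hnswIndexConfig("accurate");

// With a PgVector instance in scope:
//   await pgVector.createIndex({
//     indexName: "my_vectors",
//     dimension: 384,
//     indexConfig: accurateConfig,
//   });
```

Higher presets trade memory and build time for recall, so it is worth measuring on your own data before settling on one.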

### Index Recreation Behavior

The system automatically detects configuration changes and only rebuilds indexes when necessary:

- Same configuration: Index is kept (no recreation)
- Changed configuration: Index is dropped and rebuilt
- This avoids the performance cost of unnecessary index recreation
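
Conceptually, the decision comes down to comparing the existing index settings with the requested ones. The sketch below is an illustration of that idea only; `IndexSettings` and `needsRebuild` are hypothetical names, and the actual comparison performed inside `PgVector` is not shown in this reference and may differ (for example in how omitted values and defaults are treated).

```typescript
// Hypothetical flattened view of the settings that define an index.
interface IndexSettings {
  metric: "cosine" | "euclidean" | "dotproduct";
  type: "flat" | "hnsw" | "ivfflat";
  m?: number;
  efConstruction?: number;
  lists?: number;
}

// Returns true when any setting differs. Note that this simple version
// treats an omitted field as different from an explicit value.
function needsRebuild(existing: IndexSettings, requested: IndexSettings): boolean {
  const keys: (keyof IndexSettings)[] = ["metric", "type", "m", "efConstruction", "lists"];
  return keys.some((key) => existing[key] !== requested[key]);
}

const current: IndexSettings = { metric: "cosine", type: "hnsw", m: 8, efConstruction: 32 };

console.log(needsRebuild(current, { ...current })); // prints false
console.log(needsRebuild(current, { ...current, m: 16 })); // prints true
```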

## Best Practices

- Regularly evaluate your index configuration to ensure optimal performance.
- Adjust parameters like `lists` and `m` based on dataset size and query requirements.
- Monitor index state with `describeIndex()`, which reports the vector count, dimension, metric, and index configuration.
- Rebuild indexes periodically to maintain efficiency, especially after significant data changes.

## Direct Pool Access

The `PgVector` class exposes its underlying PostgreSQL connection pool as a public field:

```typescript
pgVector.pool; // instance of pg.Pool
```

This enables advanced usage such as running direct SQL queries, managing transactions, or monitoring pool state. When using the pool directly:

- You are responsible for releasing clients (`client.release()`) after use.
- The pool remains accessible after calling `disconnect()`, but new queries will fail.
- Direct access bypasses any validation or transaction logic provided by PgVector methods.

This design supports advanced use cases but requires careful resource management by the user.
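
As a sketch of the resource management involved, the helper below runs a callback inside a transaction and always releases the client. `PoolLike` and `ClientLike` are local structural stand-ins (an assumption) for the parts of `pg.Pool` and its clients used here, namely `connect()`, `query()`, and `release()`; `inTransaction` is a hypothetical helper.

```typescript
// Structural stand-ins for the pg.Pool members this sketch relies on.
interface ClientLike {
  query(text: string, values?: unknown[]): Promise<{ rows: any[] }>;
  release(): void;
}

interface PoolLike {
  connect(): Promise<ClientLike>;
}

// Hypothetical helper: BEGIN, run `work`, COMMIT on success or ROLLBACK
// on failure, and release the client back to the pool in every case.
async function inTransaction<T>(
  pool: PoolLike,
  work: (client: ClientLike) => Promise<T>,
): Promise<T> {
  const client = await pool.connect();
  try {
    await client.query("BEGIN");
    const result = await work(client);
    await client.query("COMMIT");
    return result;
  } catch (error) {
    await client.query("ROLLBACK");
    throw error;
  } finally {
    client.release();
  }
}

// With a PgVector instance in scope:
//   const row = await inTransaction(pgVector.pool, async (client) => {
//     const { rows } = await client.query("SELECT version()");
//     return rows[0];
//   });
```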

## Usage Example

### Local embeddings with fastembed

Embeddings are numeric vectors used by memory's `semanticRecall` to retrieve related messages by meaning (not keywords). This setup uses `@mastra/fastembed` to generate vector embeddings.

Install `@mastra/fastembed` to get started:

**npm**:

```bash
npm install @mastra/fastembed@latest
```

**pnpm**:

```bash
pnpm add @mastra/fastembed@latest
```

**Yarn**:

```bash
yarn add @mastra/fastembed@latest
```

**Bun**:

```bash
bun add @mastra/fastembed@latest
```

Add the following to your agent:

```typescript
import { Memory } from "@mastra/memory";
import { Agent } from "@mastra/core/agent";
import { PostgresStore, PgVector } from "@mastra/pg";
import { fastembed } from "@mastra/fastembed";

export const pgAgent = new Agent({
  id: "pg-agent",
  name: "PG Agent",
  instructions:
    "You are an AI agent with the ability to automatically recall memories from previous interactions.",
  model: "openai/gpt-5.1",
  memory: new Memory({
    storage: new PostgresStore({
      id: 'pg-agent-storage',
      connectionString: process.env.DATABASE_URL!,
    }),
    vector: new PgVector({
      id: 'pg-agent-vector',
      connectionString: process.env.DATABASE_URL!,
    }),
    embedder: fastembed,
    options: {
      lastMessages: 10,
      semanticRecall: {
        topK: 3,
        messageRange: 2,
      },
    },
  }),
});
```

## Related

- [Metadata Filters](https://mastra.ai/reference/rag/metadata-filters)