Upsert Embeddings
After generating embeddings, you need to store them in a database that supports vector similarity search. These examples show how to store embeddings in a range of vector databases for later retrieval.
- PgVector
- Pinecone
- Qdrant
- Chroma
- Astra DB
- LibSQL
- Upstash
- Cloudflare
- MongoDB
- OpenSearch
- Couchbase
- LanceDB
The PgVector class provides methods to create indexes and insert embeddings into PostgreSQL with the pgvector extension.
```typescript
import { openai } from "@ai-sdk/openai";
import { PgVector } from "@mastra/pg";
import { MDocument } from "@mastra/rag";
import { embedMany } from "ai";

const doc = MDocument.fromText("Your text content...");
const chunks = await doc.chunk();

const { embeddings } = await embedMany({
  values: chunks.map(chunk => chunk.text),
  model: openai.embedding("text-embedding-3-small"),
});

const pgVector = new PgVector({
  connectionString: process.env.POSTGRES_CONNECTION_STRING!,
});

await pgVector.createIndex({
  indexName: "test_index",
  dimension: 1536,
});

await pgVector.upsert({
  indexName: "test_index",
  vectors: embeddings,
  metadata: chunks.map(chunk => ({ text: chunk.text })),
});
```
The PineconeVector class provides methods to create indexes and insert embeddings into Pinecone, a managed vector database service.
```typescript
import { openai } from '@ai-sdk/openai';
import { PineconeVector } from '@mastra/pinecone';
import { MDocument } from '@mastra/rag';
import { embedMany } from 'ai';

const doc = MDocument.fromText('Your text content...');
const chunks = await doc.chunk();

const { embeddings } = await embedMany({
  values: chunks.map(chunk => chunk.text),
  model: openai.embedding('text-embedding-3-small'),
});

const pinecone = new PineconeVector({
  apiKey: process.env.PINECONE_API_KEY!,
});

await pinecone.createIndex({
  indexName: 'testindex',
  dimension: 1536,
});

await pinecone.upsert({
  indexName: 'testindex',
  vectors: embeddings,
  metadata: chunks.map(chunk => ({ text: chunk.text })),
});
```
The QdrantVector class provides methods to create collections and insert embeddings into Qdrant, a high-performance vector database.
```typescript
import { openai } from '@ai-sdk/openai';
import { QdrantVector } from '@mastra/qdrant';
import { MDocument } from '@mastra/rag';
import { embedMany } from 'ai';

const doc = MDocument.fromText('Your text content...');
const chunks = await doc.chunk();

const { embeddings } = await embedMany({
  values: chunks.map(chunk => chunk.text),
  model: openai.embedding('text-embedding-3-small'),
  maxRetries: 3,
});

const qdrant = new QdrantVector({
  url: process.env.QDRANT_URL,
  apiKey: process.env.QDRANT_API_KEY,
});

await qdrant.createIndex({
  indexName: 'test_collection',
  dimension: 1536,
});

await qdrant.upsert({
  indexName: 'test_collection',
  vectors: embeddings,
  metadata: chunks.map(chunk => ({ text: chunk.text })),
});
```
The ChromaVector class provides methods to create collections and insert embeddings into Chroma, an open-source embedding database.
```typescript
import { openai } from '@ai-sdk/openai';
import { ChromaVector } from '@mastra/chroma';
import { MDocument } from '@mastra/rag';
import { embedMany } from 'ai';

const doc = MDocument.fromText('Your text content...');
const chunks = await doc.chunk();

const { embeddings } = await embedMany({
  values: chunks.map(chunk => chunk.text),
  model: openai.embedding('text-embedding-3-small'),
});

// Running Chroma locally:
// const store = new ChromaVector();

// Running on Chroma Cloud:
const store = new ChromaVector({
  apiKey: process.env.CHROMA_API_KEY,
  tenant: process.env.CHROMA_TENANT,
  database: process.env.CHROMA_DATABASE,
});

await store.createIndex({
  indexName: 'test_collection',
  dimension: 1536,
});

await store.upsert({
  indexName: 'test_collection',
  vectors: embeddings,
  metadata: chunks.map(chunk => ({ text: chunk.text })),
  documents: chunks.map(chunk => chunk.text),
});
```
The AstraVector class provides methods to create collections and insert embeddings into DataStax Astra DB, a cloud-native vector database.
```typescript
import { openai } from '@ai-sdk/openai';
import { AstraVector } from '@mastra/astra';
import { MDocument } from '@mastra/rag';
import { embedMany } from 'ai';

const doc = MDocument.fromText('Your text content...');
const chunks = await doc.chunk();

const { embeddings } = await embedMany({
  model: openai.embedding('text-embedding-3-small'),
  values: chunks.map(chunk => chunk.text),
});

const astra = new AstraVector({
  token: process.env.ASTRA_DB_TOKEN,
  endpoint: process.env.ASTRA_DB_ENDPOINT,
  keyspace: process.env.ASTRA_DB_KEYSPACE,
});

await astra.createIndex({
  indexName: 'test_collection',
  dimension: 1536,
});

await astra.upsert({
  indexName: 'test_collection',
  vectors: embeddings,
  metadata: chunks.map(chunk => ({ text: chunk.text })),
});
```
The LibSQLVector class provides methods to create collections and insert embeddings into LibSQL, a fork of SQLite with vector extensions.
```typescript
import { openai } from "@ai-sdk/openai";
import { LibSQLVector } from "@mastra/core/vector/libsql";
import { MDocument } from "@mastra/rag";
import { embedMany } from "ai";

const doc = MDocument.fromText("Your text content...");
const chunks = await doc.chunk();

const { embeddings } = await embedMany({
  values: chunks.map((chunk) => chunk.text),
  model: openai.embedding("text-embedding-3-small"),
});

const libsql = new LibSQLVector({
  connectionUrl: process.env.DATABASE_URL,
  authToken: process.env.DATABASE_AUTH_TOKEN, // Optional: for Turso cloud databases
});

await libsql.createIndex({
  indexName: "test_collection",
  dimension: 1536,
});

await libsql.upsert({
  indexName: "test_collection",
  vectors: embeddings,
  metadata: chunks.map((chunk) => ({ text: chunk.text })),
});
```
The UpstashVector class provides methods to create collections and insert embeddings into Upstash Vector, a serverless vector database.
```typescript
import { openai } from '@ai-sdk/openai';
import { UpstashVector } from '@mastra/upstash';
import { MDocument } from '@mastra/rag';
import { embedMany } from 'ai';

const doc = MDocument.fromText('Your text content...');
const chunks = await doc.chunk();

const { embeddings } = await embedMany({
  values: chunks.map(chunk => chunk.text),
  model: openai.embedding('text-embedding-3-small'),
});

const upstash = new UpstashVector({
  url: process.env.UPSTASH_URL,
  token: process.env.UPSTASH_TOKEN,
});

// No createIndex call is needed here: Upstash creates an index (known as a
// namespace in Upstash) automatically on upsert if it does not exist yet.
await upstash.upsert({
  indexName: 'test_collection', // the namespace name in Upstash
  vectors: embeddings,
  metadata: chunks.map(chunk => ({ text: chunk.text })),
});
```
The CloudflareVector class provides methods to create collections and insert embeddings into Cloudflare Vectorize, a serverless vector database service.
```typescript
import { openai } from '@ai-sdk/openai';
import { CloudflareVector } from '@mastra/vectorize';
import { MDocument } from '@mastra/rag';
import { embedMany } from 'ai';

const doc = MDocument.fromText('Your text content...');
const chunks = await doc.chunk();

const { embeddings } = await embedMany({
  values: chunks.map(chunk => chunk.text),
  model: openai.embedding('text-embedding-3-small'),
});

const vectorize = new CloudflareVector({
  accountId: process.env.CF_ACCOUNT_ID,
  apiToken: process.env.CF_API_TOKEN,
});

await vectorize.createIndex({
  indexName: 'test_collection',
  dimension: 1536,
});

await vectorize.upsert({
  indexName: 'test_collection',
  vectors: embeddings,
  metadata: chunks.map(chunk => ({ text: chunk.text })),
});
```
The MongoDBVector class provides methods to create indexes and insert embeddings into MongoDB with Atlas Search.
```typescript
import { openai } from "@ai-sdk/openai";
import { MongoDBVector } from "@mastra/mongodb";
import { MDocument } from "@mastra/rag";
import { embedMany } from "ai";

const doc = MDocument.fromText("Your text content...");
const chunks = await doc.chunk();

const { embeddings } = await embedMany({
  values: chunks.map(chunk => chunk.text),
  model: openai.embedding("text-embedding-3-small"),
});

const vectorDB = new MongoDBVector({
  uri: process.env.MONGODB_URI!,
  dbName: process.env.MONGODB_DB_NAME!,
});

await vectorDB.createIndex({
  indexName: "test_index",
  dimension: 1536,
});

await vectorDB.upsert({
  indexName: "test_index",
  vectors: embeddings,
  metadata: chunks.map(chunk => ({ text: chunk.text })),
});
```
The OpenSearchVector class provides methods to create indexes and insert embeddings into OpenSearch, a distributed search engine with vector search capabilities.
```typescript
import { openai } from '@ai-sdk/openai';
import { OpenSearchVector } from '@mastra/opensearch';
import { MDocument } from '@mastra/rag';
import { embedMany } from 'ai';

const doc = MDocument.fromText('Your text content...');
const chunks = await doc.chunk();

const { embeddings } = await embedMany({
  values: chunks.map(chunk => chunk.text),
  model: openai.embedding('text-embedding-3-small'),
});

const vectorDB = new OpenSearchVector({
  uri: process.env.OPENSEARCH_URI!,
});

await vectorDB.createIndex({
  indexName: 'test_index',
  dimension: 1536,
});

await vectorDB.upsert({
  indexName: 'test_index',
  vectors: embeddings,
  metadata: chunks.map(chunk => ({ text: chunk.text })),
});
```
The CouchbaseVector class provides methods to create indexes and insert embeddings into Couchbase, a distributed NoSQL database with vector search capabilities.
```typescript
import { openai } from '@ai-sdk/openai';
import { CouchbaseVector } from '@mastra/couchbase';
import { MDocument } from '@mastra/rag';
import { embedMany } from 'ai';

const doc = MDocument.fromText('Your text content...');
const chunks = await doc.chunk();

const { embeddings } = await embedMany({
  values: chunks.map(chunk => chunk.text),
  model: openai.embedding('text-embedding-3-small'),
});

const couchbase = new CouchbaseVector({
  connectionString: process.env.COUCHBASE_CONNECTION_STRING,
  username: process.env.COUCHBASE_USERNAME,
  password: process.env.COUCHBASE_PASSWORD,
  bucketName: process.env.COUCHBASE_BUCKET,
  scopeName: process.env.COUCHBASE_SCOPE,
  collectionName: process.env.COUCHBASE_COLLECTION,
});

await couchbase.createIndex({
  indexName: 'test_collection',
  dimension: 1536,
});

await couchbase.upsert({
  indexName: 'test_collection',
  vectors: embeddings,
  metadata: chunks.map(chunk => ({ text: chunk.text })),
});
```
The LanceVectorStore class provides methods to create tables and indexes, and to insert embeddings into LanceDB, an embedded vector database built on the Lance columnar format.
```typescript
import { openai } from '@ai-sdk/openai';
import { LanceVectorStore } from '@mastra/lance';
import { MDocument } from '@mastra/rag';
import { embedMany } from 'ai';

const doc = MDocument.fromText('Your text content...');
const chunks = await doc.chunk();

const { embeddings } = await embedMany({
  values: chunks.map(chunk => chunk.text),
  model: openai.embedding('text-embedding-3-small'),
});

const lance = await LanceVectorStore.create('/path/to/db');

// In LanceDB you need to create a table first
await lance.createIndex({
  tableName: 'myVectors',
  indexName: 'vector',
  dimension: 1536,
});

await lance.upsert({
  tableName: 'myVectors',
  vectors: embeddings,
  metadata: chunks.map(chunk => ({ text: chunk.text })),
});
```