DocsHow toKnowledge Sources

RAG (Retrieval Augmented Generation) in Mastra

Mastra’s RAG package provides three main components for building retrieval-augmented generation systems:

1. Document Processing

The MastraDocument class handles document ingestion and metadata extraction:

const doc = new MastraDocument({ text: "Your content here" });
 
await doc.extractMetadata({
    title: { llm: yourLLM },
    summary: { llm: yourLLM },
    questionsAnswered: { questions: 5 },
    keyword: { keywords: 10 },
    sentence: { chunkSize: 512 }
});

2. Vector Storage

Mastra provides a unified interface for vector databases through the MastraVector abstract class, with implementations for:

PostgreSQL Vector Storage

const vectorStore = new PgVector("postgresql://connection-string");
await vectorStore.createIndex("my_index", 1536); // dimension matches your embedding model
await vectorStore.upsert("my_index", embeddings, metadata);

Pinecone Storage

const vectorStore = new PineconeVector(apiKey, environment);
await vectorStore.createIndex("my_index", 1536);
await vectorStore.query("my_index", queryVector, 10);

3. Admin Interface

The RAG admin UI provides:

  • Vector index management
  • Source tracking
  • Metadata visualization
  • Entity type management

Each vector index tracks:

  • Vector provider (Postgres/Pinecone)
  • Source integration
  • Entity types and fields
  • Custom metadata

Mastra integrates with LlamaIndex for document processing and supports multiple vector database backends through a unified interface.