Skip to Content

Graph RAG

This example demonstrates how to implement a Retrieval-Augmented Generation (RAG) system using Mastra, OpenAI embeddings, and PGVector for vector storage.

Overview

The system implements Graph RAG using Mastra and OpenAI. Here’s what it does:

  1. Sets up a Mastra agent with gpt-4o-mini for response generation
  2. Creates a GraphRAG tool to manage vector store interactions and knowledge graph creation/traversal
  3. Chunks text documents into smaller segments
  4. Creates embeddings for these chunks
  5. Stores them in a PostgreSQL vector database
  6. Creates a knowledge graph of relevant chunks based on queries using GraphRAG tool
    • Tool returns results from vector store and creates knowledge graph
    • Traverses knowledge graph using query
  7. Generates context-aware responses using the Mastra agent

Setup

Environment Setup

Make sure to set up your environment variables:

.env
OPENAI_API_KEY=your_openai_api_key_here POSTGRES_CONNECTION_STRING=your_connection_string_here

Dependencies

Then, import the necessary dependencies:

index.ts
import { openai } from "@ai-sdk/openai"; import { Mastra } from "@mastra/core"; import { Agent } from "@mastra/core/agent"; import { PgVector } from "@mastra/pg"; import { MDocument, createGraphRAGTool } from "@mastra/rag"; import { embedMany } from "ai";

GraphRAG Tool Creation

Using createGraphRAGTool imported from @mastra/rag, you can create a tool that queries the vector database and converts the results into a knowledge graph:

index.ts
const graphRagTool = createGraphRAGTool({ vectorStoreName: "pgVector", indexName: "embeddings", model: openai.embedding("text-embedding-3-small"), graphOptions: { dimension: 1536, threshold: 0.7, }, });

Agent Configuration

Set up the Mastra agent that will handle the responses:

index.ts
const ragAgent = new Agent({ name: "GraphRAG Agent", instructions: `You are a helpful assistant that answers questions based on the provided context. Format your answers as follows: 1. DIRECT FACTS: List only the directly stated facts from the text relevant to the question (2-3 bullet points) 2. CONNECTIONS MADE: List the relationships you found between different parts of the text (2-3 bullet points) 3. CONCLUSION: One sentence summary that ties everything together Keep each section brief and focus on the most important points. Important: When asked to answer a question, please base your answer only on the context provided in the tool. If the context doesn't contain enough information to fully answer the question, please state that explicitly.`, model: openai("gpt-4o-mini"), tools: { graphRagTool, }, });

Instantiate PgVector and Mastra

Instantiate PgVector and Mastra with the components:

index.ts
const pgVector = new PgVector(process.env.POSTGRES_CONNECTION_STRING!); export const mastra = new Mastra({ agents: { ragAgent }, vectors: { pgVector }, }); const agent = mastra.getAgent("ragAgent");

Document Processing

Create a document and process it into chunks:

index.ts
const doc = MDocument.fromText(` # Riverdale Heights: Community Development Study // ... text content ... `); const chunks = await doc.chunk({ strategy: "recursive", size: 512, overlap: 50, separator: "\n", });

Creating and Storing Embeddings

Generate embeddings for the chunks and store them in the vector database:

index.ts
const { embeddings } = await embedMany({ model: openai.embedding("text-embedding-3-small"), values: chunks.map(chunk => chunk.text), }); const vectorStore = mastra.getVector("pgVector"); await vectorStore.createIndex({ indexName: "embeddings", dimension: 1536, }); await vectorStore.upsert({ indexName: "embeddings", vectors: embeddings, metadata: chunks?.map((chunk: any) => ({ text: chunk.text })), });

Graph-Based Querying

Try different queries to explore relationships in the data:

index.ts
const queryOne = "What are the direct and indirect effects of early railway decisions on Riverdale Heights' current state?"; const answerOne = await ragAgent.generate(queryOne); console.log('\nQuery:', queryOne); console.log('Response:', answerOne.text); const queryTwo = 'How have changes in transportation infrastructure affected different generations of local businesses and community spaces?'; const answerTwo = await ragAgent.generate(queryTwo); console.log('\nQuery:', queryTwo); console.log('Response:', answerTwo.text); const queryThree = 'Compare how the Rossi family business and Thompson Steel Works responded to major infrastructure changes, and how their responses affected the community.'; const answerThree = await ragAgent.generate(queryThree); console.log('\nQuery:', queryThree); console.log('Response:', answerThree.text); const queryFour = 'Trace how the transformation of the Thompson Steel Works site has influenced surrounding businesses and cultural spaces from 1932 to present.'; const answerFour = await ragAgent.generate(queryFour); console.log('\nQuery:', queryFour); console.log('Response:', answerFour.text);





View Example on GitHub