Graph RAG
This example demonstrates how to implement a Retrieval-Augmented Generation (RAG) system using Mastra, OpenAI embeddings, and PGVector for vector storage.
Overview
The system implements Graph RAG using Mastra and OpenAI. Here’s what it does:
- Sets up a Mastra agent with gpt-4o-mini for response generation
- Creates a GraphRAG tool to manage vector store interactions and knowledge graph creation/traversal
- Chunks text documents into smaller segments
- Creates embeddings for these chunks
- Stores them in a PostgreSQL vector database
- Creates a knowledge graph of relevant chunks based on queries using GraphRAG tool
- Tool returns results from vector store and creates knowledge graph
- Traverses knowledge graph using query
- Generates context-aware responses using the Mastra agent
Setup
Environment Setup
Make sure to set up your environment variables:
OPENAI_API_KEY=your_openai_api_key_here
POSTGRES_CONNECTION_STRING=your_connection_string_here
Dependencies
Then, import the necessary dependencies:
import { openai } from "@ai-sdk/openai";
import { Mastra } from "@mastra/core";
import { Agent } from "@mastra/core/agent";
import { PgVector } from "@mastra/pg";
import { MDocument, createGraphRAGTool } from "@mastra/rag";
import { embedMany } from "ai";
GraphRAG Tool Creation
Using createGraphRAGTool imported from @mastra/rag, you can create a tool that queries the vector database and converts the results into a knowledge graph:
const graphRagTool = createGraphRAGTool({
vectorStoreName: "pgVector",
indexName: "embeddings",
model: openai.embedding("text-embedding-3-small"),
graphOptions: {
dimension: 1536,
threshold: 0.7,
},
});
Agent Configuration
Set up the Mastra agent that will handle the responses:
const ragAgent = new Agent({
name: "GraphRAG Agent",
instructions: `You are a helpful assistant that answers questions based on the provided context. Format your answers as follows:
1. DIRECT FACTS: List only the directly stated facts from the text relevant to the question (2-3 bullet points)
2. CONNECTIONS MADE: List the relationships you found between different parts of the text (2-3 bullet points)
3. CONCLUSION: One sentence summary that ties everything together
Keep each section brief and focus on the most important points.
Important: When asked to answer a question, please base your answer only on the context provided in the tool.
If the context doesn't contain enough information to fully answer the question, please state that explicitly.`,
model: openai("gpt-4o-mini"),
tools: {
graphRagTool,
},
});
Instantiate PgVector and Mastra
Instantiate PgVector and Mastra with the components:
const pgVector = new PgVector(process.env.POSTGRES_CONNECTION_STRING!);
export const mastra = new Mastra({
agents: { ragAgent },
vectors: { pgVector },
});
const agent = mastra.getAgent("ragAgent");
Document Processing
Create a document and process it into chunks:
const doc = MDocument.fromText(`
# Riverdale Heights: Community Development Study
// ... text content ...
`);
const chunks = await doc.chunk({
strategy: "recursive",
size: 512,
overlap: 50,
separator: "\n",
});
Creating and Storing Embeddings
Generate embeddings for the chunks and store them in the vector database:
const { embeddings } = await embedMany({
model: openai.embedding("text-embedding-3-small"),
values: chunks.map(chunk => chunk.text),
});
const vectorStore = mastra.getVector("pgVector");
await vectorStore.createIndex({
indexName: "embeddings",
dimension: 1536,
});
await vectorStore.upsert({
indexName: "embeddings",
vectors: embeddings,
metadata: chunks?.map((chunk: any) => ({ text: chunk.text })),
});
Graph-Based Querying
Try different queries to explore relationships in the data:
const queryOne = "What are the direct and indirect effects of early railway decisions on Riverdale Heights' current state?";
const answerOne = await ragAgent.generate(queryOne);
console.log('\nQuery:', queryOne);
console.log('Response:', answerOne.text);
const queryTwo = 'How have changes in transportation infrastructure affected different generations of local businesses and community spaces?';
const answerTwo = await ragAgent.generate(queryTwo);
console.log('\nQuery:', queryTwo);
console.log('Response:', answerTwo.text);
const queryThree = 'Compare how the Rossi family business and Thompson Steel Works responded to major infrastructure changes, and how their responses affected the community.';
const answerThree = await ragAgent.generate(queryThree);
console.log('\nQuery:', queryThree);
console.log('Response:', answerThree.text);
const queryFour = 'Trace how the transformation of the Thompson Steel Works site has influenced surrounding businesses and cultural spaces from 1932 to present.';
const answerFour = await ragAgent.generate(queryFour);
console.log('\nQuery:', queryFour);
console.log('Response:', answerFour.text);