Graph RAG
This example demonstrates how to implement a Graph RAG system (Retrieval-Augmented Generation backed by a knowledge graph) using Mastra, OpenAI embeddings, and PGVector for vector storage.
Overview
The system implements Graph RAG using Mastra and OpenAI. Here’s what it does:
- Sets up a Mastra agent with GPT-4o-mini for response generation
- Creates a GraphRAG tool to manage vector store interactions and knowledge graph creation/traversal
- Chunks text documents into smaller segments
- Creates embeddings for these chunks
- Stores them in a PostgreSQL vector database
- Uses the GraphRAG tool to build a knowledge graph of the chunks relevant to each query
- The tool retrieves results from the vector store and builds the knowledge graph from them
- Traverses the knowledge graph using the query
- Generates context-aware responses using the Mastra agent
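The graph steps in the list above (link related chunks, then walk the graph outward from the chunks that match the query) can be pictured with a small breadth-first traversal. This is only an illustrative sketch: the `Graph` type, the `traverse` function, and the `maxHops` parameter are hypothetical names introduced here, and the traversal strategy Mastra actually uses may differ.

```typescript
// Hypothetical adjacency-list graph: chunk id -> ids of related chunks.
type Graph = Map<string, string[]>;

// Breadth-first walk from a starting chunk, visiting neighbours up to
// `maxHops` edges away. Returns chunk ids in the order they were reached.
function traverse(graph: Graph, start: string, maxHops: number): string[] {
  const visited = new Set<string>([start]);
  let frontier: string[] = [start];

  for (let hop = 0; hop < maxHops && frontier.length > 0; hop++) {
    const next: string[] = [];
    for (const node of frontier) {
      for (const neighbour of graph.get(node) ?? []) {
        if (!visited.has(neighbour)) {
          visited.add(neighbour);
          next.push(neighbour);
        }
      }
    }
    frontier = next;
  }

  return Array.from(visited);
}
```

Starting from the chunk closest to the query, one hop adds its direct neighbours and further hops pull in chunks that are only indirectly related, which is the kind of context plain vector search would not return.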
Setup
Environment Setup
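Make sure to set up your environment variables (OPENAI_API_KEY is the variable name OpenAI clients conventionally read):
OPENAI_API_KEY=your_openai_api_key_here
POSTGRES_CONNECTION_STRING=your_connection_string_here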
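Later in this example the connection string is read with a non-null assertion (`process.env.POSTGRES_CONNECTION_STRING!`), which hides a missing variable until the database call fails. A minimal sketch of a fail-fast alternative follows; `requireEnv` is a hypothetical helper written for this page, not part of Mastra.

```typescript
// Hypothetical helper: returns the variable's value or throws a clear error.
// The environment is passed in explicitly so the function is easy to test.
function requireEnv(
  env: Record<string, string | undefined>,
  name: string,
): string {
  const value = env[name];
  if (value === undefined || value.trim() === '') {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}
```

In a Node.js script it could be called as `requireEnv(process.env, 'POSTGRES_CONNECTION_STRING')` before constructing the vector store.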
Dependencies
Then, import the necessary dependencies:
import { Mastra, Agent, EmbedManyResult } from '@mastra/core';
import { embed, MDocument, PgVector, createGraphRAGTool } from '@mastra/rag';
GraphRAG Tool Creation
Using createGraphRAGTool imported from @mastra/rag, you can create a tool that queries the vector database and converts the results into a knowledge graph.
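const graphRagTool = createGraphRAGTool({
  // Must match the key used when registering the store with Mastra
  vectorStoreName: 'pgVector',
  // Must match the index name passed to createIndex and upsert
  indexName: 'embeddings',
  options: {
    provider: 'OPEN_AI',
    model: 'text-embedding-ada-002',
    maxRetries: 3,
  },
  graphOptions: {
    // text-embedding-ada-002 produces 1536-dimensional vectors
    dimension: 1536,
    threshold: 0.7,
  },
  topK: 5,
});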
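To give a feel for what the `threshold` option plausibly controls, here is a rough sketch that links two chunks whenever the cosine similarity of their embeddings exceeds a cutoff. This assumes `threshold` is a cosine-similarity cutoff; `cosineSimilarity`, `buildSimilarityGraph`, and the `Edge` type are hypothetical names for illustration and are not Mastra's internal code.

```typescript
// Hypothetical edge between two chunks, identified by their array index.
type Edge = { from: number; to: number; weight: number };

// Cosine similarity of two equal-length vectors; 0 if either is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same dimension');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Compare every pair of vectors once and keep the pairs above the cutoff.
function buildSimilarityGraph(vectors: number[][], threshold: number): Edge[] {
  const edges: Edge[] = [];
  for (let i = 0; i < vectors.length; i++) {
    for (let j = i + 1; j < vectors.length; j++) {
      const weight = cosineSimilarity(vectors[i], vectors[j]);
      if (weight > threshold) {
        edges.push({ from: i, to: j, weight });
      }
    }
  }
  return edges;
}
```

Under this reading, a higher threshold yields a sparser graph with only strongly related chunks connected, while a lower one connects more loosely related chunks.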
Agent Configuration
Set up the Mastra agent that will handle the responses:
export const ragAgent = new Agent({
  name: 'GraphRAG Agent',
  instructions: `You are a helpful assistant that answers questions based on the provided context. Format your answers as follows:
1. DIRECT FACTS: List only the directly stated facts from the text relevant to the question (2-3 bullet points)
2. CONNECTIONS MADE: List the relationships you found between different parts of the text (2-3 bullet points)
3. CONCLUSION: One sentence summary that ties everything together
Keep each section brief and focus on the most important points.`,
  model: {
    provider: 'OPEN_AI',
    name: 'gpt-4o-mini',
  },
  tools: {
    graphRagTool,
  },
});
Instantiate PgVector and Mastra
Instantiate PgVector, then register the vector store and the agent with Mastra:
const pgVector = new PgVector(process.env.POSTGRES_CONNECTION_STRING!);
export const mastra = new Mastra({
  agents: { ragAgent },
  vectors: { pgVector },
});
const agent = mastra.getAgent('ragAgent');
Document Processing
Create a document and process it into chunks:
const doc = MDocument.fromText(`Riverdale Heights: Community Development Study...`);
const chunks = await doc.chunk({
  strategy: 'recursive',
  size: 512,
  overlap: 50,
  separator: '\n',
});
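The `size` and `overlap` options are easiest to understand with a simplified fixed-window chunker. The sketch below is not the `recursive` strategy used above (which also takes the separator into account); it only illustrates how consecutive chunks share `overlap` characters. `chunkWithOverlap` is a hypothetical function, and measuring `size` in characters is an assumption of this sketch.

```typescript
// Hypothetical fixed-window chunker: each chunk is at most `size` characters
// and starts `size - overlap` characters after the previous one.
function chunkWithOverlap(text: string, size: number, overlap: number): string[] {
  if (size <= 0) {
    throw new Error('size must be positive');
  }
  if (overlap < 0 || overlap >= size) {
    throw new Error('overlap must be at least 0 and smaller than size');
  }

  const chunks: string[] = [];
  const step = size - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    // Stop once the current window has reached the end of the text.
    if (start + size >= text.length) break;
  }
  return chunks;
}

// chunkWithOverlap('abcdefghij', 4, 1) returns ['abcd', 'defg', 'ghij']
```

The overlap means a sentence that straddles a chunk boundary still appears intact in at least one chunk, at the cost of embedding some text twice.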
Creating and Storing Embeddings
Generate embeddings for the chunks and store them in the vector database:
const { embeddings } = await embed(chunks, {
  provider: "OPEN_AI",
  model: "text-embedding-ada-002",
  maxRetries: 3,
}) as EmbedManyResult<string>;

const vectorStore = mastra.getVector('pgVector');
// 1536 is the output dimension of text-embedding-ada-002
await vectorStore.createIndex("embeddings", 1536);
// Store each embedding alongside the text of the chunk it was computed from
await vectorStore.upsert(
  "embeddings",
  embeddings,
  chunks?.map((chunk: any) => ({ text: chunk.text }))
);
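Because the upsert call passes the embeddings and the metadata as two parallel arrays, and the index is created with a fixed dimension, a quick sanity check before writing can catch mismatches early. The following is a minimal sketch; `assertEmbeddingsMatch` is a hypothetical helper and not part of Mastra.

```typescript
// Hypothetical pre-upsert check: one metadata entry per embedding, and every
// embedding has the dimension the index was created with.
function assertEmbeddingsMatch(
  embeddings: number[][],
  metadata: { text: string }[],
  dimension: number,
): void {
  if (embeddings.length !== metadata.length) {
    throw new Error(
      `Got ${embeddings.length} embeddings but ${metadata.length} metadata entries`,
    );
  }
  embeddings.forEach((vector, i) => {
    if (vector.length !== dimension) {
      throw new Error(
        `Embedding ${i} has dimension ${vector.length}, expected ${dimension}`,
      );
    }
  });
}
```

In this example it would be called with the `embeddings` array, the mapped chunk metadata, and `1536` just before the upsert.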
Response Generation
Define a function that asks the agent to answer a query using the context retrieved by the tool:
async function generateResponse(query: string) {
  const prompt = `
Please answer the following question using both semantic and graph-based context:
${query}
Please base your answer only on the context provided in the tool. If the context doesn't contain enough information to fully answer the question, please state that explicitly.
`;
  const completion = await agent.generate(prompt);
  return completion.text;
}
Example Usage
async function answerQueries(queries: string[]) {
  for (const query of queries) {
    try {
      const answer = await generateResponse(query);
      console.log('\nQuery:', query);
      console.log('Response:', answer);
    } catch (error) {
      console.error(`Error processing query "${query}":`, error);
    }
  }
}

const queries = [
  "What are the direct and indirect effects of early railway decisions on Riverdale Heights' current state?",
  'How have changes in transportation infrastructure affected different generations of local businesses and community spaces?',
  'Compare how the Rossi family business and Thompson Steel Works responded to major infrastructure changes, and how their responses affected the community.',
  'Trace how the transformation of the Thompson Steel Works site has influenced surrounding businesses and cultural spaces from 1932 to present.',
];

await answerQueries(queries);