Chain of Thought Prompting

This example demonstrates how to implement a Retrieval-Augmented Generation (RAG) system using Mastra, OpenAI embeddings, and PgVector for vector storage, with an emphasis on chain-of-thought reasoning.

Overview

The system implements RAG using Mastra and OpenAI with chain-of-thought prompting. Here’s what it does:

  1. Sets up a Mastra agent with GPT-4o-mini for response generation
  2. Creates a vector query tool to manage vector store interactions
  3. Chunks text documents into smaller segments
  4. Creates embeddings for these chunks
  5. Stores them in a PostgreSQL vector database
  6. Retrieves relevant chunks based on queries using the vector query tool
  7. Generates context-aware responses using chain-of-thought reasoning

Setup

Environment Setup

Make sure to set up your environment variables. Since this example uses OpenAI for both embeddings and generation, you'll need an OpenAI API key alongside the Postgres connection string:

.env
POSTGRES_CONNECTION_STRING=your_connection_string_here
OPENAI_API_KEY=your_openai_key_here

Dependencies

Then, import the necessary dependencies:

src/mastra/index.ts
import { Mastra, Agent, EmbedManyResult } from '@mastra/core';
import { createVectorQueryTool, embed, MDocument, PgVector } from '@mastra/rag';

Vector Query Tool Creation

Using createVectorQueryTool, imported from @mastra/rag, create a tool that can query the vector database:

src/mastra/index.ts
const vectorQueryTool = createVectorQueryTool({
  vectorStoreName: 'pgVector',
  indexName: 'embeddings',
  options: {
    provider: 'OPEN_AI',
    model: 'text-embedding-ada-002',
    maxRetries: 3,
  },
  topK: 3,
});
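
The provider and model configured here determine how incoming queries are embedded, so they should match the model used to embed the documents below; otherwise query vectors and stored vectors won't be comparable. topK: 3 limits retrieval to the three most similar chunks per query.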

Agent Configuration

Set up the Mastra agent with chain-of-thought prompting instructions:

src/mastra/index.ts
export const ragAgent = new Agent({
  name: 'RAG Agent',
  instructions: `You are a helpful assistant that answers questions based on the provided context.
Follow these steps for each response:
 
1. First, carefully analyze the retrieved context chunks and identify key information.
2. Break down your thinking process about how the retrieved information relates to the query.
3. Explain how you're connecting different pieces from the retrieved chunks.
4. Draw conclusions based only on the evidence in the retrieved context.
5. If the retrieved chunks don't contain enough information, explicitly state what's missing.
 
Format your response as:
THOUGHT PROCESS:
- Step 1: [Initial analysis of retrieved chunks]
- Step 2: [Connections between chunks]
- Step 3: [Reasoning based on chunks]
 
FINAL ANSWER:
[Your concise answer based on the retrieved context]`,
  model: {
    provider: 'OPEN_AI',
    name: 'gpt-4o-mini',
  },
  tools: { vectorQueryTool },
});
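
The fixed numbered steps and the THOUGHT PROCESS / FINAL ANSWER template are what make the reasoning chain explicit: the agent must show how it analyzed and connected the retrieved chunks before committing to an answer.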

Instantiate PgVector and Mastra

Instantiate PgVector and Mastra with all components:

src/mastra/index.ts
const pgVector = new PgVector(process.env.POSTGRES_CONNECTION_STRING!);
 
export const mastra = new Mastra({
  agents: { ragAgent },
  vectors: { pgVector },
});
const agent = mastra.getAgent('ragAgent');

Document Processing

Create a document and process it into chunks:

src/mastra/index.ts
const doc = MDocument.fromText(`The Impact of Climate Change on Global Agriculture...`)
 
const chunks = await doc.chunk({
  strategy: 'recursive',
  size: 512,
  overlap: 50,
  separator: '\n',
})
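
As a quick sanity check, you can inspect the chunk output before embedding; each chunk exposes a text property, which is what the upsert step below stores as metadata (the logging here is illustrative, not part of the example):

// Illustrative only: confirm the document was split as expected.
console.log(`Generated ${chunks.length} chunks`);
console.log(chunks[0]?.text);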

Creating and Storing Embeddings

Generate embeddings for the chunks and store them in the vector database:

src/mastra/index.ts
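// embed() can return either a single- or many-embedding result;
// the cast narrows it to the batch (EmbedManyResult) shape used here.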
const { embeddings } = (await embed(chunks, {
  provider: "OPEN_AI",
  model: "text-embedding-ada-002",
  maxRetries: 3,
})) as EmbedManyResult<string>
 
const vectorStore = mastra.getVector('pgVector');
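// text-embedding-ada-002 produces 1536-dimensional vectors,
// so the index must be created with a matching dimension.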
await vectorStore.createIndex("embeddings", 1536);
await vectorStore.upsert(
  "embeddings",
  embeddings,
  chunks.map((chunk: any) => ({ text: chunk.text }))
);

Response Generation with Chain-of-Thought

Define a function that generates responses. Because the vector query tool is registered on the agent, the agent can invoke it during generation to retrieve relevant chunks before composing its answer:

src/mastra/index.ts
async function generateResponse(query: string) {
  const prompt = `
    Please answer the following question using chain-of-thought reasoning:
    ${query}
 
    Please base your answer only on the context provided in the tool. If the context doesn't 
    contain enough information to fully answer the question, please state that explicitly.
    Remember: Explain how you're using the retrieved information to reach your conclusions.
    `
 
  const completion = await agent.generate(prompt)
  return completion.text
}

Example Usage

src/mastra/index.ts
async function answerQueries(queries: string[]) {
  for (const query of queries) {
    try {
      const answer = await generateResponse(query)
      console.log('\nQuery:', query)
      console.log('\nReasoning Chain + Retrieved Context Response:')
      console.log(answer)
      console.log('\n-------------------')
    } catch (error) {
      console.error(`Error processing query "${query}":`, error)
    }
  }
}
 
const queries = [
  "What are the main adaptation strategies for farmers?",
  "Analyze how temperature affects crop yields.",
  "What connections can you draw between climate change and food security?",
  "How are farmers implementing solutions to address climate challenges?",
  "What future implications are discussed for agriculture?",
];
 
await answerQueries(queries)
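
Each logged answer follows the response format defined in the agent's instructions. The output shape (placeholders only; actual content depends on the model and the retrieved chunks) looks like:

Query: What are the main adaptation strategies for farmers?

Reasoning Chain + Retrieved Context Response:
THOUGHT PROCESS:
- Step 1: [Initial analysis of retrieved chunks]
- Step 2: [Connections between chunks]
- Step 3: [Reasoning based on chunks]

FINAL ANSWER:
[Concise answer based on the retrieved context]

-------------------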




