
Chain of Thought Workflow RAG

This example demonstrates how to implement a Retrieval-Augmented Generation (RAG) system using Mastra, OpenAI embeddings, and PGVector for vector storage, with chain-of-thought reasoning implemented as a step-by-step workflow.

Overview

The system implements RAG using Mastra and OpenAI with chain-of-thought prompting through a defined workflow. Here’s what it does:

  1. Sets up a Mastra agent with GPT-4o-mini for response generation
  2. Creates a vector query tool to manage vector store interactions
  3. Defines a workflow with multiple steps for chain-of-thought reasoning
  4. Processes and chunks text documents
  5. Creates and stores embeddings in PostgreSQL
  6. Generates responses through the workflow steps

Setup

Environment Setup

Make sure to set up your environment variables (the OpenAI provider also expects an OpenAI API key):

.env
OPENAI_API_KEY=your_openai_api_key_here
POSTGRES_CONNECTION_STRING=your_connection_string_here

Dependencies

Import the necessary dependencies:

src/mastra/index.ts
import { Mastra, Agent, EmbedManyResult, Step, Workflow } from "@mastra/core";
import { createVectorQueryTool, embed, MDocument, PgVector } from "@mastra/rag";
import { z } from "zod";

Workflow Definition

First, define the workflow with its trigger schema:

src/mastra/index.ts
export const ragWorkflow = new Workflow({
  name: "rag-workflow",
  triggerSchema: z.object({
    query: z.string(),
  }),
});

Vector Query Tool Creation

Create a tool for querying the vector database:

src/mastra/index.ts
const vectorQueryTool = createVectorQueryTool({
  vectorStoreName: "pgVector",
  indexName: "embeddings",
  options: {
    provider: "OPEN_AI",
    model: "text-embedding-ada-002",
    maxRetries: 3,
  },
  topK: 3,
});
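Conceptually, a topK query embeds the incoming question and ranks stored chunk embeddings by similarity, returning the k closest matches. The sketch below is illustrative only and independent of Mastra — `cosineSimilarity` and `topK` are hypothetical helpers, not Mastra or PGVector APIs — but it captures what the tool does on your behalf:

```typescript
// Illustrative sketch: rank stored vectors by cosine similarity to a
// query vector and keep the top k. In the real example, this happens
// inside PGVector; nothing here touches the database.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function topK(
  query: number[],
  vectors: { id: string; values: number[] }[],
  k: number,
): { id: string; score: number }[] {
  return vectors
    .map((v) => ({ id: v.id, score: cosineSimilarity(query, v.values) }))
    .sort((a, b) => b.score - a.score) // highest similarity first
    .slice(0, k);
}
```

With `topK: 3`, the agent's tool call returns the three most relevant chunks as context for the workflow steps below.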

Agent Configuration

Set up the Mastra agent:

src/mastra/index.ts
export const ragAgent = new Agent({
  name: "RAG Agent",
  instructions: `You are a helpful assistant that answers questions based on the provided context.`,
  model: {
    provider: "OPEN_AI",
    name: "gpt-4o-mini",
  },
  tools: {
    vectorQueryTool,
  },
});

Workflow Steps

The workflow is divided into multiple steps for chain-of-thought reasoning:

1. Context Analysis Step

src/mastra/index.ts
const analyzeContext = new Step({
  id: "analyzeContext",
  outputSchema: z.object({
    initialAnalysis: z.string(),
  }),
  execute: async ({ context, mastra }) => {
    console.log("---------------------------");
    const ragAgent = mastra?.agents?.ragAgent;
    const query = context?.machineContext?.getStepPayload<{ query: string }>(
      "trigger",
    )?.query;
 
    const analysisPrompt = `${query} 1. First, carefully analyze the retrieved context chunks and identify key information.`;
 
    const analysis = await ragAgent?.generate(analysisPrompt);
    console.log(analysis?.text);
    return {
      initialAnalysis: analysis?.text ?? "",
    };
  },
});

2. Thought Breakdown Step

src/mastra/index.ts
const breakdownThoughts = new Step({
  id: "breakdownThoughts",
  outputSchema: z.object({
    breakdown: z.string(),
  }),
  execute: async ({ context, mastra }) => {
    console.log("---------------------------");
    const ragAgent = mastra?.agents?.ragAgent;
    const analysis = context?.machineContext?.getStepPayload<{
      initialAnalysis: string;
    }>("analyzeContext")?.initialAnalysis;
 
    const connectionPrompt = `
      Based on the initial analysis: ${analysis}
 
      2. Break down your thinking process about how the retrieved information relates to the query.
    `;
 
    const connectionAnalysis = await ragAgent?.generate(connectionPrompt);
    console.log(connectionAnalysis?.text);
    return {
      breakdown: connectionAnalysis?.text ?? "",
    };
  },
});

3. Connection Step

src/mastra/index.ts
const connectPieces = new Step({
  id: "connectPieces",
  outputSchema: z.object({
    connections: z.string(),
  }),
  execute: async ({ context, mastra }) => {
    console.log("---------------------------");
    const ragAgent = mastra?.agents?.ragAgent;
    const process = context?.machineContext?.getStepPayload<{
      breakdown: string;
    }>("breakdownThoughts")?.breakdown;
    const connectionPrompt = `
        Based on the breakdown: ${process}
 
        3. Explain how you're connecting different pieces from the retrieved chunks.
    `;
 
    const connections = await ragAgent?.generate(connectionPrompt);
    console.log(connections?.text);
    return {
      connections: connections?.text ?? "",
    };
  },
});

4. Conclusion Step

src/mastra/index.ts
const drawConclusions = new Step({
  id: "drawConclusions",
  outputSchema: z.object({
    conclusions: z.string(),
  }),
  execute: async ({ context, mastra }) => {
    console.log("---------------------------");
    const ragAgent = mastra?.agents?.ragAgent;
    const evidence = context?.machineContext?.getStepPayload<{
      connections: string;
    }>("connectPieces")?.connections;
    const conclusionPrompt = `
        Based on the connections: ${evidence}
 
        4. Draw conclusions based only on the evidence in the retrieved context.
    `;
 
    const conclusions = await ragAgent?.generate(conclusionPrompt);
    console.log(conclusions?.text);
    return {
      conclusions: conclusions?.text ?? "",
    };
  },
});

5. Final Answer Step

src/mastra/index.ts
const finalAnswer = new Step({
  id: "finalAnswer",
  outputSchema: z.object({
    finalAnswer: z.string(),
  }),
  execute: async ({ context, mastra }) => {
    console.log("---------------------------");
    const ragAgent = mastra?.agents?.ragAgent;
    const conclusions = context?.machineContext?.getStepPayload<{
      conclusions: string;
    }>("drawConclusions")?.conclusions;
    const answerPrompt = `
        Based on the conclusions: ${conclusions}
        Format your response as:
        THOUGHT PROCESS:
        - Step 1: [Initial analysis of retrieved chunks]
        - Step 2: [Connections between chunks]
        - Step 3: [Reasoning based on chunks]
 
        FINAL ANSWER:
        [Your concise answer based on the retrieved context]`;
 
    const finalAnswer = await ragAgent?.generate(answerPrompt);
    console.log(finalAnswer?.text);
    return {
      finalAnswer: finalAnswer?.text ?? "",
    };
  },
});

Workflow Configuration

Connect all the steps in the workflow:

src/mastra/index.ts
ragWorkflow
  .step(analyzeContext)
  .then(breakdownThoughts)
  .then(connectPieces)
  .then(drawConclusions)
  .then(finalAnswer);
 
ragWorkflow.commit();
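The chain-of-thought structure comes from each step reading the previous step's payload and folding it into the next prompt. Stripped of Mastra and the agent, the essential pattern is a sequential pipeline of async transforms — the sketch below is illustrative only (`runChain` is a hypothetical helper, not a Mastra API):

```typescript
// Illustrative sketch: each step receives the accumulated text of the
// step before it, mirroring how the workflow steps above pass
// getStepPayload results into the next prompt.
type StepFn = (input: string) => Promise<string>;

async function runChain(initial: string, steps: StepFn[]): Promise<string[]> {
  const outputs: string[] = [];
  let current = initial;
  for (const step of steps) {
    current = await step(current); // output of one step feeds the next
    outputs.push(current);
  }
  return outputs;
}
```

In the real workflow, each `StepFn` would be an agent call (analyze, break down, connect, conclude, answer), and `.then()` chaining guarantees the ordering.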

Instantiate PgVector and Mastra

Instantiate PgVector and Mastra with all components:

src/mastra/index.ts
const pgVector = new PgVector(process.env.POSTGRES_CONNECTION_STRING!);
 
export const mastra = new Mastra({
  agents: { ragAgent },
  vectors: { pgVector },
  workflows: { ragWorkflow },
});

Document Processing

Process and chunk the document:

src/mastra/index.ts
const doc = MDocument.fromText(`Your document text here...`);
 
const chunks = await doc.chunk({
  strategy: "recursive",
  size: 512,
  overlap: 50,
  separator: "\n",
});
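To see how `size` and `overlap` interact, here is a deliberately simplified sliding-window chunker. It is illustrative only — Mastra's `recursive` strategy also splits on separators and respects structure — but the window mechanics are the same idea: each chunk re-reads the last `overlap` characters of the previous one so context isn't cut mid-thought.

```typescript
// Illustrative sketch: fixed-size chunking with overlap. Not Mastra's
// recursive strategy, just the core window arithmetic.
function chunkText(text: string, size: number, overlap: number): string[] {
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last window reached the end
    start += size - overlap; // step forward, re-reading `overlap` chars
  }
  return chunks;
}
```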

Embedding Creation and Storage

Generate and store embeddings:

src/mastra/index.ts
const { embeddings } = (await embed(chunks, {
  provider: "OPEN_AI",
  model: "text-embedding-ada-002",
  maxRetries: 3,
})) as EmbedManyResult<string>;
 
const vectorStore = mastra.getVector("pgVector");
await vectorStore.createIndex("embeddings", 1536);
await vectorStore.upsert(
  "embeddings",
  embeddings,
  chunks.map((chunk: any) => ({ text: chunk.text })),
);
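Note that `createIndex("embeddings", 1536)` fixes the index dimension to 1536 because that is the embedding size `text-embedding-ada-002` produces; every vector upserted must match it, and each vector travels with its metadata (here, the chunk text). A minimal in-memory analogue — illustrative only, `InMemoryIndex` is hypothetical and not Mastra's API — makes that shape explicit:

```typescript
// Illustrative sketch: an index stores parallel (vector, metadata)
// records and rejects vectors whose dimension doesn't match.
type StoredVector = { values: number[]; metadata: { text: string } };

class InMemoryIndex {
  private records: StoredVector[] = [];

  constructor(public readonly dimension: number) {}

  upsert(vectors: number[][], metadata: { text: string }[]): void {
    vectors.forEach((values, i) => {
      if (values.length !== this.dimension) {
        throw new Error("dimension mismatch");
      }
      this.records.push({ values, metadata: metadata[i] });
    });
  }

  get size(): number {
    return this.records.length;
  }
}
```

Storing the chunk text as metadata is what lets the vector query tool hand readable context back to the agent, rather than raw vectors.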

Response Generation

Function to generate responses using the workflow:

src/mastra/index.ts
async function generateResponse(query: string) {
  const prompt = `
    Please answer the following question:
    ${query}
 
    Please base your answer only on the context provided in the tool. If the context doesn't contain enough information to fully answer the question, please state that explicitly.
    `;
 
  const { runId, start } = ragWorkflow.createRun();
 
  const workflowResult = await start({
    triggerData: {
      query: prompt,
    },
  });
 
  return workflowResult;
}

Example Usage

src/mastra/index.ts
const query = "What are the main benefits of telemedicine?";
console.log("\nQuery:", query);
const result = await generateResponse(query);
console.log("\nThought Process:");
console.log(result.results);




