
PDF-to-Blog: Giving documents a second life with Mastra and Mistral OCR

Content teams often struggle to repurpose information locked in PDF documents. To solve this problem, we built a system that automatically converts PDFs into well-structured blog posts using Mastra and Mistral OCR.

What is Mistral OCR?

Mistral OCR is Mistral AI's optical character recognition (OCR) API for extracting text from documents like PDFs. Unlike traditional OCR tools, Mistral OCR is particularly effective with complex layouts, tables, and mixed-format documents, making it a good fit for technical documentation.

The Technical Challenge

Our task was to create a bridge between Mastra and Mistral OCR that would:

  1. Extract high-quality text from PDFs using Mistral OCR
  2. Process that text with Mastra to generate readable blog content
  3. Handle everything automatically through a reusable workflow

Our solution

Starting with an API route

First, we create an API route that accepts the PDF file and triggers a workflow.

The route is registered on our Mastra instance:

apiRoutes: [
  registerApiRoute("/mastra/upload-pdf", {
    method: "POST",
    handler: uploadPdfHandler,
  }),
],

And is defined as:

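export const uploadPdfHandler = async (c: Context) => {
  try {
    const formData = await c.req.formData();
    const pdfFile = formData.get('pdf');

    if (!pdfFile || !(pdfFile instanceof File)) {
      return c.json({ error: 'No PDF file uploaded. Use key "pdf" in form-data.' }, 400);
    }

    // Convert the uploaded file into a Node.js Buffer for the workflow
    const arrayBuffer = await pdfFile.arrayBuffer();
    const buffer = Buffer.from(arrayBuffer);

    // Start a workflow run with the PDF buffer as trigger data
    const { start } = pdfToBlogWorkflow.createRun();
    const result = await start({ triggerData: { pdfFile: buffer } });

    // Return the workflow result (the response shape here is an assumption)
    return c.json({ success: true, result });
  } catch (error) {
    // Report unexpected failures instead of leaving the request hanging
    const message = error instanceof Error ? error.message : 'Unknown error';
    return c.json({ error: message }, 500);
  }
};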
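
To try the route, a client has to send the PDF as multipart form data under the key "pdf", since that is the key the handler reads. The sketch below shows one way to do that with the standard FormData and fetch APIs. The helper names and the baseUrl parameter are hypothetical, and the address your Mastra server listens on depends on your setup.

```typescript
// Hypothetical helper: wraps a PDF blob in multipart form data under the "pdf" key,
// which is the key the upload handler looks for.
export function buildPdfForm(pdf: Blob, fileName: string): FormData {
  const form = new FormData();
  // Passing a file name makes the entry arrive as a File on the server side
  form.append('pdf', pdf, fileName);
  return form;
}

// Hypothetical helper: posts the form to the upload route.
// baseUrl is an assumption (for example, a locally running Mastra server).
export async function uploadPdf(baseUrl: string, pdf: Blob, fileName: string): Promise<unknown> {
  const response = await fetch(`${baseUrl}/mastra/upload-pdf`, {
    method: 'POST',
    body: buildPdfForm(pdf, fileName),
  });
  if (!response.ok) {
    throw new Error(`Upload failed with status ${response.status}`);
  }
  return response.json();
}
```

Leaving the Content-Type header unset is deliberate: fetch fills in the multipart boundary itself when the body is a FormData object.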

Connecting Mistral OCR and Mastra with tools

Before we can define a complete workflow, we need to create the relevant tool(s) and agent(s).

We use Mastra’s createTool functionality to create a tool that extracts text from PDF files using Mistral OCR:

import { createTool } from '@mastra/core/tools';
import { z } from 'zod';
import { Mistral } from '@mistralai/mistralai';

const client = new Mistral({ apiKey: process.env.MISTRAL_API_KEY });

// Define the OCR response interface
interface OCRResponse {
  pages: {
    markdown: string;
    metadata?: any;
  }[];
}

// Create the OCR tool using Mastra's createTool pattern
export const mistralOCRTool = createTool({
  id: 'mistral-ocr',
  description: 'Extract text from PDF files using Mistral OCR',
  inputSchema: z.object({
    pdfBuffer: z.instanceof(Buffer).describe('The PDF file buffer to process'),
  }),
  outputSchema: z.object({
    extractedText: z.string().describe('The text extracted from the PDF'),
    pagesCount: z.number().describe('Number of pages processed'),
  }),
  execute: async ({ context }) => {
    return await processOCR(context.pdfBuffer);
  },
});

The processOCR function called in execute is a separate helper that sends the PDF buffer to Mistral OCR and returns the extracted text and page count.
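
As a rough illustration of the last half of that helper, the sketch below turns a page-by-page OCR response into the shape the tool's outputSchema expects. It assumes the response matches the OCRResponse interface above (one markdown string per page). The name toToolOutput is hypothetical, and the call to the Mistral API itself is left out.

```typescript
// Same shape as the OCRResponse interface above, repeated so the sketch stands alone
interface OCRPage {
  markdown: string;
}

interface OCRResult {
  pages: OCRPage[];
}

// Hypothetical helper: joins per-page markdown into one string and counts the pages.
export function toToolOutput(response: OCRResult): { extractedText: string; pagesCount: number } {
  const extractedText = response.pages
    .map((page) => page.markdown.trim())
    // Skip pages where OCR found no text so they don't add empty gaps
    .filter((markdown) => markdown.length > 0)
    // Separate pages with a blank line to keep markdown blocks from merging
    .join('\n\n');

  return { extractedText, pagesCount: response.pages.length };
}
```

Empty pages still count toward pagesCount here, so the number reflects how many pages were processed rather than how many contained text.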

Creating a blogpost generator agent

Then we define a blogpost generator agent:

import { Agent } from '@mastra/core/agent';
import { mistral } from '@ai-sdk/mistral';
import { mistralOCRTool } from '../tools/mistralOCR';

export const blogPostAgent = new Agent({
  name: 'Blog Post Generator Pro',
  instructions: `
You're writing a concise technical post for fellow developers. Aim for a natural, conversational tone as if you're explaining something to a colleague during a coffee break.

**🎯 TITLE**
Create a clear, specific title that tells readers exactly what to expect.

═══════════════════════════

**📝 INTRODUCTION**
Write a brief, direct introduction that explains what this post covers and why it matters.

═══════════════════════════

**🔍 MAIN CONTENT**

**➀ Section 1: Core Concept**
- Use everyday language, not marketing speak
- **Bold** important terms and *italicize* for emphasis
- Include concrete examples with code blocks when relevant:
  \`\`\`javascript
  // Example code here with syntax highlighting
  \`\`\`

**➀ Section 2: Practical Implementation**
- Share insights as if from personal experience ("I've found that...")
- Break down processes with numbered steps when appropriate
- Add helpful tips in boxed format:
  ═══════════════════════════
  PRO TIP: Short, actionable advice here
  ═══════════════════════════

**➀ Section 3: Key Takeaways** (optional)
- Compare approaches using tables if relevant:
  | Approach | Advantage | Best Use Case |
  |----------|-----------|---------------|
  | Option A | Speed     | Simple tasks  |
  | Option B | Accuracy  | Complex data  |

═══════════════════════════

**✨ CONCLUSION**
Briefly summarize the key takeaway and possibly pose a thoughtful question.

Avoid:
- Buzzwords and clichés like "revolutionary," "game-changing," or "in today's fast-paced world"
- Long, complex sentences
- Obvious transitions like "firstly," "secondly," or "in conclusion"
- Making it obvious the content is AI-generated
- Marketing-speak or overly formal academic language

Guidelines:
1. Use **bold** for headers/subsections and *italics* for technical terms
2. Maintain 1-3 sentence paragraphs for readability
3. Blend professional tone with conversational elements
4. Preserve code blocks with syntax highlighting
5. Use boxed text for important warnings/tips
6. Include practical examples for every concept
7. Ensure SEO optimization through *strategic keyword placement*

The final blog post should sound like it was written by a real developer sharing practical knowledge from experience - natural, helpful, and concise (600-900 words total).
  `,
  model: mistral('mistral-large-latest'),
  tools: { mistralOCRTool },
});

We’re careful to give it a relevant name and write highly detailed, specific instructions about what it should produce. Remember: you can almost never be too detailed when it comes to writing system prompts.

Putting it all together: PDF-to-blog Workflow

Mastra's workflow system provides a structured way to handle multi-step processes. This is where all our tools and agents come together.

Here's how we implemented our PDF-to-blog workflow:

export const pdfToBlogWorkflow = new Workflow({
  name: 'pdf-to-blog',
  triggerSchema: pdfInputSchema,
})
  // First step: Extract text from PDF
  .step(extractTextStep)
  
  // Second step: Generate blog post (runs if text extraction succeeds)
  .then(generateBlogPostStep)
  
  // Third step: Fallback blog post (runs if primary generation fails/returns success: false)
  .then(fallbackBlogPostStep, {
    when: async ({ context }) => {
      const primaryStep = context.steps['generate-blog-post'];
      const shouldRun = !primaryStep || primaryStep.status !== 'success' || 
                        !primaryStep.output.success;
      return shouldRun;
    },
  })
  
  // Fourth step: Final fallback (runs if fallback generation fails/returns success: false)
  .then(finalFallbackStep, {
    when: async ({ context }) => {
      const fallbackStep = context.steps['fallback-blog-post'];
      
      let shouldRun: boolean;
      if (fallbackStep?.status === 'success') {
        shouldRun = !fallbackStep.output.success;
      } else {
        // If fallback step didn't run or failed, we should run the final fallback
        shouldRun = true;
      }
      return shouldRun;
    },
  });

This workflow takes a PDF input and processes it through a series of defined steps. It extracts text from the PDF, generates a blog post, and has two fallback steps that run when the previous generation attempt fails or returns success: false, for example because the PDF is too long.
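
Both when conditions in the workflow ask the same question about the preceding step: did it not run, did it fail, or did it report success: false? A minimal sketch of that check as a single predicate is below. The name needsFallback and the StepResult type are hypothetical and only model the fields the conditions read.

```typescript
// Minimal shape of a step result, limited to the fields the `when` conditions read
interface StepResult {
  status: string;
  output?: { success?: boolean };
}

// Hypothetical helper: true when the previous attempt did not run,
// did not finish with status 'success', or finished but reported success: false.
export function needsFallback(previous: StepResult | undefined): boolean {
  if (!previous || previous.status !== 'success') {
    return true;
  }
  // A successful step with no output is treated as needing a fallback
  return !previous.output?.success;
}
```

With a helper like this, each condition could reduce to something like needsFallback(context.steps['generate-blog-post']), which keeps the two fallback checks consistent with each other.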

Try the demo yourself

Frontend site source code

Mastra source code

We hope the PDF-to-blog converter will help anyone looking to repurpose PDF content. If you decide to build on our example, we’d love to see what you make.
