Content teams often struggle to repurpose information locked in PDF documents. To solve this problem, we built a system that automatically converts PDFs into well-structured blog posts using Mastra and Mistral OCR.
What is Mistral OCR?
Mistral OCR is an optical character recognition (OCR) service designed to extract text from documents like PDFs. Unlike traditional OCR tools, Mistral OCR is particularly effective with complex layouts, tables, and mixed-format documents, which makes it a good fit for technical documentation.
The Technical Challenge
Our task was to create a bridge between Mastra and Mistral OCR that would:
- Extract high-quality text from PDFs using Mistral OCR
- Process that text with Mastra to generate readable blog content
- Handle everything automatically through a reusable workflow
Our solution
Starting with an API route
First, we create an API route that accepts the PDF file and triggers a workflow.
The route is registered in our Mastra instance configuration:
apiRoutes: [
  registerApiRoute("/mastra/upload-pdf", {
    method: "POST",
    handler: uploadPdfHandler,
  }),
],
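The handler is defined as:
import type { Context } from 'hono';
// Assumed path: adjust to wherever pdfToBlogWorkflow lives in your project
import { pdfToBlogWorkflow } from '../workflows/pdfToBlog';

export const uploadPdfHandler = async (c: Context) => {
  try {
    const formData = await c.req.formData();
    const pdfFile = formData.get('pdf');

    if (!pdfFile || !(pdfFile instanceof File)) {
      return c.json({ error: 'No PDF file uploaded. Use key "pdf" in form-data.' }, 400);
    }

    // Convert the uploaded file into a Node.js Buffer for the workflow
    const arrayBuffer = await pdfFile.arrayBuffer();
    const buffer = Buffer.from(arrayBuffer);

    // Create a workflow run and start it with the PDF as trigger data
    const { start } = pdfToBlogWorkflow.createRun();
    const result = await start({ triggerData: { pdfFile: buffer } });

    // Assumption: the raw workflow result is returned to the caller as JSON
    return c.json(result);
  } catch (error) {
    const message = error instanceof Error ? error.message : 'Unknown error';
    return c.json({ error: `Failed to process PDF: ${message}` }, 500);
  }
};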
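The handler only checks that a file arrived under the pdf key. One inexpensive extra guard, which is not part of the original handler, is to confirm that the bytes begin with the PDF signature before starting the workflow. The following is a minimal sketch; the helper name looksLikePdf is hypothetical.

```typescript
// Hypothetical helper (not part of the original handler): checks for the
// "%PDF-" signature that a well-formed PDF file starts with.
const PDF_SIGNATURE = [0x25, 0x50, 0x44, 0x46, 0x2d]; // "%PDF-"

function looksLikePdf(bytes: Uint8Array): boolean {
  // Too short to contain the signature at all
  if (bytes.length < PDF_SIGNATURE.length) {
    return false;
  }
  // Compare the leading bytes against the signature one by one
  return PDF_SIGNATURE.every((byte, index) => bytes[index] === byte);
}
```

Because a Node.js Buffer is a Uint8Array, the handler could call looksLikePdf(buffer) right after creating the buffer and return a 400 response when the check fails.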
Connecting Mistral OCR and Mastra with tools
Before we can define a complete workflow, we need to create the relevant tool(s) and agent(s).
We use Mastra's createTool functionality to create a tool that extracts text from PDF files using Mistral OCR:
import { createTool } from '@mastra/core/tools';
import { z } from 'zod';
import { Mistral } from '@mistralai/mistralai';

const client = new Mistral({ apiKey: process.env.MISTRAL_API_KEY });

// Define the OCR response interface
interface OCRResponse {
  pages: {
    markdown: string;
    metadata?: any;
  }[];
}

// Create the OCR tool using Mastra's createTool pattern
export const mistralOCRTool = createTool({
  id: 'mistral-ocr',
  description: 'Extract text from PDF files using Mistral OCR',
  inputSchema: z.object({
    pdfBuffer: z.instanceof(Buffer).describe('The PDF file buffer to process'),
  }),
  outputSchema: z.object({
    extractedText: z.string().describe('The text extracted from the PDF'),
    pagesCount: z.number().describe('Number of pages processed'),
  }),
  execute: async ({ context }) => {
    return await processOCR(context.pdfBuffer);
  },
});
We also define a processOCR function, called from the tool's execute method, that performs the actual OCR extraction.
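The body of that OCR function is not shown here. As a rough sketch, its final stage might look like the following, assuming the OCR response matches the OCRResponse interface from the tool definition (a list of pages, each carrying a markdown string). The helper name combineOcrPages is hypothetical, and the call to the Mistral API is deliberately left out.

```typescript
// Mirrors the OCRResponse interface from the tool definition.
interface OCRPage {
  markdown: string;
  metadata?: unknown;
}

interface OCRResponse {
  pages: OCRPage[];
}

// Matches the tool's outputSchema: extractedText and pagesCount.
interface OCRResult {
  extractedText: string;
  pagesCount: number;
}

// Hypothetical helper: joins the per-page markdown into a single string,
// skipping pages that contain only whitespace.
function combineOcrPages(response: OCRResponse): OCRResult {
  const extractedText = response.pages
    .map((page) => page.markdown.trim())
    .filter((text) => text.length > 0)
    .join('\n\n');

  // pagesCount reports every page processed, including empty ones
  return { extractedText, pagesCount: response.pages.length };
}
```

Keeping this step separate from the API call makes it easy to check the text assembly without a Mistral API key.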
Creating a blog post generator agent
Next, we define a blog post generator agent:
import { Agent } from '@mastra/core/agent';
import { mistral } from '@ai-sdk/mistral';
import { mistralOCRTool } from '../tools/mistralOCR';

export const blogPostAgent = new Agent({
  name: 'Blog Post Generator Pro',
  instructions: `
You're writing a concise technical post for fellow developers. Aim for a natural, conversational tone as if you're explaining something to a colleague during a coffee break.

**🎯 TITLE**
Create a clear, specific title that tells readers exactly what to expect.

━━━━━━━━━━━━━━━━━━━━━━━━━━━

**INTRODUCTION**
Write a brief, direct introduction that explains what this post covers and why it matters.

━━━━━━━━━━━━━━━━━━━━━━━━━━━

**MAIN CONTENT**

**➤ Section 1: Core Concept**
- Use everyday language, not marketing speak
- **Bold** important terms and *italicize* for emphasis
- Include concrete examples with code blocks when relevant:
  \`\`\`javascript
  // Example code here with syntax highlighting
  \`\`\`

**➤ Section 2: Practical Implementation**
- Share insights as if from personal experience ("I've found that...")
- Break down processes with numbered steps when appropriate
- Add helpful tips in boxed format:
  ━━━━━━━━━━━━━━━━━━━━━━━━━━━
  PRO TIP: Short, actionable advice here
  ━━━━━━━━━━━━━━━━━━━━━━━━━━━

**➤ Section 3: Key Takeaways** (optional)
- Compare approaches using tables if relevant:
  | Approach | Advantage | Best Use Case |
  |----------|-----------|---------------|
  | Option A | Speed | Simple tasks |
  | Option B | Accuracy | Complex data |

━━━━━━━━━━━━━━━━━━━━━━━━━━━

**✨ CONCLUSION**
Briefly summarize the key takeaway and possibly pose a thoughtful question.

Avoid:
- Buzzwords and clichés like "revolutionary," "game-changing," or "in today's fast-paced world"
- Long, complex sentences
- Obvious transitions like "firstly," "secondly," or "in conclusion"
- Making it obvious the content is AI-generated
- Marketing-speak or overly formal academic language

Guidelines:
1. Use **bold** for headers/subsections and *italics* for technical terms
2. Maintain 1-3 sentence paragraphs for readability
3. Blend professional tone with conversational elements
4. Preserve code blocks with syntax highlighting
5. Use boxed text for important warnings/tips
6. Include practical examples for every concept
7. Ensure SEO optimization through *strategic keyword placement*

The final blog post should sound like it was written by a real developer sharing practical knowledge from experience - natural, helpful, and concise (600-900 words total).
  `,
  model: mistral('mistral-large-latest'),
  tools: { mistralOCRTool },
});
We're careful to give it a relevant name and write highly detailed, specific instructions about what it should produce. Remember: you can almost never be too detailed when it comes to writing system prompts.
Putting it all together: PDF-to-blog Workflow
Mastra's workflow system provides a structured way to handle multi-step processes. This is where our tools and agents come together.
Here's how we implemented our PDF-to-blog workflow:
export const pdfToBlogWorkflow = new Workflow({
  name: 'pdf-to-blog',
  triggerSchema: pdfInputSchema,
})
  // First step: Extract text from PDF
  .step(extractTextStep)

  // Second step: Generate blog post (runs if text extraction succeeds)
  .then(generateBlogPostStep)

  // Third step: Fallback blog post (runs if primary generation fails/returns success: false)
  .then(fallbackBlogPostStep, {
    when: async ({ context }) => {
      const primaryStep = context.steps['generate-blog-post'];
      const shouldRun = !primaryStep || primaryStep.status !== 'success' ||
        !primaryStep.output.success;
      return shouldRun;
    },
  })

  // Fourth step: Final fallback (runs if fallback generation fails/returns success: false)
  .then(finalFallbackStep, {
    when: async ({ context }) => {
      const fallbackStep = context.steps['fallback-blog-post'];

      let shouldRun: boolean;
      if (fallbackStep?.status === 'success') {
        shouldRun = !fallbackStep.output.success;
      } else {
        // If fallback step didn't run or failed, we should run the final fallback
        shouldRun = true;
      }
      return shouldRun;
    },
  });
This workflow takes a PDF as input and processes it through a series of defined steps. It extracts text from the PDF, generates a blog post, and falls back to two alternative generation steps if the previous step fails or returns success: false, for example when the PDF is too long.
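Both when conditions in the workflow ask the same question: did the previous step finish with status 'success' and an output whose success flag is true? Here is a small sketch of that check as a standalone function, assuming each step result exposes a status and, on success, an output.success boolean as in the workflow code. The StepResult type and the needsFallback name are hypothetical and are not Mastra exports.

```typescript
// Hypothetical, simplified shape of a step result as read in the `when` callbacks.
interface StepResult {
  status: string;
  output?: { success: boolean };
}

// Returns true when the next fallback step should run.
function needsFallback(step: StepResult | undefined): boolean {
  // The step never ran, or it ended in a non-success status
  if (!step || step.status !== 'success') {
    return true;
  }
  // The step completed but reported that it could not produce a post
  return !step.output?.success;
}
```

With a helper like this, the first condition could be written as needsFallback(context.steps['generate-blog-post']) and the second as needsFallback(context.steps['fallback-blog-post']).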
We hope the PDF-to-blog converter will help anyone looking to repurpose PDF content. If you decide to build on our example, we'd love to see what you make.