# ExtractParams ExtractParams configures metadata extraction from document chunks using LLM analysis. ## Example ```typescript import { MDocument } from "@mastra/rag"; const doc = MDocument.fromText(text); const chunks = await doc.chunk({ extract: { title: true, // Extract titles using default settings summary: true, // Generate summaries using default settings keywords: true, // Extract keywords using default settings }, }); // Example output: // chunks[0].metadata = { // documentTitle: "AI Systems Overview", // sectionSummary: "Overview of artificial intelligence concepts and applications", // excerptKeywords: "KEYWORDS: AI, machine learning, algorithms" // } ``` ## Parameters The `extract` parameter accepts the following fields: ### title?: boolean | TitleExtractorsArgs Enable title extraction. Set to true for default settings, or provide custom configuration. ### summary?: boolean | SummaryExtractArgs Enable summary extraction. Set to true for default settings, or provide custom configuration. ### questions?: boolean | QuestionAnswerExtractArgs Enable question generation. Set to true for default settings, or provide custom configuration. ### keywords?: boolean | KeywordExtractArgs Enable keyword extraction. Set to true for default settings, or provide custom configuration. ### schema?: SchemaExtractArgs Enable structured metadata extraction using a Zod schema. ## Extractor Arguments ### TitleExtractorsArgs ### llm?: MastraLanguageModel AI SDK language model to use for title extraction ### nodes?: number Number of title nodes to extract ### nodeTemplate?: string Custom prompt template for title node extraction. Must include {context} placeholder ### combineTemplate?: string Custom prompt template for combining titles. Must include {context} placeholder ### SummaryExtractArgs ### llm?: MastraLanguageModel AI SDK language model to use for summary extraction ### summaries?: ('self' | 'prev' | 'next')\[] List of summary types to generate. Can only include 'self' (current chunk), 'prev' (previous chunk), or 'next' (next chunk) ### promptTemplate?: string Custom prompt template for summary generation. Must include {context} placeholder ### QuestionAnswerExtractArgs ### llm?: MastraLanguageModel AI SDK language model to use for question generation ### questions?: number Number of questions to generate ### promptTemplate?: string Custom prompt template for question generation. Must include both {context} and {numQuestions} placeholders ### embeddingOnly?: boolean If true, only generate embeddings without actual questions ### KeywordExtractArgs ### llm?: MastraLanguageModel AI SDK language model to use for keyword extraction ### keywords?: number Number of keywords to extract ### promptTemplate?: string Custom prompt template for keyword extraction. Must include both {context} and {maxKeywords} placeholders ### SchemaExtractArgs ### schema: ZodType Zod schema defining the structure of the data to extract. ### llm?: MastraLanguageModel AI SDK language model to use for extraction. ### instructions?: string Instructions for the LLM on what to extract. ### metadataKey?: string Key to nest extraction results under. If omitted, results are spread into the metadata object. ## Advanced Example ```typescript import { MDocument } from "@mastra/rag"; const doc = MDocument.fromText(text); const chunks = await doc.chunk({ extract: { // Title extraction with custom settings title: { nodes: 2, // Extract 2 title nodes nodeTemplate: "Generate a title for this: {context}", combineTemplate: "Combine these titles: {context}", }, // Summary extraction with custom settings summary: { summaries: ["self"], // Generate summaries for current chunk promptTemplate: "Summarize this: {context}", }, // Question generation with custom settings questions: { questions: 3, // Generate 3 questions promptTemplate: "Generate {numQuestions} questions about: {context}", embeddingOnly: false, }, // Keyword extraction with custom settings keywords: { keywords: 5, // Extract 5 keywords promptTemplate: "Extract {maxKeywords} key terms from: {context}", }, // Schema extraction with Zod schema: { schema: z.object({ productName: z.string(), category: z.enum(["electronics", "clothing"]), }), instructions: "Extract product information.", metadataKey: "product", }, }, }); // Example output: // chunks[0].metadata = { // documentTitle: "AI in Modern Computing", // sectionSummary: "Overview of AI concepts and their applications in computing", // questionsThisExcerptCanAnswer: "1. What is machine learning?\n2. How do neural networks work?", // excerptKeywords: "1. Machine learning\n2. Neural networks\n3. Training data", // product: { // productName: "Neural Net 2000", // category: "electronics" // } // } ``` ## Document Grouping for Title Extraction When using the `TitleExtractor`, you can group multiple chunks together for title extraction by specifying a shared `docId` in the `metadata` field of each chunk. All chunks with the same `docId` will receive the same extracted title. If no `docId` is set, each chunk is treated as its own document for title extraction. **Example:** ```ts import { MDocument } from "@mastra/rag"; const doc = new MDocument({ docs: [ { text: "chunk 1", metadata: { docId: "docA" } }, { text: "chunk 2", metadata: { docId: "docA" } }, { text: "chunk 3", metadata: { docId: "docB" } }, ], type: "text", }); await doc.extractMetadata({ title: true }); // The first two chunks will share a title, while the third chunk will be assigned a separate title. ```