# ![Inference logo](https://models.dev/logos/inference.svg) Inference

Access 9 Inference models through Mastra's model router. Authentication is handled automatically using the `INFERENCE_API_KEY` environment variable.

Learn more in the [Inference documentation](https://inference.net/models).

```bash
INFERENCE_API_KEY=your-api-key
```

```typescript
import { Agent } from "@mastra/core/agent";

const agent = new Agent({
  id: "my-agent",
  name: "My Agent",
  instructions: "You are a helpful assistant",
  model: "inference/google/gemma-3"
});

// Generate a response
const response = await agent.generate("Hello!");

// Stream a response
const stream = await agent.stream("Tell me a story");
for await (const chunk of stream) {
  console.log(chunk);
}
```

> **Info:** Mastra uses the OpenAI-compatible `/chat/completions` endpoint. Some provider-specific features may not be available. Check the [Inference documentation](https://inference.net/models) for details.

## Models

| Model                                          | Context | Tools | Reasoning | Image | Audio | Video | Input $/1M | Output $/1M |
| ---------------------------------------------- | ------- | ----- | --------- | ----- | ----- | ----- | ---------- | ----------- |
| `inference/google/gemma-3`                     | 125K    |       |           |       |       |       | $0.15      | $0.30       |
| `inference/meta/llama-3.1-8b-instruct`         | 16K     |       |           |       |       |       | $0.03      | $0.03       |
| `inference/meta/llama-3.2-11b-vision-instruct` | 16K     |       |           |       |       |       | $0.06      | $0.06       |
| `inference/meta/llama-3.2-1b-instruct`         | 16K     |       |           |       |       |       | $0.01      | $0.01       |
| `inference/meta/llama-3.2-3b-instruct`         | 16K     |       |           |       |       |       | $0.02      | $0.02       |
| `inference/mistral/mistral-nemo-12b-instruct`  | 16K     |       |           |       |       |       | $0.04      | $0.10       |
| `inference/osmosis/osmosis-structure-0.6b`     | 4K      |       |           |       |       |       | $0.10      | $0.50       |
| `inference/qwen/qwen-2.5-7b-vision-instruct`   | 125K    |       |           |       |       |       | $0.20      | $0.20       |
| `inference/qwen/qwen3-embedding-4b`            | 32K     |       |           |       |       |       | $0.01      | —           |

## Advanced Configuration

### Custom Headers

```typescript
import { Agent } from "@mastra/core/agent";

const agent = new Agent({
  id: "custom-agent",
  name: "custom-agent",
  model: {
    url: "https://inference.net/v1",
    id: "inference/google/gemma-3",
    apiKey: process.env.INFERENCE_API_KEY,
    // Extra headers sent with every request to the provider
    headers: {
      "X-Custom-Header": "value"
    }
  }
});
```

### Dynamic Model Selection

```typescript
import { Agent } from "@mastra/core/agent";

const agent = new Agent({
  id: "dynamic-agent",
  name: "Dynamic Agent",
  model: ({ requestContext }) => {
    const useAdvanced = requestContext.task === "complex";
    // Both branches return chat models; an embedding model cannot generate responses.
    return useAdvanced
      ? "inference/qwen/qwen-2.5-7b-vision-instruct"
      : "inference/google/gemma-3";
  }
});
```
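
The selection logic in the callback above can also live in a plain function, which keeps it testable without an agent. The following is a minimal sketch, not part of Mastra: `pickModel` and `estimateCostUSD` are hypothetical helper names, the model list is a hand-copied subset of the Models table, and the context sizes assume that "16K" means roughly 16,000 tokens and "125K" roughly 125,000.

```typescript
// Subset of the Models table; context sizes are rounded assumptions
// (e.g. "16K" is treated as 16_000 tokens).
interface ModelInfo {
  id: string;
  contextTokens: number;
  inputPerMillion: number; // USD per 1M input tokens
  outputPerMillion: number; // USD per 1M output tokens
}

const CHAT_MODELS: ModelInfo[] = [
  { id: "inference/google/gemma-3", contextTokens: 125_000, inputPerMillion: 0.15, outputPerMillion: 0.3 },
  { id: "inference/meta/llama-3.1-8b-instruct", contextTokens: 16_000, inputPerMillion: 0.03, outputPerMillion: 0.03 },
  { id: "inference/meta/llama-3.2-1b-instruct", contextTokens: 16_000, inputPerMillion: 0.01, outputPerMillion: 0.01 },
];

// Hypothetical helper: the cheapest listed model (by input price)
// whose context window is large enough for the prompt.
function pickModel(promptTokens: number): string {
  const candidates = CHAT_MODELS
    .filter((m) => m.contextTokens >= promptTokens)
    .sort((a, b) => a.inputPerMillion - b.inputPerMillion);
  const best = candidates[0];
  if (!best) {
    throw new Error(`No listed model fits ${promptTokens} tokens`);
  }
  return best.id;
}

// Hypothetical helper: estimated cost of one request in USD,
// computed from the per-million-token prices.
function estimateCostUSD(modelId: string, inputTokens: number, outputTokens: number): number {
  const model = CHAT_MODELS.find((m) => m.id === modelId);
  if (!model) {
    throw new Error(`Unknown model: ${modelId}`);
  }
  return (inputTokens * model.inputPerMillion + outputTokens * model.outputPerMillion) / 1_000_000;
}
```

A `model` callback could return `pickModel(...)` with a token estimate taken from the request, assuming the application tracks one. The prices and context sizes are copied by hand here, so they need updating whenever the table changes.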