Question 1

What is the Gateway?

Accepted Answer

Mastra's Gateway adds observational memory and unified model routing to any agent. Use one endpoint for every model and one API key for every provider. Observational memory forms dense observations in the background without interrupting the conversation.

Question 2

How does the Gateway prevent agents from losing context?

Accepted Answer

The Gateway prevents context loss by running two background agents that maintain a dense observation log without blocking the conversation or throwing anything away. An Observer watches conversations and compresses message history into concise notes about what happened. A Reflector condenses observations when they grow too long. Compression is typically 5-40x.

Question 3

How does prompt caching work in the Gateway?

Accepted Answer

The Gateway appends observations over time rather than rebuilding the prompt each turn. Keeping the prompt prefix stable and cacheable means the longer the conversation runs, the more you save.

Question 4

How does the Gateway route traffic across model providers?

Accepted Answer

The Gateway is the single integration point for every model provider your agents need to reach. One API key covers everything across more than 300 models including OpenAI, Anthropic, Google, Meta and Mistral, with no provider SDKs required. Change your base URL and start routing through the gateway using Python, TypeScript, any framework, any client. Plug in your own provider API keys for direct routing to any provider and get the same memory features either way. Run on distributed infrastructure with automatic failover across providers.

Question 5

Does the Gateway work with agents not built on Mastra?

Accepted Answer

The Gateway works with any agent stack. Change your base URL and start routing through Mastra's open-source Gateway using Python, TypeScript, any framework and any client. Observational memory and model routing apply to any agent regardless of how it was built.

Gateway - Memory and model routing for any AI agent

Give any agent human-like memory

Memory That Never Compacts

Observational Memory

5–40x Compression

Prompt Caching

Route all traffic through one gateway

300+ Models

High Availability

Works with Any Stack

Bring Your Own Keys

Transparent Pricing

Starter

Teams

Enterprise

Mastra is powering the best AI teams

Frequently asked questions

What is the Gateway?

How does the Gateway prevent agents from losing context?

How does prompt caching work in the Gateway?

How does the Gateway route traffic across model providers?

Does the Gateway work with agents not built on Mastra?