MASTRA GATEWAY

Memory and model routing for any AI agent

Give Any Agent Human-Like Memory

Compress chat history into compact memories that help agents remember what matters. Lower token usage and latency without losing important context.

Route All Traffic Through One Gateway

Use one endpoint for every model. Skip provider SDKs — one API key covers everything.

Start building with a free account, or explore our pricing for more

Full Pricing

Starter

Free for everyone

$0/ month
  • 100K observability events+ $10/100K
  • 24 CPU hours+ $0.35/hr
  • 15 days of data retention
  • Unlimited users, deployments, and projects

Teams

For growing teams

$250/ month
  • 1M observability events+ $8/100K
  • 250 CPU hours+ $0.25/hr
  • 6 months of data retention
  • Multiple teams, SSO, and SOC 2 docs

Enterprise

For teams at scale

Custom pricing

Custom volume and retention, with RBAC, audit logs, support and uptime SLAs, and a dedicated support engineer.

Mastra is powering the best AI teams

Case Studies

Frequently asked questions

What is Mastra Gateway?

Mastra Gateway is an AI gateway that adds observational memory and unified model routing to any agent. Use one endpoint for every model and one API key for every provider. Observational memory forms dense observations in the background without interrupting the conversation.

How does Mastra Gateway prevent agents from losing context?

Mastra Gateway prevents context loss by running two background agents that maintain a dense observation log without blocking the conversation or throwing anything away. An Observer watches conversations and compresses message history into concise notes about what happened. A Reflector condenses observations when they grow too long. Compression is typically 5-40x.

How does prompt caching work in Mastra Gateway?

Mastra Gateway appends observations over time rather than rebuilding the prompt each turn. Keeping the prompt prefix stable and cacheable means the longer the conversation runs, the more you save.

How does Mastra Gateway route traffic across model providers?

Mastra Gateway is the single integration point for every model provider your agents need to reach. One API key covers everything across more than 300 models including OpenAI, Anthropic, Google, Meta and Mistral, with no provider SDKs required. Change your base URL and start routing through the gateway using Python, TypeScript, any framework, any client. Plug in your own provider API keys for direct routing to any provider and get the same memory features either way. Run on distributed infrastructure with automatic failover across providers.

Does Mastra Gateway work with agents not built on Mastra?

Mastra Gateway works with any agent stack. Change your base URL and start routing through the gateway using Python, TypeScript, any framework and any client. Observational memory and model routing apply to any agent regardless of how it was built.

Start building today.