MASTRA GATEWAY
AI Gateway with Observational Memory
Give Any Agent Human-Like Memory
Compress chat history into compact memories that help agents remember what matters. Lower token usage and latency without losing important context.

Memory That Never Compacts
Stop interrupting users and losing important details to compaction. Form dense observations in the background without blocking the conversation or throwing anything away.
Observational Memory
Run two background agents — an Observer and a Reflector — that watch your agent's conversations and maintain a dense observation log that replaces raw message history as it grows.
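The Observer/Reflector split above can be sketched as a small in-memory loop. Everything here is illustrative: the class name, the batch-size and log-length thresholds, and the `summarize`/`merge` callbacks (LLM calls in practice) are assumptions, not Mastra's actual API.

```typescript
type Message = { role: "user" | "assistant"; content: string };

class ObservationalMemory {
  observations: string[] = [];
  pending: Message[] = [];

  // Observer: once enough raw messages accumulate, fold them into one
  // dense observation and drop the raw copies from the working set.
  add(msg: Message, summarize: (batch: Message[]) => string, batchSize = 4) {
    this.pending.push(msg);
    if (this.pending.length >= batchSize) {
      this.observations.push(summarize(this.pending));
      this.pending = [];
    }
  }

  // Reflector: compacts the observation log itself when it grows too long.
  reflect(merge: (log: string[]) => string, maxObservations = 8) {
    if (this.observations.length > maxObservations) {
      this.observations = [merge(this.observations)];
    }
  }
}

// Usage: four raw messages fold into one observation; raw copies are dropped.
const mem = new ObservationalMemory();
const summarizeStub = (batch: Message[]) => `saw ${batch.length} messages`;
for (let i = 0; i < 4; i++) {
  mem.add({ role: "user", content: `msg ${i}` }, summarizeStub);
}
console.assert(mem.observations.length === 1 && mem.pending.length === 0);
```

The point of the split is that neither agent sits in the request path: the conversation continues while observations form in the background.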
5–40x Compression
Set when and how aggressively conversations compress. Configure observation and reflection thresholds to balance memory density against detail retention.
Prompt Caching
Append observations over time rather than rebuilding each turn. Keep the prompt prefix stable and cacheable — the longer the conversation, the more you save.
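The caching claim rests on the prompt being append-only: if earlier entries are never rewritten, each turn's prompt is a strict prefix of the next turn's, which is exactly what provider-side prefix caching can reuse. A minimal sketch, with `buildPrompt` as a hypothetical helper rather than Mastra's API:

```typescript
// Join in insertion order and never rewrite earlier entries; reordering,
// deduping, or editing old observations would invalidate the cached prefix.
function buildPrompt(system: string, observations: string[]): string {
  return [system, ...observations].join("\n");
}

const system = "You are a support agent.";
const turn1 = buildPrompt(system, ["User is on the Teams plan."]);
const turn2 = buildPrompt(system, [
  "User is on the Teams plan.",
  "User asked about retention.",
]);

// Each prompt extends the previous one, so the shared prefix stays cacheable.
console.assert(turn2.startsWith(turn1));
```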
Route All Traffic Through One Gateway
Use one endpoint for every model. Skip provider SDKs — one API key covers everything.
300+ Models
Access OpenAI, Anthropic, Google, Meta, Mistral, and hundreds more. Integrate once and reach every provider through a single gateway.

High Availability
Run on distributed infrastructure with automatic failover across providers.
Works with Any Stack
Change your base URL and start routing through the gateway. Use Python, TypeScript, any framework, any client.
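As a sketch of what "change your base URL" looks like, assuming the gateway exposes an OpenAI-compatible chat endpoint; the gateway URL, API key, and model id below are placeholders, not documented values:

```typescript
// Build a chat request against whatever base URL you point at; swapping a
// provider URL for the gateway URL is the only change to existing code.
function buildChatRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  userText: string,
) {
  return {
    url: `${baseUrl}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({
        model,
        messages: [{ role: "user", content: userText }],
      }),
    },
  };
}

// Point at the gateway instead of a provider; everything else is unchanged.
const req = buildChatRequest(
  "https://gateway.example.com/v1", // placeholder gateway URL
  "demo-key",                       // placeholder API key
  "anthropic/claude-sonnet",        // placeholder model id
  "Summarize this thread.",
);
// fetch(req.url, req.init) would send the request.
```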
Bring Your Own Keys
Plug in your own provider API keys for direct routing to OpenAI, Anthropic, Google, or any provider. Get the same memory features either way.
Transparent Pricing
Pay the same rate as the underlying provider with no markup on tokens. See exactly where your spend goes in the dashboard.
Platform Pricing
Starter
$0
Free for everyone
- 100K Memory Tokens
- $10/1M Add-on Tokens
- 250MB Retrieval Storage
- 15-Day Stale Thread Retention
- Initial $5 credit
Teams
$250
Per team/month
- 1M Memory Tokens
- $10/1M Add-on Tokens
- 1GB Retrieval Storage
- 6-Month Stale Thread Retention
- Bring your own key
Enterprise
Custom
- Everything in Teams plus:
- RBAC
- Support SLA
- Dedicated Support Engineer
- Custom Storage/Retention
Mastra is powering the best AI teams
Case Studies
How Replit Agent 3 creates thousands of Mastra agents every day
How SoftBank is restoring Japan's white-collar productivity using Mastra
How Sanity Built a Content Agent That Actually Understands Your CMS
How Marsh used Mastra to build Agentic Search for 100k employees