When we talk to teams that are getting started with Mastra and planning meaningful projects, they often ask about our preferred deployment models for their agents and observability (Mastra Server and Mastra Studio, respectively).
The answer depends on quite a few factors, so we wrote this guide to walk through how different companies are doing it and help you find the right architecture for you.
Background
Mastra Server is your agentic runtime. It exposes your agents, workflows, and tools as API endpoints, handling request routing, middleware, authentication, request context, and streaming responses.
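Because the Server exposes agents as API endpoints, any client that can speak HTTP can call them. Here is a minimal sketch of building such a request. The route shape (`/api/agents/<agentId>/generate`), the request body, and the helper name `buildAgentRequest` are assumptions for illustration, so check them against your own server's API reference.

```typescript
// Hypothetical helper: builds (but does not send) an HTTP request for an agent endpoint.
interface AgentRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

function buildAgentRequest(
  baseUrl: string,
  agentId: string,
  prompt: string,
  apiKey?: string,
): AgentRequest {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  // Only attach credentials when the server is configured to require them.
  if (apiKey) headers["Authorization"] = `Bearer ${apiKey}`;

  // Assumed route shape; trailing slashes on the base URL are trimmed first.
  const url = `${baseUrl.replace(/\/+$/, "")}/api/agents/${encodeURIComponent(agentId)}/generate`;

  return {
    url,
    init: {
      method: "POST",
      headers,
      // Assumed body shape: a list of chat messages.
      body: JSON.stringify({ messages: [{ role: "user", content: prompt }] }),
    },
  };
}
```

The result can be passed straight to `fetch(url, init)`. The same helper works against a Mastra-hosted or self-hosted Server, since only `baseUrl` changes.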
Mastra Studio is your agentic observability. Devs can chat with agents, visualize workflow runs, inspect traces, and tweak settings. It runs locally during development and can also be deployed to production.
There are three hosting options to consider:
- Fully managed runtime + observability
- Self-hosted runtime with managed observability
- Self-hosted runtime + observability
1. Fully managed
Your project is deployed to Mastra's platform, which handles the runtime, and Mastra hosts all observability. We can also host your production database.
A small team looking to build a multi-agent system wanted to remain serverless, so we helped them build a fully managed system. Their Mastra-hosted server uses a workflow orchestrator to direct tasks to agents, while the hosted Studio allows the team to use the observability tooling.
To build a user-facing assistant, a company used Mastra to run their agents and tool calls in our hosted service. The API endpoint connects with their Vite + React frontend and integrates with their Elastic instance for data retrieval, while they use the Studio for quick development iterations.
Hosting with Mastra helped these teams move fast without worrying about scaling and maintenance.
2. Self-hosted server, managed studio
With this architecture, Server runs in your own infra while Studio is hosted by Mastra, and the two connect through API calls to the Server. This is the most common pattern, especially for companies that already have hosting infra but want to add a separate observability layer.
We work with a web data analysis company that deploys its agentic runtime on Google Cloud Run, using a BFF (backend-for-frontend) tier for both its agents and its web app. Their agents send traces to the cloud-hosted Studio, and the team interacts with production agents there.
Another team helps service companies launch WhatsApp chat agents, which run in Docker containers within an AWS EKS cluster. They love Mastra's observability features but don't want to manage the observability infra as they scale up their agents.
In this setup, you handle the runtime and we take care of the observability. It's a good fit for many companies building agents on top of their existing products.
3. Self-hosted runtime + observability
With this option, you run both Studio and Server entirely in your own infra, so nothing leaves your environment.
A healthcare team we helped needed all hosting in their own infra for HIPAA compliance. Their Mastra agents and workflows are self-hosted on Kubernetes, with the workflows running parallel supervisor agents and routing tasks. Studio sits in the same infrastructure, alongside a self-hosted Next.js app, so data doesn't leave their private cloud.
A local government agency is building a suite of internal agents on an EKS cluster within the agency's AWS GovCloud. They use the built-in RBAC and auth, and the internal team interacts with the agents through the observability layer. The architecture helps them keep all traces, source code, and traffic within their own infra.
Customers with regulatory constraints in particular have found Mastra's offerings easy to integrate into their existing stack.
The right model for you
The specific implementation strategy for your company depends on factors unique to your team. If you're interested in Mastra Studio and Server, reach out and we'd love to help you figure out what makes the most sense for you.
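As a rough first pass, the patterns in this guide can be sketched as a small decision helper. This is an illustration, not an official recommendation engine. The type names, the two input flags, and `suggestHostingModel` are all hypothetical, and a real decision involves more factors than these two.

```typescript
// Hypothetical names throughout; the three models mirror the sections above.
type HostingModel =
  | "fully-managed"
  | "self-hosted-server-managed-studio"
  | "fully-self-hosted";

interface HostingNeeds {
  // True when traces and traffic must stay in your environment (e.g. HIPAA, GovCloud).
  dataMustStayInOwnInfra: boolean;
  // True when you already operate hosting infra such as Cloud Run or EKS.
  hasExistingHostingInfra: boolean;
}

function suggestHostingModel(needs: HostingNeeds): HostingModel {
  // Regulatory constraints take priority: keep both runtime and observability in-house.
  if (needs.dataMustStayInOwnInfra) return "fully-self-hosted";
  // Existing infra but no strict data constraint: run Server yourself, use managed Studio.
  if (needs.hasExistingHostingInfra) return "self-hosted-server-managed-studio";
  // Otherwise, let the platform handle both.
  return "fully-managed";
}
```

For example, a small serverless team with no compliance constraints would pass `false` for both flags and land on the fully managed model.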

