AgentBrowser

The @mastra/agent-browser package provides browser automation using Playwright with accessibility-first element targeting. Elements are identified by refs from the page's accessibility tree, making interactions reliable across different page layouts.

When to use AgentBrowser

Use AgentBrowser when you need:

Reliable element targeting through accessibility refs
Fine-grained control over browser actions
Playwright's robust automation capabilities
Support for keyboard shortcuts and complex interactions

Quickstart

Install the package:

npm:
npm install @mastra/agent-browser

pnpm:
pnpm add @mastra/agent-browser

Yarn:
yarn add @mastra/agent-browser

Bun:
bun add @mastra/agent-browser

Create a browser instance and assign it to an agent:

src/mastra/agents/browser-agent.ts
import { Agent } from '@mastra/core/agent'
import { AgentBrowser } from '@mastra/agent-browser'

const browser = new AgentBrowser({
  headless: false,
})

export const browserAgent = new Agent({
  id: 'browser-agent',
  name: 'Browser Agent',
  model: 'openai/gpt-5.5',
  browser,
  instructions: `You are a web automation assistant.

When interacting with pages:
1. Use browser_snapshot to get the current page state and element refs
2. Use the refs (like @e1, @e2) to target elements for clicks and typing
3. After actions, take another snapshot to verify the result`,
})

Note: For local launches (the default), AgentBrowser requires a Chromium binary installed via Playwright. This is normally downloaded automatically when you install @mastra/agent-browser. If launching the browser fails with "browser executable is missing", run npx playwright install chromium. If you connect to a remote browser using the cdpUrl option, no local Chromium is needed.
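The choice between a local launch and a remote browser can be made at startup. The following is a minimal sketch, not part of the package: the option names cdpUrl and headless come from this page, while the BrowserLaunchOptions type, the resolveBrowserOptions helper, and the BROWSER_CDP_URL variable name are hypothetical and exist only for illustration.

```typescript
// `cdpUrl` and `headless` are the option names used on this page.
// The type, the helper, and the BROWSER_CDP_URL name are hypothetical.
type BrowserLaunchOptions = { cdpUrl: string } | { headless: boolean };

function resolveBrowserOptions(
  env: Record<string, string | undefined>,
): BrowserLaunchOptions {
  const cdpUrl = (env['BROWSER_CDP_URL'] ?? '').trim();

  // A remote endpoint is configured: no local Chromium binary is needed.
  if (cdpUrl.length > 0) {
    return { cdpUrl };
  }

  // No remote endpoint: fall back to a local launch, which relies on the
  // Chromium binary installed via Playwright.
  return { headless: true };
}
```

Under these assumptions, the result could be passed to the constructor, for example new AgentBrowser(resolveBrowserOptions(process.env)). Taking the environment as a parameter keeps the helper easy to test.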

Screenshots

When the agent uses the browser_screenshot tool, it captures a PNG image of the current page and returns it as image content that vision-capable models can interpret directly.

Use screenshots when you need to inspect the page visually, for example to evaluate images, layout, or colors. For text or structured data, use browser_snapshot instead.

To disable the screenshot tool for models that do not support vision, use excludeTools:

const browser = new AgentBrowser({
  headless: false,
  excludeTools: ['browser_screenshot'],
})
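If the same code serves both vision-capable and text-only models, the exclusion can be made conditional. This is a sketch under stated assumptions: the ScreenshotAwareOptions type and the browserOptionsFor helper are hypothetical, and whether a model supports vision is something you would determine yourself. Only headless, excludeTools, and the browser_screenshot tool name are taken from this page.

```typescript
// Hypothetical options shape covering only the two fields shown on this page.
type ScreenshotAwareOptions = {
  headless: boolean;
  excludeTools?: string[];
};

// Hypothetical helper: drop the screenshot tool for text-only models.
function browserOptionsFor(supportsVision: boolean): ScreenshotAwareOptions {
  const options: ScreenshotAwareOptions = { headless: false };

  if (!supportsVision) {
    // Text-only models cannot interpret image content, so the tool is excluded.
    options.excludeTools = ['browser_screenshot'];
  }

  return options;
}
```

The returned object could then be passed to new AgentBrowser(...). A text-only agent would still have browser_snapshot available for reading page content.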

Element refs

AgentBrowser uses accessibility tree refs to identify elements. When an agent calls browser_snapshot, it receives a text representation of the page with refs like @e1, @e2, etc. The agent then uses these refs with other tools to interact with elements.
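To make the ref format concrete, here is a small sketch that pulls refs out of snapshot text. The sample snapshot below is invented for illustration and the real browser_snapshot output may look different; extractRefs is a hypothetical helper, not part of the package. Only the @e1, @e2 ref shape comes from this page.

```typescript
// Invented sample: the actual browser_snapshot output format may differ.
const sampleSnapshot = [
  'heading "Sign in"',
  'textbox "Email" @e1',
  'textbox "Password" @e2',
  'button "Sign in" @e3',
].join('\n');

// Hypothetical helper: collect each distinct ref in order of first appearance.
function extractRefs(snapshot: string): string[] {
  const matches: string[] = snapshot.match(/@e\d+\b/g) ?? [];
  return Array.from(new Set(matches));
}

const refs = extractRefs(sampleSnapshot);
```

In normal use the agent reads the refs from the snapshot itself, so a helper like this is mainly useful for logging or for checking in tests that a snapshot contains the elements you expect.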

Note: See the AgentBrowser reference for all configuration options and tool details.
