AgentBrowser class
The AgentBrowser class provides deterministic browser automation using the agent-browser library. It uses accessibility tree snapshots and element refs (e.g., @e5) for precise, reproducible interactions.
Use AgentBrowser when you need reliable, deterministic browser automation. For AI-powered interactions using natural language, see StagehandBrowser.
Usage example
import { Agent } from '@mastra/core/agent'
import { AgentBrowser } from '@mastra/agent-browser'

const browser = new AgentBrowser({
  headless: true,
  viewport: { width: 1280, height: 720 },
  scope: 'thread',
})

export const browserAgent = new Agent({
  name: 'browser-agent',
  instructions: `You can browse the web. Use browser_snapshot to see the page structure,
then interact with elements using their refs (e.g., @e5).`,
  model: 'openai/gpt-5.4',
  browser,
})
Constructor parameters
headless?:
viewport?:
timeout?:
cdpUrl?:
scope?:
onLaunch?:
onClose?:
screencast?:
Tools
AgentBrowser provides 15 deterministic tools for browser automation. All tools that interact with elements use refs from the accessibility tree snapshot.
Core tools
| Tool | Description |
|---|---|
| browser_goto | Navigate to a URL |
| browser_snapshot | Get accessibility tree snapshot with element refs |
| browser_click | Click an element by ref |
| browser_type | Type text into an element |
| browser_press | Press keyboard keys |
| browser_select | Select option from dropdown |
| browser_scroll | Scroll the page or element |
| browser_close | Close the browser |
Extended tools
| Tool | Description |
|---|---|
| browser_hover | Hover over an element |
| browser_back | Go back in browser history |
| browser_dialog | Handle browser dialogs (alert, confirm, prompt) |
| browser_wait | Wait for element state changes |
| browser_tabs | Manage browser tabs (list, new, switch, close) |
| browser_drag | Drag and drop elements |
| browser_evaluate | Execute JavaScript in the page (escape hatch) |
Tool reference
browser_goto
Navigate to a URL.
// Tool input
{
  "url": "https://example.com",
  "waitUntil": "domcontentloaded",
  "timeout": 30000
}
| Parameter | Type | Description |
|---|---|---|
| url | string | URL to navigate to (required) |
| waitUntil | "load" \| "domcontentloaded" \| "networkidle" | When to consider navigation complete (optional) |
| timeout | number | Navigation timeout in ms (optional) |
browser_snapshot
Get an accessibility tree snapshot of the page. Returns element refs like @e5 that you use with other tools.
// Tool input
{
  "interactiveOnly": true,
  "maxDepth": 10
}
| Parameter | Type | Description |
|---|---|---|
| interactiveOnly | boolean | Only include interactive elements (optional) |
| maxDepth | number | Maximum tree depth (optional) |
Example output:
[document] Example Page
  [banner]
    [link @e1] Home
    [link @e2] About
  [main]
    [heading @e3] Welcome
    [textbox @e4] Search...
    [button @e5] Submit
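If you want to pick refs out of a snapshot in code, the text can be scanned line by line. The following is a minimal sketch that assumes every element line follows the `[role @eN] name` pattern shown in the example output; `SnapshotRef` and `parseSnapshotRefs` are hypothetical names, not part of the agent-browser API, and the real output may contain lines this pattern does not cover.

```typescript
// Hypothetical helper: not part of @mastra/agent-browser.
// Assumes each element line looks like "[role @eN] accessible name".
interface SnapshotRef {
  role: string;
  ref: string;
  label: string;
}

function parseSnapshotRefs(snapshot: string): SnapshotRef[] {
  const linePattern = /^\s*\[([\w-]+) (@e\d+)\]\s*(.*)$/;
  const found: SnapshotRef[] = [];
  for (const line of snapshot.split("\n")) {
    const match = linePattern.exec(line);
    // Lines without a ref (for example "[banner]") are skipped.
    if (match) {
      found.push({
        role: match[1] ?? "",
        ref: match[2] ?? "",
        label: (match[3] ?? "").trim(),
      });
    }
  }
  return found;
}

const exampleSnapshot = [
  "[document] Example Page",
  "  [banner]",
  "    [link @e1] Home",
  "    [link @e2] About",
  "  [main]",
  "    [heading @e3] Welcome",
  "    [textbox @e4] Search...",
  "    [button @e5] Submit",
].join("\n");

const parsedRefs = parseSnapshotRefs(exampleSnapshot);
console.log(parsedRefs.map((r) => `${r.ref}=${r.role}`).join(" "));
// prints "@e1=link @e2=link @e3=heading @e4=textbox @e5=button"
```

Container lines such as [document], [banner], and [main] carry no ref in the example, so the sketch leaves them out of the result.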
browser_click
Click an element using its ref from the snapshot.
// Tool input
{
  "ref": "@e5",
  "button": "left",
  "clickCount": 1,
  "modifiers": ["Control", "Shift"]
}
| Parameter | Type | Description |
|---|---|---|
| ref | string | Element ref from snapshot (required) |
| button | "left" \| "right" \| "middle" | Mouse button (optional) |
| clickCount | number | Number of clicks, e.g. 2 for a double-click (optional) |
| modifiers | string[] | Modifier keys (optional) |
browser_type
Type text into an input element.
// Tool input
{
  "ref": "@e4",
  "text": "search query",
  "clear": true,
  "delay": 50
}
| Parameter | Type | Description |
|---|---|---|
| ref | string | Element ref from snapshot (required) |
| text | string | Text to type (required) |
| clear | boolean | Clear existing content first (optional) |
| delay | number | Delay between keystrokes in ms (optional) |
browser_press
Press keyboard keys.
// Tool input
{
  "key": "Enter",
  "modifiers": ["Control"]
}
// Key combinations
{ "key": "Control+a" }
{ "key": "Control+c" }
| Parameter | Type | Description |
|---|---|---|
| key | string | Key name (e.g., "Enter", "Tab", "Escape", "Control+a") (required) |
| modifiers | string[] | Modifier keys (optional) |
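The examples above show two ways to express modifiers: a separate modifiers array, or a combined string such as "Control+a". As an illustration only, the sketch below normalizes both forms into one shape. `normalizeKeyPress` is a hypothetical helper, and treating every segment before the last "+" as a modifier is an assumption, not documented tool behavior.

```typescript
// Hypothetical helper: not part of @mastra/agent-browser.
interface PressInput {
  key: string;
  modifiers?: string[];
}

interface NormalizedPress {
  key: string;
  modifiers: string[];
}

function normalizeKeyPress(input: PressInput): NormalizedPress {
  // Assumption: in "Control+Shift+a" the last segment is the key and
  // everything before it is a modifier. A literal "+" key is not handled.
  const parts = input.key.split("+");
  const key = parts[parts.length - 1] ?? "";
  const inlineModifiers = parts.slice(0, -1);
  // Merge inline and explicit modifiers, dropping duplicates while
  // keeping first-seen order.
  const merged = inlineModifiers.concat(input.modifiers ?? []);
  const modifiers = Array.from(new Set(merged));
  return { key, modifiers };
}

console.log(JSON.stringify(normalizeKeyPress({ key: "Control+a" })));
```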
browser_select
Select an option from a dropdown. Provide one of value, label, or index.
// Tool input - by value
{
  "ref": "@e10",
  "value": "option-value"
}
// Tool input - by label
{
  "ref": "@e10",
  "label": "Option Text"
}
// Tool input - by index
{
  "ref": "@e10",
  "index": 0
}
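Because the three selectors are alternatives, it can help to check an input before sending it. This is a minimal sketch that reads "one of value, label, or index" as exactly one and assumes a zero-based, non-negative index (suggested by the example above, not confirmed). `SelectInput` and `validateSelectInput` are hypothetical names, not part of the library.

```typescript
// Hypothetical types: field names come from the examples above, but this
// validator is not part of @mastra/agent-browser.
interface SelectInput {
  ref: string;
  value?: string;
  label?: string;
  index?: number;
}

// Returns null when the input is well-formed, otherwise an error message.
function validateSelectInput(input: SelectInput): string | null {
  const provided = [input.value, input.label, input.index].filter(
    (field) => field !== undefined,
  );
  if (provided.length !== 1) {
    return `expected exactly one of value, label, or index, got ${provided.length}`;
  }
  // Assumption: index is a zero-based, non-negative integer.
  if (
    input.index !== undefined &&
    (!Number.isInteger(input.index) || input.index < 0)
  ) {
    return "index must be a non-negative integer";
  }
  return null;
}

console.log(validateSelectInput({ ref: "@e10", value: "a", index: 0 }));
```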
browser_scroll
Scroll the page or a specific element.
// Tool input
{
  "direction": "down",
  "amount": 300,
  "ref": "@e15"
}
| Parameter | Type | Description |
|---|---|---|
| direction | "up" \| "down" \| "left" \| "right" | Scroll direction (required) |
| amount | number | Pixels to scroll, default 300 (optional) |
| ref | string | Element to scroll, scrolls page if omitted (optional) |
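To make the direction and amount parameters concrete, the sketch below turns an input into horizontal and vertical pixel offsets, applying the documented default of 300. `ScrollInput` and `scrollDelta` are hypothetical names, and the sign convention (positive y means down, positive x means right) is an assumption borrowed from `window.scrollBy`, not something the tool documents.

```typescript
type ScrollDirection = "up" | "down" | "left" | "right";

// Hypothetical input type mirroring the parameter table above.
interface ScrollInput {
  direction: ScrollDirection;
  amount?: number; // pixels; the table above gives 300 as the default
  ref?: string; // when omitted, the page itself is scrolled
}

// Hypothetical helper: converts an input into x/y pixel offsets.
// Assumption: positive dy scrolls down, positive dx scrolls right.
function scrollDelta(input: ScrollInput): { dx: number; dy: number } {
  const amount = input.amount ?? 300;
  switch (input.direction) {
    case "up":
      return { dx: 0, dy: -amount };
    case "down":
      return { dx: 0, dy: amount };
    case "left":
      return { dx: -amount, dy: 0 };
    case "right":
      return { dx: amount, dy: 0 };
  }
}

console.log(JSON.stringify(scrollDelta({ direction: "down" })));
// prints {"dx":0,"dy":300}
```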
browser_hover
Hover over an element to trigger hover effects.
// Tool input
{
  "ref": "@e7"
}
browser_back
Go back in browser history.
// Tool input (no parameters required)
{}
browser_dialog
Handle browser dialogs (alert, confirm, prompt). The tool clicks the element that triggers the dialog, then accepts or dismisses it.
// Tool input
{
  "triggerRef": "@e5",
  "action": "accept",
  "text": "response"
}
| Parameter | Type | Description |
|---|---|---|
| triggerRef | string | Element that triggers the dialog (required) |
| action | "accept" \| "dismiss" | How to handle the dialog (required) |
| text | string | Text for prompt dialogs (optional) |
browser_wait
Wait for an element to reach a specific state.
// Tool input
{
  "ref": "@e20",
  "state": "visible",
  "timeout": 30000
}
| Parameter | Type | Description |
|---|---|---|
| ref | string | Element ref to wait for (optional) |
| state | "visible" \| "hidden" \| "attached" \| "detached" | State to wait for (optional) |
| timeout | number | Max wait time in ms (optional) |
browser_tabs
Manage browser tabs.
// List all tabs
{ "action": "list" }
// Open new tab
{ "action": "new", "url": "https://example.com" }
// Switch to tab by index
{ "action": "switch", "index": 0 }
// Close tab by index
{ "action": "close", "index": 1 }
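The four input shapes above can be modeled as a discriminated union so that the compiler flags, for instance, a switch action without an index. This is only a sketch: `TabsInput` and `describeTabsInput` are hypothetical names, and treating url as optional for the new action is an assumption, since the examples only show it being supplied.

```typescript
// Hypothetical types modeled on the four examples above.
type TabsInput =
  | { action: "list" }
  | { action: "new"; url?: string } // assumption: url may be omitted
  | { action: "switch"; index: number }
  | { action: "close"; index: number };

// Hypothetical helper: narrows on `action` and describes the request.
function describeTabsInput(input: TabsInput): string {
  switch (input.action) {
    case "list":
      return "list all tabs";
    case "new":
      return input.url ? `open a new tab at ${input.url}` : "open a new tab";
    case "switch":
      return `switch to tab ${input.index}`;
    case "close":
      return `close tab ${input.index}`;
  }
}

const tabRequests: TabsInput[] = [
  { action: "list" },
  { action: "new", url: "https://example.com" },
  { action: "switch", index: 0 },
  { action: "close", index: 1 },
];
for (const request of tabRequests) {
  console.log(describeTabsInput(request));
}
```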
browser_drag
Drag an element to a target location.
// Tool input
{
  "sourceRef": "@e10",
  "targetRef": "@e20"
}
| Parameter | Type | Description |
|---|---|---|
| sourceRef | string | Element to drag (required) |
| targetRef | string | Drop target element (required) |
browser_evaluate
Execute JavaScript in the page context. Use as an escape hatch when other tools don't cover your use case.
// Tool input
{
  "script": "document.title",
  "returnValue": true
}
| Parameter | Type | Description |
|---|---|---|
| script | string | JavaScript to execute (required) |
| returnValue | boolean | Whether to return the result (optional) |
browser_close
Close the browser and clean up resources.
// Tool input (no parameters required)
{}
How refs work
The browser_snapshot tool returns an accessibility tree with element refs like @e1, @e2, etc. These refs are stable identifiers you use with other tools:
- Call browser_snapshot to see the page structure.
- Find the element you want to interact with.
- Use its ref with interaction tools like browser_type or browser_scroll.
// 1. Get snapshot
// Returns: [textbox @e4] Search... [link @e5] Home
// 2. Type in the search box
{ "tool": "browser_type", "input": { "ref": "@e4", "text": "mastra" } }
// 3. Click the Home link using its ref
{ "tool": "browser_click", "input": { "ref": "@e5" } }
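The snapshot, find, and act sequence can also be sketched in code. The example below looks up refs by role and label in a snapshot string and builds tool calls in the { tool, input } shape used above. `ToolCall`, `findRef`, and `planInteraction` are hypothetical names, the one-element-per-line `[role @eN] name` format is an assumption, and nothing here executes against a real browser.

```typescript
// Hypothetical shape mirroring the { tool, input } objects shown above.
interface ToolCall {
  tool: string;
  input: Record<string, unknown>;
}

// Hypothetical helper. Assumes one element per line in the form
// "[role @eN] accessible name".
function findRef(snapshot: string, role: string, label: string): string | undefined {
  const linePattern = /^\[([\w-]+) (@e\d+)\]\s*(.*)$/;
  for (const line of snapshot.split("\n")) {
    const match = linePattern.exec(line.trim());
    if (match && match[1] === role && (match[3] ?? "").trim() === label) {
      return match[2];
    }
  }
  return undefined;
}

// Builds the calls for "type a query, then click the Home link".
function planInteraction(snapshot: string, query: string): ToolCall[] {
  const searchBox = findRef(snapshot, "textbox", "Search...");
  const homeLink = findRef(snapshot, "link", "Home");
  if (!searchBox || !homeLink) {
    throw new Error("expected elements not found in snapshot");
  }
  return [
    { tool: "browser_type", input: { ref: searchBox, text: query } },
    { tool: "browser_click", input: { ref: homeLink } },
  ];
}

const plannedCalls = planInteraction(
  "[textbox @e4] Search...\n[link @e5] Home",
  "mastra",
);
console.log(JSON.stringify(plannedCalls, null, 2));
```

Looking elements up by role and label rather than hard-coding a ref keeps the plan tied to whatever the latest snapshot reports.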
Related
- MastraBrowser: Base class reference
- StagehandBrowser: AI-powered alternative
- Browser overview: Conceptual guide
- agent-browser guide: Usage guide