Browser Automation

Browser Automation is an opt-in Heddle capability for work where the agent needs rendered page state, visual evidence, or browser interaction instead of plain code inspection or web search.

What it is for

Use Browser Automation when a task depends on what a page actually renders: frontend verification, visual layout checks, product or listing comparison, or a user-requested website workflow.

It is intentionally not enabled by default. Turning it on activates Heddle's built-in browser-automation Agent Skill and adds browser tools to future default agent turns.

The skill teaches the agent when a browser is the right tool. Web search can help discover a starting URL, but browser tools are for inspecting and operating the page state itself.

Good use cases

Frontend verification

Open a local or deployed page after UI changes, inspect the rendered DOM, and capture screenshots as evidence.

Website workflows

Browse a site the user explicitly asked the agent to operate, while respecting domain and action policy.

Product research

Compare visible product or listing pages when search snippets or static HTML are not enough.

Debugging rendered state

Check the page title, interactive elements, empty states, route errors, or responsive layout problems that unit tests did not reveal.

Enable it

Browser Automation is workspace-scoped. Enable it from Settings -> Browser Automation or from chat:

/browser
/browser enable
/browser disable

Tools added to the agent

Once enabled, future default agent turns can use these browser tools:

browser_open
browser_snapshot
browser_click
browser_screenshot
browser_close

Policy and profile model

  • The first browser_open URL establishes the same-domain browsing boundary when no explicit allowlist is configured.
  • Snapshots return scoped refs, and browser_click uses those refs instead of arbitrary selectors.
  • Unsafe actions, off-domain navigation, and ambiguous JavaScript-only clicks may be blocked or may require approval.
  • Logged-in sites need a persistent browser profile with a valid session. Profile management is planned but is not yet a polished user flow.
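
The first two policy points can be pictured with a small sketch. This is not Heddle's implementation: the class names, the method names, the exact-hostname comparison, the "e1"-style ref format, and the rule that a new snapshot invalidates older refs are all assumptions made for illustration. Only the behavior described in the list above comes from this page.

```python
from urllib.parse import urlsplit


class SameDomainBoundary:
    """Hypothetical model of the boundary set by the first opened URL."""

    def __init__(self, allowlist=None):
        # Assumption: an explicit allowlist, when given, replaces the
        # implicit boundary derived from the first opened URL.
        self.allowed_hosts = {h.lower() for h in allowlist} if allowlist else None

    def check_open(self, url):
        """Return True if navigation to `url` stays inside the boundary."""
        host = (urlsplit(url).hostname or "").lower()
        if not host:
            return False
        if self.allowed_hosts is None:
            # No allowlist and nothing opened yet: this host becomes the boundary.
            self.allowed_hosts = {host}
            return True
        # Assumption: hosts are compared exactly; a real policy might
        # compare registrable domains instead.
        return host in self.allowed_hosts


class SnapshotRefs:
    """Hypothetical registry: clicks resolve only refs from the latest snapshot."""

    def __init__(self):
        self._refs = {}

    def snapshot(self, elements):
        # Assumption: each snapshot replaces the previous refs, so refs
        # from an older snapshot stop resolving.
        self._refs = {f"e{i}": el for i, el in enumerate(elements, start=1)}
        return dict(self._refs)

    def resolve(self, ref):
        # A click is addressed by ref, never by an arbitrary selector.
        if ref not in self._refs:
            raise KeyError(f"unknown or stale ref: {ref}")
        return self._refs[ref]
```

In this model, opening http://localhost:3000/ first would allow later opens on localhost and refuse https://example.com/, and a click could only target an element that the most recent snapshot reported.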

Current roadmap

  • Settings for selected profile, Chrome channel, headed/headless mode, and profile path visibility.
  • An open-profile-for-login flow so users can prepare logged-in sessions manually.
  • Form-safe tools such as browser_type, browser_fill, and browser_press.
  • Browser evidence and screenshots surfaced directly in the control plane.
  • A live preview path based on screenshots or CDP screencast instead of embedding Playwright's native headed window.
