
Agent Browser

Browser automation CLI built for AI agents by Vercel Labs

Open Source

Agent Browser is a Rust-based browser automation CLI designed for AI agent workflows rather than traditional testing. Developed by Vercel Labs, it provides semantic element selection through a refs system, accessibility tree snapshots, session persistence, and authentication vaults. Unlike Playwright or Puppeteer, which target test automation, Agent Browser optimizes for token efficiency and deterministic element selection so that LLMs can interact with the browser reliably.

Agent Browser takes a fundamentally different approach to browser automation by designing every feature around AI agent requirements rather than human-written test scripts. The refs system assigns stable identifiers to page elements through accessibility tree analysis, giving LLMs deterministic element selection without relying on brittle CSS selectors or XPath expressions that break when page layouts change. Text snapshots capture the semantic content of pages in a token-efficient format, reducing the context window consumption that makes browser interaction expensive for language models.
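
The idea behind refs and text snapshots can be illustrated with a small Python sketch. This is not Agent Browser's implementation or output format: the `Node` shape, the `INTERACTIVE_ROLES` set, the `e1`, `e2` numbering scheme, and the `[ref=...]` notation are all assumptions chosen for illustration. It walks a simplified accessibility tree, gives each actionable element a short identifier, and renders a compact text view that a model could read in place of raw HTML.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of a simplified accessibility tree (hypothetical shape)."""
    role: str
    name: str = ""
    children: list["Node"] = field(default_factory=list)


# Roles an agent can act on; only these receive a ref in this sketch.
INTERACTIVE_ROLES = {"button", "link", "textbox", "checkbox"}


def snapshot(root: Node) -> tuple[str, dict[str, Node]]:
    """Return a compact text snapshot and a ref -> node lookup table."""
    lines: list[str] = []
    refs: dict[str, Node] = {}

    def walk(node: Node, depth: int) -> None:
        line = f"{'  ' * depth}- {node.role}"
        if node.name:
            line += f' "{node.name}"'
        if node.role in INTERACTIVE_ROLES:
            ref = f"e{len(refs) + 1}"  # refs are numbered in document order
            refs[ref] = node
            line += f" [ref={ref}]"
        lines.append(line)
        for child in node.children:
            walk(child, depth + 1)

    walk(root, 0)
    return "\n".join(lines), refs


# A tiny hypothetical sign-in page.
page = Node("main", children=[
    Node("heading", "Sign in"),
    Node("textbox", "Email"),
    Node("textbox", "Password"),
    Node("button", "Continue"),
])

text, refs = snapshot(page)
print(text)
```

In this sketch the model would answer with a ref such as `e3` rather than a CSS selector, and the lookup table maps that ref back to the element to act on.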

The CLI architecture integrates naturally into terminal-based agent workflows where shell interoperability matters. Session persistence maintains browser state across multiple agent interactions, and authentication vaults securely store credentials so agents can access authenticated pages without exposing secrets in prompts or logs. The streaming browser view provides real-time visibility into what the agent sees, which is useful for debugging and monitoring automated workflows. Multiple browser backends are supported, including Chrome and Lightpanda, to suit different performance and resource profiles.
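
The vault idea, keeping secrets out of prompts and logs, can be sketched in Python under stated assumptions. Nothing here reflects Agent Browser's actual vault storage or command syntax: the `Vault` class, the `{{vault:name}}` placeholder format, and the `fill e2 ...` command string are hypothetical. The point is only that the agent handles a named placeholder, and the real value is substituted at execution time.

```python
import re


class Vault:
    """In-memory stand-in for a credential store (hypothetical)."""

    def __init__(self) -> None:
        self._secrets: dict[str, str] = {}

    def put(self, name: str, value: str) -> None:
        self._secrets[name] = value

    def get(self, name: str) -> str:
        return self._secrets[name]


# Hypothetical placeholder syntax: {{vault:credential.name}}
PLACEHOLDER = re.compile(r"\{\{vault:([A-Za-z0-9_.-]+)\}\}")


def resolve(command: str, vault: Vault) -> str:
    """Substitute placeholders with real secrets just before execution."""
    return PLACEHOLDER.sub(lambda m: vault.get(m.group(1)), command)


vault = Vault()
vault.put("shop.password", "s3cret!")

# What the agent writes, and what appears in prompts and logs.
agent_command = "fill e2 {{vault:shop.password}}"
log_line = f"run: {agent_command}"

# What the browser driver receives at execution time.
executed = resolve(agent_command, vault)
```

A real vault would encrypt secrets at rest and restrict which sites each credential may be sent to. The sketch shows only the separation between what the model sees and what is executed.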

Built in Rust for performance and deployed as a single binary, Agent Browser has accumulated over 26,000 GitHub stars since its release by Vercel Labs. It integrates natively with Claude Code and supports both headless and headed modes. The Apache 2.0 license permits both commercial and open-source use. Its agent-first focus fills a gap in the developer toolchain between full browser testing frameworks and lightweight web scraping tools.

Pricing

Free and open source under Apache 2.0

Platforms

CLI on macOS, Linux, Windows

Alternatives

UI-TARS Desktop

ByteDance's open-source multimodal desktop agent with vision-based GUI automation

UI-TARS Desktop is ByteDance's open-source multimodal AI agent that automates desktop and browser interactions using computer vision rather than DOM selectors or accessibility APIs. Powered by the UI-TARS vision model, it can understand and operate any graphical interface by looking at screenshots, making it capable of automating applications that traditional browser automation tools cannot reach, including native desktop apps and complex web UIs.

Open Source

OpenFang

Rust-based agent OS with built-in security, WASM sandboxing, and multi-agent runtime

OpenFang is an open-source agent operating system built in Rust that provides a secure multi-agent runtime with WASM sandboxing, auditability layers, and multi-channel communication. It goes beyond typical orchestration SDKs by treating agent security and operational isolation as first-class concerns, making it suitable for teams deploying agents in environments where trust boundaries and audit trails matter.

Open Source

Google Antigravity

Agent-first development platform from Google with desktop app and CLI

Google Antigravity is Google's AI-powered agentic development platform, announced in November 2025 and expanded with Antigravity 2.0 at I/O 2026, that places autonomous AI agents at the center of software development. Distributed as both a VS Code-based desktop app and the new Antigravity CLI, it runs planning, implementation, and verification agents across editor, terminal, and browser. The agents are backed by Gemini 3.1 Pro/Flash, Claude Sonnet/Opus 4.6, and GPT-OSS 120B.

Freemium, Open Source, Telemetry

Related Tools

Accomplish Coworker

Open-source desktop AI coworker for browsing and code execution.

Accomplish Coworker is an MIT-licensed open-source AI coworker that runs on the desktop, combining computer-use style browsing with code execution so agents can research, implement, run, and debug workflows in one local environment.

Open Source, Telemetry

Safari MCP Server

Apple's Safari-native MCP server for web debugging agents

Safari MCP Server is Apple's safaridriver-based MCP server in Safari Technology Preview, giving compatible coding agents local access to Safari page content, console logs, network requests, screenshots, JavaScript evaluation, interactions, viewport controls, and accessibility/performance checks.

Free, Telemetry

BrowserOS

Open-source agentic browser that runs local AI agents in your browsing workflow.

BrowserOS is a privacy-first, open-source agentic browser for running AI assistants locally inside real browsing sessions instead of handing every task to a remote cloud browser.

Open Source

Webwright

Microsoft browser agent that turns long-horizon web tasks into reusable Playwright code

Webwright is a Microsoft browser-agent project that asks coding models to write, debug, and reuse Playwright scripts instead of relying on one-off stochastic click loops. The approach gives automation teams a more inspectable artifact: scripts can be logged, reviewed, rerun, and maintained like normal test or scraping code. It is especially relevant for long-horizon browser tasks where teams care about determinism, auditability, and resilience to UI changes.

Open Source

Rampart

Microsoft’s pytest-native red teaming framework for turning AI agent safety findings into CI tests.

RAMPART is an open-source Microsoft framework for safety and security testing of agentic AI applications. It brings red-team findings into a pytest-native workflow so teams can turn prompt injection, unsafe tool use, and behavioral boundary failures into repeatable regression tests. Its main appeal is developer workflow: RAMPART makes agent safety part of CI/CD instead of a one-off security review.

Open Source

Requestly

One tool for intercepting, mocking, and replaying HTTP, acquired by BrowserStack

Requestly is a BrowserStack-backed API client, HTTP interceptor, mock server, and session replay tool for frontend and QA teams. Its current product is commercial and centered on the API client, while the legacy interceptor code is open source under AGPLv3. The free plan covers individual workflows, and Pro is listed at $12 per user per month billed monthly, or $9 per user per month billed annually, for collaborative QA and frontend debugging teams.

Freemium