Page Agent

In-page AI browser agent via a single script tag

Page Agent is Alibaba's open-source JavaScript library that embeds an AI GUI agent directly into a web page through a single script tag. Unlike headless browser tools that drive the browser from outside, Page Agent works inside the DOM and manipulates it through a text-based representation. Typical uses are natural-language QA testing, enterprise copilots, and making legacy web apps AI-native. It follows a bring-your-own-LLM (BYOLLM) model that works with any model provider, and it requires no backend changes.

Page Agent takes a different approach to browser automation: it injects itself into the page's DOM rather than controlling the browser from the outside. Tools like Playwright and Puppeteer drive pages remotely over protocols such as the Chrome DevTools Protocol, and vision-based agents rely on screenshots. Page Agent instead runs as a lightweight JavaScript library that reads and interacts with DOM elements through a text-based understanding of the page. Because it needs neither a remote-control channel nor image processing, this design is intended to be faster and more reliable, and it does not depend on the framework the page was built with.
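To make the idea of text-based DOM understanding more concrete, here is a minimal sketch of how interactive elements might be flattened into indexed text lines that a language model can read and refer to. It is an illustration only, not Page Agent's actual implementation: the `ElementInfo` type, the `serializeForLLM` function, and the `[index] <tag>` line format are all hypothetical, and plain objects stand in for real DOM nodes so the example runs outside a browser.

```typescript
// Hypothetical, simplified stand-in for a DOM element.
// This is NOT Page Agent's real data model.
interface ElementInfo {
  tag: string;
  text: string;
  attrs: Record<string, string>;
  interactive: boolean;
}

// Flatten the interactive elements into indexed text lines.
// A model could then answer with something like "click [1]".
function serializeForLLM(elements: ElementInfo[]): string {
  const lines: string[] = [];
  let index = 0;
  for (const el of elements) {
    if (!el.interactive) continue; // skip headings, static text, etc.
    const attrs = Object.entries(el.attrs)
      .map(([key, value]) => `${key}="${value}"`)
      .join(" ");
    const open = attrs ? `<${el.tag} ${attrs}>` : `<${el.tag}>`;
    lines.push(`[${index}] ${open}${el.text}</${el.tag}>`);
    index++;
  }
  return lines.join("\n");
}

// Example page: one heading (ignored) and two interactive elements.
const page: ElementInfo[] = [
  { tag: "h1", text: "Sign in", attrs: {}, interactive: false },
  { tag: "input", text: "", attrs: { name: "email", type: "email" }, interactive: true },
  { tag: "button", text: "Continue", attrs: {}, interactive: true },
];

const snapshot = serializeForLLM(page);
console.log(snapshot);
```

A compact text snapshot like this is much smaller than a screenshot, which is one plausible reason a text-based agent can avoid the cost of image processing.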

The library enables several practical use cases that are difficult with traditional automation approaches. QA teams can describe test scenarios in natural language and have Page Agent execute them against live web applications. Enterprise teams can overlay AI copilot functionality onto existing internal tools without modifying their backend code. Legacy web applications can gain AI capabilities through a simple script tag addition, bypassing the need for costly rewrites or API integrations.

Backed by Alibaba and with over 15,000 GitHub stars, Page Agent has been adopted quickly since its launch. It follows a bring-your-own-LLM model and connects to any OpenAI-compatible API endpoint, including local models. The library is distributed under the MIT license and ships as a single JavaScript file that can be added to any web page. Its lightweight in-page approach represents an emerging category of browser AI that complements rather than competes with headless automation tools.
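As a rough illustration of what "OpenAI-compatible" means in practice, the sketch below builds a chat completions request that could be sent to any such endpoint, hosted or local. It does not show Page Agent's own configuration API: `LLMConfig`, `buildChatRequest`, the prompt wording, the base URL, and the model name are all placeholders chosen for this example. Only the `/chat/completions` path, the `Bearer` authorization header, and the `model` and `messages` fields follow the widely used OpenAI-style convention.

```typescript
// Hypothetical config shape for this example only.
interface LLMConfig {
  baseURL: string; // placeholder: any OpenAI-compatible server, hosted or local
  apiKey: string;
  model: string;
}

interface ChatRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Build (but do not send) an OpenAI-style chat completions request.
function buildChatRequest(config: LLMConfig, task: string, pageText: string): ChatRequest {
  // Strip trailing slashes so the path is joined cleanly.
  const url = `${config.baseURL.replace(/\/+$/, "")}/chat/completions`;
  return {
    url,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${config.apiKey}`,
      },
      body: JSON.stringify({
        model: config.model,
        messages: [
          { role: "system", content: "You operate a web page. Reply with the next action." },
          { role: "user", content: `Task: ${task}\n\nPage:\n${pageText}` },
        ],
      }),
    },
  };
}

// Placeholder values: a local server address and a made-up model name.
const localRequest = buildChatRequest(
  { baseURL: "http://localhost:11434/v1/", apiKey: "not-needed", model: "example-model" },
  "Fill in the email field",
  '[0] <input name="email"></input>',
);
// Sending it would look like: await fetch(localRequest.url, localRequest.init)
```

Because only the base URL, key, and model name change between providers, switching from a hosted model to a local one is a configuration change rather than a code change.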

Pricing

Free and open source under MIT license

Platforms

Any web browser — single script tag, BYOLLM

Alternatives

Browserless

Headless browsers in Docker for automation at scale

Browserless is a headless browser-as-a-service platform that deploys Chrome, Firefox, and WebKit in Docker containers for web scraping, testing, and AI agent automation. It provides Puppeteer and Playwright-compatible APIs, a built-in MCP server for connecting AI assistants to browser automation, screenshot and PDF generation, and connection pooling for high-concurrency workloads. Available as self-hosted open source or managed cloud.

Freemium · Open Source

Playwright

Reliable end-to-end testing

Playwright is a cross-browser end-to-end (E2E) testing framework by Microsoft that supports Chromium, Firefox, and WebKit through one API. It features auto-waiting, tracing with timelines, screenshots, and DOM snapshots, codegen for recording tests, and parallel execution. It also offers component testing for React, Vue, and Svelte, along with built-in API testing, network mocking, and mobile emulation. It is known for reliability and speed compared with Selenium and Cypress, and with 70K+ GitHub stars it is rapidly becoming the E2E standard.

Open Source

Selenium

Browser automation framework

The original browser automation framework with multi-language support for Java, Python, JavaScript, and C#. Drives end-to-end testing across all major browsers via the WebDriver protocol. Despite newer alternatives, Selenium remains the industry standard for large-scale automated browser testing, with the largest community and most extensive tooling ecosystem.

Open Source

BrowserOS

Open-source agentic browser that runs local AI agents in your browsing workflow.

BrowserOS is a privacy-first, open-source agentic browser for running AI assistants locally inside real browsing sessions instead of handing every task to a remote cloud browser.

Open Source

Related Tools

Hermes Agent

Top Pick

Open-source AI agent framework with persistent memory, reusable skills, tools, and messaging gateways

Hermes Agent is an open-source AI agent framework with persistent memory, reusable skills, 40+ tools, cron jobs, and messaging gateways.

Open Source

Accomplish Coworker

Open-source desktop AI coworker for browsing and code execution.

Accomplish Coworker is an MIT-licensed open-source AI coworker that runs on the desktop, combining computer-use style browsing with code execution so agents can research, implement, run, and debug workflows in one local environment.

Open Source · Telemetry

Safari MCP Server

Apple's Safari-native MCP server for web debugging agents

Safari MCP Server is Apple's safaridriver-based MCP server in Safari Technology Preview, giving compatible coding agents local access to Safari page content, console logs, network requests, screenshots, JavaScript evaluation, interactions, viewport controls, and accessibility/performance checks.

Free · Telemetry

BeeAI Framework

Python and TypeScript framework for production multi-agent systems

BeeAI Framework is an Apache-2.0 toolkit for building production-ready AI agents and multi-agent systems in Python and TypeScript. Its docs cover agents, tools, RAG, memory, workflows, backend providers, serving, and A2A/MCP integration surfaces, making it a vendor-neutral option for teams comparing LangGraph, CrewAI, Mastra, and related agent runtimes.

Open Source · Telemetry

Superserve

Open-source Firecracker sandboxes for long-running AI agents

Superserve is an open-source sandbox infrastructure layer for AI agents that need durable computers instead of short-lived shells. It runs isolated Firecracker microVMs, supports pause, resume, snapshot, fork, preview URLs, MCP connectivity, SDK/API control, Docker workloads, and self-hosting, while the hosted service adds pay-as-you-go agent sandboxes for teams.

Open Source

Anthropic Agent Skills

Official Claude Agent Skills examples, spec, and plugin marketplace for reusable agent capabilities

Anthropic Agent Skills is Anthropic's official reference repo and Claude Code plugin marketplace for reusable Skill folders. It packages example SKILL.md workflows, document skills, a Claude API skill, templates, and the Agent Skills spec so teams can turn repeatable instructions, scripts, and resources into on-demand Claude capabilities instead of copying prompts across sessions.

Free · Telemetry
