Page Agent
JavaScript in-page GUI agent. Control web interfaces with natural language.
What is Page Agent?
Page Agent is a Model Context Protocol (MCP) server that lets AI clients such as Claude, Cursor, and VS Code control web interfaces with natural language through a JavaScript in-page GUI agent.
This server falls under the Browser Automation category on MCPgee, the world's largest MCP server directory with 33,000+ servers.
Features
- JavaScript in-page GUI agent. Control web interfaces with natural language.
Installation
NPM
npx -y @page-agent/mcp
Manual Installation
npx -y @page-agent/mcp
Configuration
Configuration Details
claude_desktop_config.json
How to Set Up and Use Page Agent
Page Agent (@page-agent/mcp) is an MCP server that bridges AI coding assistants to a Chrome extension, enabling natural language control of any web interface running in the browser. Unlike screenshot-based approaches, it uses text-based DOM manipulation — no multimodal LLM or special browser permissions required. The MCP server exposes three tools (execute_task, get_status, stop_task) and communicates with the Page Agent Chrome extension over a local WebSocket connection. It is ideal for automating web workflows, testing UI interactions, and controlling web apps through AI assistants like Claude Desktop, Cursor, or GitHub Copilot.
Prerequisites
- Node.js 20 or higher installed
- Google Chrome browser installed
- Page Agent Chrome Extension installed from the Chrome Web Store
- An OpenAI-compatible LLM API key (e.g., Aliyun DashScope for Qwen, or OpenAI)
- An MCP-compatible client such as Claude Desktop, Cursor, or GitHub Copilot
Install the Page Agent Chrome Extension
Install the official Page Agent extension from the Chrome Web Store. This extension runs inside Chrome and receives commands from the MCP server over a local WebSocket connection.
https://chromewebstore.google.com/detail/page-agent-ext/akldabonmimlicnjlflnapfeklbfemhj
Obtain an LLM API key
Page Agent requires an OpenAI-compatible LLM to interpret and plan actions. You can use any provider — for example Aliyun DashScope (Qwen models) or OpenAI. Copy your API key and the base URL for the provider.
Add the MCP server to your client configuration
Configure your MCP client to launch @page-agent/mcp via npx with the required LLM environment variables. The server starts a local HTTP/WebSocket listener on port 38401 by default.
{
  "mcpServers": {
    "page-agent": {
      "command": "npx",
      "args": ["-y", "@page-agent/mcp"],
      "env": {
        "LLM_BASE_URL": "https://dashscope.aliyuncs.com/compatible-mode/v1",
        "LLM_API_KEY": "sk-your-api-key-here",
        "LLM_MODEL_NAME": "qwen3.5-plus"
      }
    }
  }
}
Restart your MCP client
After saving the configuration, restart Claude Desktop, Cursor, or whichever client you use. The MCP server will launch automatically and wait for the Chrome extension to connect.
Open a webpage in Chrome and start issuing commands
Navigate to any webpage in Chrome with the Page Agent extension enabled. Your AI assistant can now control that page using the execute_task tool with a natural language task description.
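For readers curious what the client sends when it uses execute_task: MCP clients invoke tools with a JSON-RPC 2.0 request whose method is tools/call. The Python sketch below builds such a request. The helper name build_tool_call and the argument key "task" are assumptions made for illustration; this guide does not show the tool's input schema, so check what the server advertises before relying on the argument name.

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# "task" is an assumed argument name, not confirmed by this guide.
request = build_tool_call(
    1, "execute_task", {"task": "Click the login button on the current page"}
)
print(request)
```

In normal use the MCP client builds and sends this message itself; the sketch only makes the shape of the call visible.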
Page Agent Examples
Client configuration
Claude Desktop config using Aliyun DashScope as the LLM backend. Replace LLM_API_KEY with your actual key, and LLM_MODEL_NAME with your preferred model.
{
  "mcpServers": {
    "page-agent": {
      "command": "npx",
      "args": ["-y", "@page-agent/mcp"],
      "env": {
        "LLM_BASE_URL": "https://dashscope.aliyuncs.com/compatible-mode/v1",
        "LLM_API_KEY": "sk-your-api-key-here",
        "LLM_MODEL_NAME": "qwen3.5-plus"
      }
    }
  }
}
Prompts to try
Example prompts that exercise the server's tools (execute_task, get_status, stop_task) against the active Chrome tab.
- "Click the login button on the current page"
- "Fill in the search box with 'MCP servers' and press Enter"
- "Scroll to the bottom of the page and click the 'Load more' button"
- "Check if the page agent is currently connected and busy"
- "Stop the current browser task"
Troubleshooting Page Agent
MCP server launches but reports 'not connected' when calling get_status
The Chrome extension must be active on an open Chrome tab. Open Chrome, navigate to any page, and confirm the Page Agent extension icon is visible. The WebSocket connection is established per-tab.
execute_task fails with LLM API errors
Verify that LLM_BASE_URL, LLM_API_KEY, and LLM_MODEL_NAME are all set correctly in the MCP config env block. For OpenAI, set LLM_BASE_URL to 'https://api.openai.com/v1' and LLM_MODEL_NAME to 'gpt-4o' or similar.
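A quick way to catch a missing or blank variable before restarting the client is to check the env block programmatically. The Python sketch below does only that; the function name missing_llm_vars is hypothetical, and the check says nothing about whether the key or model name is actually valid for the provider.

```python
REQUIRED_LLM_VARS = ("LLM_BASE_URL", "LLM_API_KEY", "LLM_MODEL_NAME")


def missing_llm_vars(env: dict) -> list:
    """Return the required variable names that are absent or blank in env."""
    return [
        name
        for name in REQUIRED_LLM_VARS
        if not str(env.get(name, "")).strip()
    ]


# The OpenAI values mentioned above, with a placeholder key.
openai_env = {
    "LLM_BASE_URL": "https://api.openai.com/v1",
    "LLM_API_KEY": "sk-your-api-key-here",
    "LLM_MODEL_NAME": "gpt-4o",
}

print(missing_llm_vars(openai_env))  # prints []
print(missing_llm_vars({"LLM_BASE_URL": "https://api.openai.com/v1"}))
# prints ['LLM_API_KEY', 'LLM_MODEL_NAME']
```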
Port 38401 conflict — server fails to start
Another process is using port 38401. Set the PORT environment variable to a different port (e.g., PORT=38402) in the MCP server env config, and ensure the Chrome extension is configured to use the same port.
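To confirm the conflict and pick a replacement port, you can probe whether a TCP listener can bind to the port. The Python sketch below assumes the server listens on the loopback address 127.0.0.1, which this guide does not state; the function name port_is_free is hypothetical.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can bind to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Typically "address already in use": something else holds the port.
            return False
        return True


# Check the default port and the alternative suggested above.
for candidate in (38401, 38402):
    print(candidate, "free" if port_is_free(candidate) else "in use")
```

The result is only a snapshot: another process can still claim the port between the check and the server starting.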
Frequently Asked Questions about Page Agent
What is Page Agent?
Page Agent is a Model Context Protocol (MCP) server that provides a JavaScript in-page GUI agent for controlling web interfaces with natural language. It connects AI assistants to external tools and data sources through a standardized interface.
How do I install Page Agent?
Install via npm with the command: npx -y @page-agent/mcp. Then add the server configuration to your AI client's JSON config file (e.g., claude_desktop_config.json or .cursor/mcp.json).
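If the config file already lists other servers, the page-agent entry should be merged in rather than pasted over the whole file. The Python sketch below shows one way to do that on the parsed JSON; the function name add_page_agent and the other-server entry are hypothetical, and reading or writing the actual file is left out.

```python
def add_page_agent(config: dict, base_url: str, api_key: str, model: str) -> dict:
    """Return a copy of config with a page-agent entry under mcpServers."""
    merged = dict(config)
    # Copy the server map so the caller's config is not modified.
    servers = dict(merged.get("mcpServers", {}))
    servers["page-agent"] = {
        "command": "npx",
        "args": ["-y", "@page-agent/mcp"],
        "env": {
            "LLM_BASE_URL": base_url,
            "LLM_API_KEY": api_key,
            "LLM_MODEL_NAME": model,
        },
    }
    merged["mcpServers"] = servers
    return merged


# A config that already has one (made-up) server registered.
existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
updated = add_page_agent(
    existing,
    "https://dashscope.aliyuncs.com/compatible-mode/v1",
    "sk-your-api-key-here",
    "qwen3.5-plus",
)
print(sorted(updated["mcpServers"]))  # prints ['other-server', 'page-agent']
```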
Which AI clients work with Page Agent?
Page Agent works with all major MCP-compatible AI clients including Claude Desktop, Claude Code, Cursor, VS Code (GitHub Copilot), Windsurf, and Cline.
Is Page Agent free to use?
Yes, Page Agent is open source and available under the MIT license. You can use it freely in both personal and commercial projects.
Page Agent Alternatives — Similar Browser Automation Servers
Looking for alternatives to Page Agent? Here are other popular browser automation servers you can use with Claude, Cursor, and VS Code.
Chrome DevTools MCP
★ 40.6k
AI-powered Chrome automation server with natural language element detection. Control Chrome browser through MCP protocol for testing, debugging, and performance analysis. Features 91% accuracy in element location, works with free AI models, and suppo
UI TARS Desktop
★ 34.9k
Browser automation capabilities using Puppeteer, supporting both local and remote browser connections.
Playwright
★ 32.8k
A production-ready browser automation server that enables AI assistants to interact with web pages using tools for navigation, element interaction, and data extraction. It features a built-in Inspector UI and robust crash recovery for reliable automa
Chrome
★ 11.7k
An extension-based MCP server that enables AI assistants to control your browser, leveraging existing sessions and login states for automation and content analysis. It provides over 20 tools for semantic tab search, interactive element manipulation,
LAMDA
★ 7.8k
The most powerful Android RPA agent framework, next generation mobile automation.
Browser Tools MCP
★ 7.2k
This application is a powerful browser monitoring and interaction tool that enables AI-powered applications via Anthropic's Model Context Protocol (MCP) to capture and analyze browser data through a Chrome extension.