How do I install Agent Desktop MCP Server?

Follow the setup instructions on the Agent Desktop GitHub repository, then add the server configuration to your AI client.

What category is Agent Desktop MCP Server?

Agent Desktop is categorized under Browser Automation on MCPgee.

Agent Desktop

Name: Agent Desktop MCP Server
Author: lahfir

v1.0.0 • Browser Automation • stable

Native desktop automation CLI for AI agents. Control any application through OS accessibility trees with structured JSON output and deterministic element refs.

Tags: accessibility, accessibility-api, ai-agents, automation, cli

Stars: 766

What is Agent Desktop?

Agent Desktop is a Model Context Protocol (MCP) server that gives AI assistants such as Claude, Cursor, and VS Code native desktop automation. It controls any application through OS accessibility trees, with structured JSON output and deterministic element refs.

This server falls under the Browser Automation category on MCPgee, the world's largest MCP server directory with 33,000+ servers.

Use Cases

Native OS accessibility tree automation

Deterministic desktop application control

Maintainer: lahfir
License: Apache-2.0
Language: Rust
Version: v1.0.0
Updated: May 21, 2026
Status: healthy
Maintenance: active

Works with: Claude, OpenAI, Windows, macOS, Linux

Installation

Manual Installation

npx agent-desktop

Configuration

Config file: claude_desktop_config.json

Performance

Response time: < 200ms
Throughput: Medium
Memory usage: Low
CPU usage: Low

How to Set Up and Use Agent Desktop

Agent Desktop is a native macOS CLI tool that exposes OS accessibility trees to AI agents via structured JSON output, enabling deterministic desktop automation without GUI scripting or pixel-level vision models. It provides 54 commands covering observation (snapshots, screenshots, element search), interaction (click, type, keyboard shortcuts, drag), and system operations (clipboard, notifications, app management) with a deterministic ref system (@e1, @e2) scoped to stable snapshot IDs. Developers building AI agents that need to control real desktop applications — such as automating Slack, Finder, or any macOS app — use it as a reliable, token-efficient alternative to screenshot-based automation.

Prerequisites

macOS 13.0 (Ventura) or later
Node.js 18+ (for npx installation) or Rust 1.78+ (for building from source)
macOS Accessibility permissions granted to your terminal application (System Settings > Privacy & Security > Accessibility)
An MCP client such as Claude Desktop

Install agent-desktop via npm

Install the agent-desktop CLI globally using npm. This makes the 'agent-desktop' command available system-wide.

npm install -g agent-desktop

Grant accessibility permissions

Agent Desktop requires macOS Accessibility API access. Run the permissions command to check current permission state and get guidance on what to enable.

agent-desktop permissions --request

Take your first accessibility snapshot

Capture a shallow accessibility tree overview of an application. The --skeleton flag limits depth to 3 levels for a fast, token-efficient overview.

agent-desktop snapshot --app Finder --skeleton --compact

Interact with an element using refs

Use the ref (@e12) and snapshot ID returned from a snapshot to click or interact with a specific element. Refs are deterministic within a snapshot session.

# First take a snapshot to get refs
agent-desktop snapshot --app Finder -i --compact

# Then click using the ref and snapshot ID
agent-desktop click @e12 --snapshot s8f3k2p9
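When an agent drives this step from code, it can help to validate the ref before shelling out. The following is a minimal Python sketch, not part of agent-desktop itself: the helper name click_command is hypothetical, the @e<number> ref pattern is an assumption inferred from the examples on this page, and the snapshot ID is treated as an opaque string.

```python
import re
from typing import List

# Assumed ref shape, inferred from examples such as @e1 and @e12.
REF_PATTERN = re.compile(r"@e\d+")


def click_command(ref: str, snapshot_id: str) -> List[str]:
    """Build the argv for `agent-desktop click <ref> --snapshot <id>`."""
    if not REF_PATTERN.fullmatch(ref):
        raise ValueError(f"not an element ref: {ref!r}")
    if not snapshot_id:
        raise ValueError("a snapshot ID is required; refs are scoped to a snapshot")
    return ["agent-desktop", "click", ref, "--snapshot", snapshot_id]


if __name__ == "__main__":
    # Print the argv; pass it to subprocess.run(...) to actually execute it.
    print(click_command("@e12", "s8f3k2p9"))
```

Returning an argument list rather than a shell string avoids quoting problems when the list is handed to subprocess.run.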

Configure the MCP server

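Add Agent Desktop to your MCP client configuration file (for example, claude_desktop_config.json for Claude Desktop or .cursor/mcp.json for Cursor) so AI assistants can invoke desktop automation commands as tools. The -y flag lets npx run without stopping for an install confirmation.

{
  "mcpServers": {
    "agent-desktop": {
      "command": "npx",
      "args": ["-y", "agent-desktop"]
    }
  }
}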
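If the client configuration file already lists other servers, the new entry should be merged in rather than pasted over them. Here is a minimal Python sketch of that merge, operating on an already-parsed config dict. The function name add_agent_desktop and the "other-server" entry are hypothetical, and the entry uses npx with the -y flag, which skips npx's install confirmation.

```python
import json
from copy import deepcopy

# Server entry for Agent Desktop, launched through npx.
AGENT_DESKTOP_ENTRY = {"command": "npx", "args": ["-y", "agent-desktop"]}


def add_agent_desktop(config: dict) -> dict:
    """Return a copy of an MCP client config with the agent-desktop entry added."""
    merged = deepcopy(config)  # leave the caller's dict untouched
    servers = merged.setdefault("mcpServers", {})
    servers["agent-desktop"] = deepcopy(AGENT_DESKTOP_ENTRY)
    return merged


if __name__ == "__main__":
    # "other-server" is a made-up entry standing in for an existing server.
    existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
    print(json.dumps(add_agent_desktop(existing), indent=2))
```

To apply it to a real file, load the file with json.load, pass the result through the function, and write it back with json.dump.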

Agent Desktop Examples

Prompts to try

Example prompts to use with Agent Desktop through an MCP client.

- "Take a snapshot of the currently focused application and describe its UI"
- "Open Safari, navigate to https://example.com, and take a screenshot"
- "Type 'Hello World' into the currently focused text field"
- "Press Cmd+S to save the current document in the active application"
- "List all currently running applications on my Mac"

Troubleshooting Agent Desktop

Commands fail with PERM_DENIED error code

Go to System Settings > Privacy & Security > Accessibility and add your terminal application (Terminal, iTerm2, or the app running your MCP client). Run 'agent-desktop permissions --request' again to verify the new state.

STALE_REF error when clicking an element

Refs are scoped to a snapshot ID and become invalid after the UI changes. Take a new snapshot to get fresh refs before interacting with elements.
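The recovery described above amounts to a small retry loop: take a fresh snapshot, locate the element again, and retry the click. The Python sketch below shows only that control flow. StaleRefError, locate, and click are hypothetical names supplied by the caller, and how STALE_REF actually surfaces in the CLI output is not specified here, so detecting it is left to the caller's wrapper.

```python
from typing import Callable, Tuple


class StaleRefError(Exception):
    """Hypothetical exception a caller's wrapper raises on a STALE_REF error."""


def click_with_refresh(
    locate: Callable[[], Tuple[str, str]],
    click: Callable[[str, str], None],
    max_attempts: int = 3,
) -> int:
    """Click an element, taking a new snapshot whenever its ref has gone stale.

    locate() takes a new snapshot and returns (ref, snapshot_id) for the target.
    click(ref, snapshot_id) performs the click and raises StaleRefError on STALE_REF.
    Returns the number of attempts used.
    """
    for attempt in range(1, max_attempts + 1):
        ref, snapshot_id = locate()  # fresh refs on every attempt
        try:
            click(ref, snapshot_id)
            return attempt
        except StaleRefError:
            continue  # the UI changed between snapshot and click; try again
    raise StaleRefError(f"ref still stale after {max_attempts} attempts")


if __name__ == "__main__":
    # Fake callables: the first snapshot goes stale, the second succeeds.
    snapshots = iter([("@e12", "s1"), ("@e7", "s2")])

    def fake_click(ref: str, snapshot_id: str) -> None:
        if snapshot_id == "s1":
            raise StaleRefError()

    print(click_with_refresh(lambda: next(snapshots), fake_click))
```

The cap on attempts keeps an agent from looping forever on a UI that never settles.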

Snapshots are very large and slow when --skeleton is not used

Use --skeleton for an initial overview (78-96% token reduction), then use --root @eN --snapshot <id> to drill into a specific sub-tree. Add --compact to omit empty nodes and --interactive-only (-i) to limit output to actionable elements.
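The overview-then-drill-down pattern can be expressed as one small command builder. This is a Python sketch under stated assumptions: the function name snapshot_command is hypothetical, it uses only the flags named on this page, it assumes --root must always be paired with --snapshot, and it leaves --app optional because the page does not say whether a drill-down call needs it.

```python
from typing import List, Optional


def snapshot_command(
    app: Optional[str] = None,
    skeleton: bool = False,
    root: Optional[str] = None,
    snapshot_id: Optional[str] = None,
    compact: bool = True,
    interactive_only: bool = False,
) -> List[str]:
    """Build the argv for an `agent-desktop snapshot` call."""
    # Assumption: a sub-tree root is only meaningful with its snapshot ID.
    if (root is None) != (snapshot_id is None):
        raise ValueError("--root and --snapshot must be given together")
    argv = ["agent-desktop", "snapshot"]
    if app:
        argv += ["--app", app]
    if skeleton:
        argv.append("--skeleton")  # shallow overview
    if root is not None:
        argv += ["--root", root, "--snapshot", snapshot_id]  # drill into a sub-tree
    if compact:
        argv.append("--compact")  # omit empty nodes
    if interactive_only:
        argv.append("--interactive-only")  # actionable elements only
    return argv


if __name__ == "__main__":
    # Step 1: shallow overview of the app.
    print(snapshot_command(app="Finder", skeleton=True))
    # Step 2: expand one sub-tree found in the overview.
    print(snapshot_command(root="@e4", snapshot_id="s8f3k2p9", interactive_only=True))
```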

Frequently Asked Questions about Agent Desktop

What is Agent Desktop?

Agent Desktop is a Model Context Protocol (MCP) server that provides native desktop automation for AI agents. It controls any application through OS accessibility trees, with structured JSON output and deterministic element refs, and it connects AI assistants to desktop applications through a standardized interface.

How do I install Agent Desktop?

Install the CLI globally with npm install -g agent-desktop, or run it on demand with npx agent-desktop, then add the server config to your AI client. To build from source instead, follow the instructions on the Agent Desktop GitHub repository.

Which AI clients work with Agent Desktop?

Agent Desktop works with all major MCP-compatible AI clients including Claude Desktop, Claude Code, Cursor, VS Code (GitHub Copilot), Windsurf, and Cline.

Is Agent Desktop free to use?

Yes, Agent Desktop is open source and available under the Apache-2.0 license. You can use it freely in both personal and commercial projects.

Agent Desktop Alternatives — Similar Browser Automation Servers

Looking for alternatives to Agent Desktop? Here are other popular browser automation servers you can use with Claude, Cursor, and VS Code.

Chrome DevTools MCP

★ 40.6k

AI-powered Chrome automation server with natural language element detection. Control Chrome browser through MCP protocol for testing, debugging, and performance analysis. Features 91% accuracy in element location, works with free AI models, and suppo

UI TARS Desktop

★ 34.9k

Browser automation capabilities using Puppeteer, with support for both local and remote browser connections.

Playwright

★ 32.8k

A production-ready browser automation server that enables AI assistants to interact with web pages using tools for navigation, element interaction, and data extraction. It features a built-in Inspector UI and robust crash recovery for reliable automa

Page Agent

★ 18.0k

JavaScript in-page GUI agent. Control web interfaces with natural language.

Chrome

★ 11.7k

An extension-based MCP server that enables AI assistants to control your browser, leveraging existing sessions and login states for automation and content analysis. It provides over 20 tools for semantic tab search, interactive element manipulation,

LAMDA

★ 7.8k

The most powerful Android RPA agent framework, next generation mobile automation.

Quick Config Preview

{
  "mcpServers": {
    "agent-desktop": {
      "command": "npx",
      "args": ["-y", "agent-desktop"]
    }
  }
}

Add this to your claude_desktop_config.json or .cursor/mcp.json
