WebClaw
Web content extraction for AI agents. 10 tools: scrape, crawl, map, batch, extract, summarize, diff, brand, search, research.
What is WebClaw?
WebClaw is a Model Context Protocol (MCP) server that gives AI assistants and clients such as Claude, Cursor, and VS Code a set of web content extraction tools: scrape, crawl, map, batch, extract, summarize, diff, brand, search, and research.
This server falls under the Search & Data Extraction category on MCPgee, the world's largest MCP server directory with 33,000+ servers.
Features
- MCP protocol support
Installation
Manual Installation
npx webclaw
Configuration
Configuration Details
claude_desktop_config.json
How to Set Up and Use WebClaw
WebClaw is a Rust-based web content extraction tool and MCP server that provides 10 specialized tools for scraping, crawling, mapping, batching, extracting, summarizing, diffing, brand analysis, web search, and multi-source research, all delivering output in AI-ready markdown format. It runs locally on macOS, Linux, and Windows, supports proxy pools for large-scale collection, and integrates with local LLMs (Ollama) or cloud providers (OpenAI, Anthropic) for content processing. AI agents and developers use it as a Firecrawl alternative that can be self-hosted and embedded directly in automated pipelines.
Prerequisites
- A system supported by the Rust binary: macOS (Apple Silicon or Intel), Linux, or Windows
- Node.js 18+ if using the npx installer (npx create-webclaw)
- Optional: WEBCLAW_API_KEY for hosted API features (search and research tools)
- Optional: OPENAI_API_KEY or ANTHROPIC_API_KEY for LLM-assisted extraction and summarization
- An MCP-compatible client such as Claude Desktop or Claude Code
Install WebClaw
The fastest installation uses the npx-based setup wizard. Alternatively, install via Homebrew on macOS, download a binary from GitHub Releases, or build from source with Cargo.
# Quickest setup
npx create-webclaw
# macOS via Homebrew
brew install webclaw
# Build from source
cargo install webclaw
Configure API keys (optional)
Set environment variables for the capabilities you want. Cloud LLM keys enable AI-powered extraction and summarization; WEBCLAW_API_KEY unlocks the hosted search and research tools.
export WEBCLAW_API_KEY="your-webclaw-api-key"
export OPENAI_API_KEY="your-openai-key"
export ANTHROPIC_API_KEY="your-anthropic-key"
export OLLAMA_HOST="http://localhost:11434" # for local LLM
Test WebClaw from the command line
Verify the installation by scraping a URL. The default output format is markdown, which is optimized for feeding directly into LLM context.
webclaw https://example.com --format markdown
webclaw https://docs.anthropic.com --format llm
Add WebClaw to your MCP client configuration
Register WebClaw as an MCP server so your AI assistant can call its 10 tools directly. The server binary communicates over stdio.
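MCP's stdio transport exchanges newline-delimited JSON-RPC 2.0 messages, and the first message a client sends is an initialize request. The following is a minimal sketch of that message, not WebClaw's documented behavior: the helper name mcp_initialize_request and the clientInfo values are hypothetical, the protocol version string is one published MCP revision that the server may or may not negotiate, and the `webclaw --mcp` invocation is taken from the client configuration in this guide.

```shell
# Print a single-line JSON-RPC "initialize" request, as an MCP client would
# send it over stdio. The helper name and clientInfo values are hypothetical.
mcp_initialize_request() {
  printf '%s\n' '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"smoke-test","version":"0.0.1"}}}'
}

# Show the request. To try it against a server (assumption: the binary is
# installed and started with --mcp), pipe it in:
#   mcp_initialize_request | webclaw --mcp
mcp_initialize_request
```

Whether a response is printed before the server exits at end of input is not confirmed by this guide, so treat the piped form as a smoke test only.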
Test crawling and batch operations
Use the crawl tool to follow links from a starting URL, or batch to process multiple URLs in parallel. Set --depth and --max-pages to control scope.
webclaw https://docs.rust-lang.org --crawl --depth 2 --max-pages 50
webclaw --batch https://site1.com https://site2.com https://site3.com
WebClaw Examples
Client configuration
Add WebClaw to claude_desktop_config.json. Pass API keys through the env block for LLM-assisted tools.
{
  "mcpServers": {
    "webclaw": {
      "command": "webclaw",
      "args": ["--mcp"],
      "env": {
        "WEBCLAW_API_KEY": "your-webclaw-api-key",
        "OPENAI_API_KEY": "your-openai-api-key"
      }
    }
  }
}
Prompts to try
Example prompts using WebClaw's 10 extraction and research tools.
- "Scrape the pricing page at stripe.com/pricing and extract the plan names and prices as structured data"
- "Crawl the Anthropic documentation starting at docs.anthropic.com and summarize what changed recently"
- "Map all URLs on example.com without downloading page content to see the site structure"
- "Extract the brand colors, fonts, and logo from github.com"
- "Diff the homepage of my-site.com against last week's snapshot and report what changed"
Troubleshooting WebClaw
webclaw command not found after installation
If installed via Cargo, ensure ~/.cargo/bin is in your PATH: add `export PATH="$HOME/.cargo/bin:$PATH"` to your shell profile. For Homebrew, run `brew link webclaw`.
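Before editing a shell profile, it can help to confirm whether the Cargo bin directory is already a PATH entry. A minimal sketch, assuming a POSIX-style shell; path_contains is a hypothetical helper and not part of WebClaw.

```shell
# Succeed if the directory in $1 is an exact entry of the colon-separated
# list in $2 (defaults to the current PATH). Hypothetical helper.
path_contains() {
  local dir="$1" path_list="${2:-$PATH}"
  case ":${path_list}:" in
    *":${dir}:"*) return 0 ;;
    *) return 1 ;;
  esac
}

if path_contains "$HOME/.cargo/bin"; then
  echo "$HOME/.cargo/bin is on PATH"
else
  echo "$HOME/.cargo/bin is missing from PATH"
fi
```

Wrapping the list in colons makes the match exact, so a directory such as /opt does not falsely match an entry like /opt/bin.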
Scrape returns empty content or only navigation/footer text
Add --only-main-content to the scrape command to strip boilerplate. For JavaScript-heavy pages, WebClaw may need to use its headless browser mode; check the documentation for the --js flag.
Search and research tools return 'API key required' errors
The search and research tools require a WEBCLAW_API_KEY from the hosted WebClaw service. Set this key in your environment or in the MCP server env block. The other 8 tools (scrape, crawl, map, batch, extract, summarize, diff, brand) work without an API key.
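A quick way to see which of the variables named in this guide are visible to the current shell is sketched here. check_webclaw_env is a hypothetical helper, not a WebClaw command; it reports only whether each variable is set and never prints a key's value.

```shell
# Report whether each environment variable named in this guide is set.
# Hypothetical helper; values are never echoed.
check_webclaw_env() {
  local name value
  for name in WEBCLAW_API_KEY OPENAI_API_KEY ANTHROPIC_API_KEY OLLAMA_HOST; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -n "$value" ]; then
      echo "$name: set"
    else
      echo "$name: not set"
    fi
  done
}

check_webclaw_env
```

Note that an MCP client launches the server with its own environment, so a key exported in a terminal may still be missing there unless it is also listed in the env block of the client configuration.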
Frequently Asked Questions about WebClaw
What is WebClaw?
WebClaw is a Model Context Protocol (MCP) server for web content extraction for AI agents. It provides 10 tools: scrape, crawl, map, batch, extract, summarize, diff, brand, search, and research. It connects AI assistants to external tools and data sources through a standardized interface.
How do I install WebClaw?
Run npx create-webclaw for the setup wizard, install via Homebrew on macOS, download a binary from GitHub Releases, or build from source with Cargo. Then add the server config to your AI client.
Which AI clients work with WebClaw?
WebClaw works with all major MCP-compatible AI clients including Claude Desktop, Claude Code, Cursor, VS Code (GitHub Copilot), Windsurf, and Cline.
Is WebClaw free to use?
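Yes, WebClaw is open source under the AGPL-3.0 license. Personal and commercial use are both permitted under that license's copyleft terms, which require offering source code to users of a modified version, including users who interact with it over a network.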
WebClaw Alternatives: Similar Search & Data Extraction Servers
Looking for alternatives to WebClaw? Here are other popular search & data extraction servers you can use with Claude, Cursor, and VS Code.
TrendRadar
★ 58.0k
A real-time hotspot monitoring and news aggregation assistant that provides AI-powered analysis of trending topics across multiple platforms via the Model Context Protocol. It enables users to track news and receive automated notifications through va
Scrapling
★ 52.7k
🕷️ An adaptive Web Scraping framework that handles everything from a single request to a full-scale crawl!
PDF Math Translate
★ 33.9k
[EMNLP 2025 Demo] PDF scientific paper translation with preserved formats. AI-based full-text bilingual translation of PDF documents that fully preserves layout; supports Google/DeepL/Ollama/OpenAI and other services; provides CLI/GUI/MCP/Docker/Zotero
GPT Researcher
★ 27.2k
An autonomous agent that conducts deep research on any data using any LLM providers
Agent Reach
★ 20.1k
Give your AI agent eyes to see the entire internet. Read & search Twitter, Reddit, YouTube, GitHub, Bilibili, XiaoHongShu: one CLI, zero API fees.
Xiaohongshu
★ 13.7k
MCP for xiaohongshu.com
Browse More Search & Data Extraction MCP Servers
Explore all search & data extraction servers available in the MCPgee directory. Each server includes setup guides for Claude, Cursor, and VS Code.