WebClaw

v1.0.0 • Search & Data Extraction • stable

Web content extraction for AI agents. 10 tools: scrape, crawl, map, batch, extract, summarize, diff, brand, search, research.

Tags: ai, ai-agents, ai-scraping, cli, crawler
Stars: 1,175
Downloads: 0
Weekly: 0

What is WebClaw?

WebClaw is a Model Context Protocol (MCP) server that gives AI assistants and clients such as Claude, Cursor, and VS Code a set of web content extraction tools. It provides 10 tools: scrape, crawl, map, batch, extract, summarize, diff, brand, search, and research.


This server falls under the Search & Data Extraction category on MCPgee, the world's largest MCP server directory with 33,000+ servers.

Features

  • MCP protocol support

Use Cases

  • Web content extraction and scraping
  • Crawl, map, and summarize web data for AI

Maintainer: 0xMassi
License: AGPL-3.0
Language: Rust
Version: v1.0.0
Updated: May 22, 2026
Status: healthy
Maintenance: active

Works with

Claude, OpenAI, Windows, macOS, Linux

Installation

Manual Installation

npx webclaw

Configuration

Configuration Details

Config File

claude_desktop_config.json

Performance

Response Metrics

Response Time: < 200ms
Throughput: Medium

Resource Usage

Memory Usage: Low
CPU Usage: Low

How to Set Up and Use WebClaw

WebClaw is a Rust-based web content extraction tool and MCP server that provides 10 specialized tools for scraping, crawling, mapping, batching, extracting, summarizing, diffing, brand analysis, web search, and multi-source research, all delivering output in AI-ready markdown format. It runs locally on macOS, Linux, and Windows, supports proxy pools for large-scale collection, and integrates with local LLMs (Ollama) or cloud providers (OpenAI, Anthropic) for content processing. AI agents and developers use it as a Firecrawl alternative that can be self-hosted and embedded directly in automated pipelines.

Prerequisites

  • A system supported by the Rust binary: macOS (Apple Silicon or Intel), Linux, or Windows
  • Node.js 18+ if using the npx installer (npx create-webclaw)
  • Optional: WEBCLAW_API_KEY for hosted API features (search and research tools)
  • Optional: OPENAI_API_KEY or ANTHROPIC_API_KEY for LLM-assisted extraction and summarization
  • An MCP-compatible client such as Claude Desktop or Claude Code

1. Install WebClaw

The fastest installation uses the npx-based setup wizard. Alternatively, install via Homebrew on macOS, download a binary from GitHub Releases, or build from source with Cargo.

# Quickest setup
npx create-webclaw

# macOS via Homebrew
brew install webclaw

# Build from source
cargo install webclaw

2. Configure API keys (optional)

Set environment variables for the capabilities you want. Cloud LLM keys enable AI-powered extraction and summarization; WEBCLAW_API_KEY unlocks the hosted search and research tools.

export WEBCLAW_API_KEY="your-webclaw-api-key"
export OPENAI_API_KEY="your-openai-key"
export ANTHROPIC_API_KEY="your-anthropic-key"
export OLLAMA_HOST="http://localhost:11434"  # for local LLM
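
Before starting the server it can be handy to see which optional capabilities the current environment would enable. The sketch below simply encodes the mapping described in this step; the function name enabled_capabilities and the description strings are illustrative and are not part of WebClaw.

```python
import os

# Mapping taken from this guide: which variable enables which capability.
CAPABILITIES = {
    "WEBCLAW_API_KEY": "hosted search and research tools",
    "OPENAI_API_KEY": "LLM-assisted extraction and summarization (OpenAI)",
    "ANTHROPIC_API_KEY": "LLM-assisted extraction and summarization (Anthropic)",
    "OLLAMA_HOST": "local LLM via Ollama",
}

def enabled_capabilities(env):
    """Return descriptions for every variable that is set to a non-empty value."""
    return [desc for name, desc in CAPABILITIES.items() if env.get(name)]

if __name__ == "__main__":
    for desc in enabled_capabilities(os.environ):
        print("enabled:", desc)
```

Passing the environment in as a mapping keeps the check easy to test; call it with os.environ for the real shell environment.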

3. Test WebClaw from the command line

Verify the installation by scraping a URL. The default output format is markdown, which is optimized for feeding directly into LLM context.

webclaw https://example.com --format markdown
webclaw https://docs.anthropic.com --format llm
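
Since WebClaw is described as embeddable in automated pipelines, here is a minimal Python sketch of calling the CLI from a script. It uses only the invocation shown above (a URL plus --format). The helper names build_scrape_command and scrape are hypothetical, and the accepted formats are limited to the two values that appear in this guide; the CLI may support others.

```python
import subprocess

# Only the two format values shown in this guide; the CLI may accept others.
KNOWN_FORMATS = {"markdown", "llm"}

def build_scrape_command(url, fmt="markdown"):
    """Build the argv list for `webclaw <url> --format <fmt>`."""
    if fmt not in KNOWN_FORMATS:
        raise ValueError(f"unknown format: {fmt!r}")
    return ["webclaw", url, "--format", fmt]

def scrape(url, fmt="markdown", timeout=60):
    """Run the CLI and return its stdout; raises CalledProcessError on a non-zero exit."""
    result = subprocess.run(
        build_scrape_command(url, fmt),
        capture_output=True,
        text=True,
        check=True,
        timeout=timeout,
    )
    return result.stdout
```

Building the argument list separately from running it avoids shell quoting problems with URLs and makes the command easy to log or test without network access.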

4. Add WebClaw to your MCP client configuration

Register WebClaw as an MCP server so your AI assistant can call its 10 tools directly. The server binary communicates over stdio.
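
To make "communicates over stdio" more concrete, the sketch below shows roughly what an MCP client writes to a server's standard input: JSON-RPC 2.0 messages, one per line. MCP clients such as Claude Desktop do this for you, so this is purely illustrative. The protocol version string and the client name are assumptions, and nothing here is specific to WebClaw.

```python
import json

def frame_message(message):
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    return json.dumps(message, separators=(",", ":")) + "\n"

def initialize_request(request_id=1):
    """First request a client sends when opening an MCP session."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": "2024-11-05",  # assumed protocol revision
            "capabilities": {},
            "clientInfo": {"name": "example-client", "version": "0.1.0"},  # placeholder
        },
    }

def list_tools_request(request_id=2):
    """Request asking the server to enumerate the tools it exposes."""
    return {"jsonrpc": "2.0", "id": request_id, "method": "tools/list"}
```

In a real session the client waits for the initialize response and sends a notifications/initialized notification before issuing tools/list.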


5. Test crawling and batch operations

Use the crawl tool to follow links from a starting URL, or batch to process multiple URLs in parallel. Set --depth and --max-pages to control scope.

webclaw https://docs.rust-lang.org --crawl --depth 2 --max-pages 50
webclaw --batch https://site1.com https://site2.com https://site3.com
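
To show how a depth limit and a page limit typically interact, here is a small breadth-first traversal over an in-memory link graph. This is a generic sketch, not WebClaw's crawler: the function bounded_crawl and the toy graph are made up for illustration, and the exact semantics of --depth and --max-pages may differ.

```python
from collections import deque

def bounded_crawl(start, links, max_depth, max_pages):
    """Breadth-first walk of `links` (url -> list of urls) that stops at either limit.

    The start page is always included and counts toward max_pages.
    """
    visited = [start]
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth >= max_depth:
            continue  # pages at the depth limit are kept, but their links are not followed
        for nxt in links.get(url, []):
            if nxt in seen:
                continue
            if len(visited) >= max_pages:
                return visited  # page budget exhausted
            seen.add(nxt)
            visited.append(nxt)
            queue.append((nxt, depth + 1))
    return visited

# Toy site: a links to b and c, b links to d, c links to d and e, d links to f.
SITE = {"a": ["b", "c"], "b": ["d"], "c": ["d", "e"], "d": ["f"]}
```

With this graph, a depth of 1 reaches only a, b, and c, while a depth of 2 also reaches d and e; f sits three links away and is never visited.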

WebClaw Examples

Client configuration

Add WebClaw to claude_desktop_config.json. Pass API keys through the env block for LLM-assisted tools.

{
  "mcpServers": {
    "webclaw": {
      "command": "webclaw",
      "args": ["--mcp"],
      "env": {
        "WEBCLAW_API_KEY": "your-webclaw-api-key",
        "OPENAI_API_KEY": "your-openai-api-key"
      }
    }
  }
}
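
If the config file already lists other MCP servers, the WebClaw entry has to be merged in rather than pasted over them. The sketch below does that merge in Python. The function names are hypothetical, the entry mirrors the JSON above, and the config path is left as a parameter because its location depends on your client and operating system.

```python
import json
from pathlib import Path

def add_webclaw_server(config, env=None):
    """Return a copy of `config` with a 'webclaw' entry added under mcpServers."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["webclaw"] = {
        "command": "webclaw",
        "args": ["--mcp"],
        "env": dict(env or {}),
    }
    merged["mcpServers"] = servers
    return merged

def update_config_file(path, env=None):
    """Read the JSON config at `path` (if it exists), merge the entry, and write it back."""
    path = Path(path)
    config = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(add_webclaw_server(config, env), indent=2) + "\n")
```

Existing servers and unrelated top-level settings are preserved; only the webclaw key is added or replaced.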

Prompts to try

Example prompts using WebClaw's 10 extraction and research tools.

- "Scrape the pricing page at stripe.com/pricing and extract the plan names and prices as structured data"
- "Crawl the Anthropic documentation starting at docs.anthropic.com and summarize what changed recently"
- "Map all URLs on example.com without downloading page content to see the site structure"
- "Extract the brand colors, fonts, and logo from github.com"
- "Diff the homepage of my-site.com against last week's snapshot and report what changed"

Troubleshooting WebClaw

webclaw command not found after installation

If installed via Cargo, ensure ~/.cargo/bin is in your PATH: add `export PATH="$HOME/.cargo/bin:$PATH"` to your shell profile. For Homebrew, run `brew link webclaw`.
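
A quick way to confirm whether the binary is reachable is to resolve it the same way the shell would. The sketch below uses Python's standard shutil.which; the helper name find_on_path is hypothetical, and checking ~/.cargo/bin as an extra directory follows the Cargo advice above.

```python
import os
import shutil

def find_on_path(command, extra_dirs=()):
    """Return the resolved path of `command`, also searching `extra_dirs`, or None."""
    parts = [*extra_dirs, *os.environ.get("PATH", "").split(os.pathsep)]
    search = os.pathsep.join(p for p in parts if p)
    return shutil.which(command, path=search)

if __name__ == "__main__":
    cargo_bin = os.path.expanduser("~/.cargo/bin")
    print(find_on_path("webclaw", [cargo_bin]) or "webclaw not found")
```

If the binary is found only when the extra directory is included, that directory is missing from PATH.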

Scrape returns empty content or only navigation/footer text

Add --only-main-content to the scrape command to strip boilerplate. For JavaScript-heavy pages, WebClaw may need to use its headless browser mode; check the documentation for the --js flag.

Search and research tools return 'API key required' errors

The search and research tools require a WEBCLAW_API_KEY from the hosted WebClaw service. Set this key in your environment or in the MCP server env block. The other 8 tools (scrape, crawl, map, batch, extract, summarize, diff, brand) work without an API key.

Frequently Asked Questions about WebClaw

What is WebClaw?

WebClaw is a Model Context Protocol (MCP) server for web content extraction for AI agents. It provides 10 tools (scrape, crawl, map, batch, extract, summarize, diff, brand, search, and research) and connects AI assistants to external tools and data sources through a standardized interface.

How do I install WebClaw?

Run npx create-webclaw for the guided setup, or install with Homebrew, Cargo, or a binary from GitHub Releases as described in the setup steps above. Then add the server config to your AI client.

Which AI clients work with WebClaw?

WebClaw works with all major MCP-compatible AI clients including Claude Desktop, Claude Code, Cursor, VS Code (GitHub Copilot), Windsurf, and Cline.

Is WebClaw free to use?

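Yes. WebClaw is open source under the AGPL-3.0 license and can be used at no cost in personal and commercial projects. AGPL-3.0 is a copyleft license: if you modify WebClaw and distribute it or offer it to users over a network, you must make the modified source available under the same license. The hosted search and research tools additionally require a WEBCLAW_API_KEY.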


Quick Config Preview

{ "mcpServers": { "webclaw": { "command": "npx", "args": ["-y", "webclaw"] } } }

Add this to your claude_desktop_config.json or .cursor/mcp.json
