How do I install Web Scraping MCP Server?

Follow the setup instructions on the Web Scraping GitHub repository, then add the server configuration to your AI client.

What category is Web Scraping MCP Server?

Web Scraping is categorized under Search & Data Extraction. Browse more servers in these categories on MCPgee.

Web Scraping

Name: Web Scraping MCP Server
Author: MaitreyaM

v1.0.0 • Search & Data Extraction • stable

MCP Server leveraging crawl4ai for web scraping and LLM-based content extraction (Markdown, text snippets, smart extraction). Designed for AI agent integration.

Tags: web-scraping-mcp, mcp, ai-integration

What is Web Scraping?

Web Scraping is a Model Context Protocol (MCP) server that lets AI assistants and clients such as Claude, Cursor, and VS Code scrape web pages with crawl4ai and run LLM-based content extraction (Markdown, text snippets, smart extraction). It is designed for AI agent integration.

This server falls under the Search & Data Extraction category on MCPgee, the world's largest MCP server directory with 33,000+ servers.

Features

- Full-page Markdown scraping with crawl4ai
- Targeted text snippet extraction
- LLM-based smart extraction of structured data (uses Google Gemini)

Use Cases

Scrape web content and extract data using crawl4ai and LLM processing.

Convert web pages to Markdown for AI analysis.

Automate content extraction for agent-integrated workflows.

Maintainer: MaitreyaM
License: MIT
Language: Python
Version: v1.0.0
Updated: Mar 7, 2026
Status: healthy
Maintenance: active

Works with

Claude, OpenAI, Windows, macOS, Linux

Installation

Manual Installation

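The server is a Python project that runs from source: clone the repository, install the requirements, and start server.py, as described under "How to Set Up and Use Web Scraping". The `npx web-scraping-mcp` command listed for this server does not match that Python workflow.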

Configuration

Config file: claude_desktop_config.json

Performance

Response Metrics

Response Time: < 200ms
Throughput: Medium

Resource Usage

Memory Usage: Low
CPU Usage: Low

How to Set Up and Use Web Scraping

The Web Scraping MCP Server uses crawl4ai to fetch and process web pages, exposing three tools to AI assistants: a full-page Markdown scraper, a targeted text snippet extractor, and an LLM-powered smart extractor that uses Google Gemini to return structured data from any URL based on natural language instructions. It is designed for AI agent integration and runs as an SSE server, making it easy to connect to Claude Desktop and other MCP clients over HTTP.

Prerequisites

- Python 3.9 or later with pip installed
- A Google Gemini API key (required for the smart_extract tool)
- Docker (optional, for containerized deployment)
- An MCP client such as Claude Desktop that supports SSE transport
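
The prerequisites above can be checked from a short script before cloning anything. This is a minimal sketch, not part of the project: the function name `check_prerequisites` is hypothetical, and it only looks for the tools on your PATH.

```python
import shutil
import sys

MIN_PYTHON = (3, 9)  # minimum version stated in the prerequisites


def check_prerequisites(version_info=sys.version_info, which=shutil.which):
    """Return a dict mapping each prerequisite to True (found) or False."""
    return {
        "python>=3.9": tuple(version_info[:2]) >= MIN_PYTHON,
        "pip": which("pip") is not None or which("pip3") is not None,
        "docker (optional)": which("docker") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'found' if ok else 'missing'}")
```

A missing Docker entry is not a problem unless you plan to use the containerized deployment.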

Clone the repository

Clone the web scraping MCP server repository from GitHub.

git clone https://github.com/MaitreyaM/WEB-SCRAPING-MCP.git
cd WEB-SCRAPING-MCP

Create a virtual environment and install dependencies

Set up an isolated Python environment and install the required packages including crawl4ai and FastMCP.

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt

Configure environment variables

Create a .env file in the project root with your Google Gemini API key. This is required for the smart_extract tool to function.

# .env
GOOGLE_API_KEY=your-google-gemini-api-key-here
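
To confirm the .env file is readable and the key is no longer the placeholder, a small standard-library check along these lines may help. It is a sketch under assumptions: the helper names `read_env_file` and `has_gemini_key` are hypothetical, and the parser handles only simple KEY=VALUE lines, which may differ from how the server itself loads the file.

```python
from pathlib import Path

PLACEHOLDER = "your-google-gemini-api-key-here"


def read_env_file(path=".env"):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("\"'")
    return values


def has_gemini_key(path=".env"):
    """True if GOOGLE_API_KEY is set and is not the placeholder value."""
    key = read_env_file(path).get("GOOGLE_API_KEY", "")
    return bool(key) and key != PLACEHOLDER


if __name__ == "__main__":
    print("GOOGLE_API_KEY configured:", has_gemini_key())
```

Run it from the project root so that the relative `.env` path resolves to the same file the server will see.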

Start the MCP server

Run the server script. It starts an SSE server on port 8002 by default. Keep this terminal open while using Claude.

python server.py
# Server listens at http://127.0.0.1:8002/sse

# Or with Docker:
docker build -t crawl4ai-mcp-server .
docker run -it --rm -p 8002:8002 --env-file .env crawl4ai-mcp-server
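
Once the server is started, you can verify that something is listening on the default port before configuring a client. The sketch below only tests TCP reachability with the standard library; `is_port_open` is a hypothetical helper name, and a successful connection does not prove that the process on that port is this MCP server.

```python
import socket


def is_port_open(host="127.0.0.1", port=8002, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    state = "listening" if is_port_open() else "not reachable"
    print(f"127.0.0.1:8002 is {state}")
```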

Configure your MCP client

Add the web scraping server to your Claude Desktop configuration using the SSE transport URL.

{
  "mcpServers": {
    "web-scraping-mcp": {
      "url": "http://127.0.0.1:8002/sse"
    }
  }
}
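
If your client configuration already lists other servers, the entry should be merged into the existing file rather than replacing it. The following sketch shows one way to do that with the standard library. The function names are hypothetical, and the path to the config file is left to the caller because its location depends on your client and operating system.

```python
import json
from pathlib import Path

SERVER_NAME = "web-scraping-mcp"
SERVER_ENTRY = {"url": "http://127.0.0.1:8002/sse"}


def add_server(config):
    """Return a copy of config with the server entry added under mcpServers."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[SERVER_NAME] = dict(SERVER_ENTRY)
    merged["mcpServers"] = servers
    return merged


def update_config_file(path):
    """Read the JSON config at path (if present), add the entry, write it back."""
    config_path = Path(path)
    existing = {}
    if config_path.is_file():
        existing = json.loads(config_path.read_text(encoding="utf-8"))
    updated = json.dumps(add_server(existing), indent=2) + "\n"
    config_path.write_text(updated, encoding="utf-8")
```

Existing entries under `mcpServers` and any other top-level settings are preserved. Back up the file before running anything that rewrites it.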

Web Scraping Examples

Prompts to try

These prompts exercise all three tools: full-page Markdown scraping, text snippet extraction, and LLM-powered structured extraction.

- "Scrape the content of https://example.com and give me a summary"
- "Find text snippets about pricing on https://some-saas-product.com"
- "Extract the names, prices, and descriptions of all products listed on https://shop.example.com using smart extraction"
- "Convert the blog post at https://blog.example.com/post to Markdown format"
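
Behind a prompt like the third one, the MCP client sends the server a JSON-RPC `tools/call` request. The sketch below builds such a message to show its general shape. The tool name `smart_extract` comes from this page, but the argument names `url` and `instruction` are assumptions made for illustration; the server's actual tool schema may use different names.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


request = build_tool_call(
    1,
    "smart_extract",
    {
        # Hypothetical argument names; check the server's tool listing.
        "url": "https://shop.example.com",
        "instruction": "Extract the names, prices, and descriptions of all products",
    },
)
print(json.dumps(request, indent=2))
```

In normal use the client builds and sends this message for you; it is shown here only to make clear what the assistant does when it picks a tool.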

Troubleshooting Web Scraping

smart_extract tool fails with an API key error

Ensure GOOGLE_API_KEY is set in your .env file and the file is in the same directory where you run the server. The smart_extract tool uses Google Gemini and will not function without a valid API key. Get a key at https://aistudio.google.com/.

Claude Desktop cannot connect to the SSE server

Verify the server is running by opening http://127.0.0.1:8002/sse in your browser; you should see an SSE event stream. If the server is not running, start it with `python server.py`. Also confirm the URL in your Claude Desktop config matches exactly (including the /sse path).
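
The same check can be done without a browser. The sketch below reads the first event from the stream using only the standard library. The helper names `parse_sse` and `first_event` are hypothetical, and the parser covers only the `event` and `data` fields of the SSE format. MCP servers using SSE transport typically send an `endpoint` event first, but treat that as an expectation rather than a guarantee for this server.

```python
import urllib.request


def parse_sse(lines):
    """Yield (event, data) tuples from an iterable of SSE text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


def first_event(url="http://127.0.0.1:8002/sse", timeout=5):
    """Connect to the SSE endpoint and return the first (event, data) pair."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        text_lines = (raw.decode("utf-8") for raw in response)
        return next(parse_sse(text_lines), None)


if __name__ == "__main__":
    print(first_event())
```

A connection error here points to the server not running or to a wrong host or port, while a timeout with no event suggests the path is not an SSE endpoint.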

crawl4ai fails to fetch JavaScript-heavy pages

crawl4ai supports Playwright for JavaScript rendering. Run `playwright install` after installing requirements to ensure browser drivers are available for pages that require JavaScript execution.
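
Before running `playwright install`, it can be useful to confirm that the Python packages are present in the active virtual environment. This sketch checks importability only; the helper name `module_available` is hypothetical, and the check says nothing about whether the browser binaries themselves have been downloaded.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module can be found in this environment."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    for module in ("crawl4ai", "playwright"):
        status = "installed" if module_available(module) else "not installed"
        print(f"{module}: {status}")
```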

Frequently Asked Questions about Web Scraping

What is Web Scraping?

Web Scraping is a Model Context Protocol (MCP) server that uses crawl4ai for web scraping and LLM-based content extraction (Markdown, text snippets, smart extraction). It is designed for AI agent integration and connects AI assistants to external tools and data sources through a standardized interface.

How do I install Web Scraping?

Follow the installation instructions on the Web Scraping GitHub repository. Clone the repo, install dependencies, and add the server config to your AI client.

Which AI clients work with Web Scraping?

Web Scraping works with all major MCP-compatible AI clients including Claude Desktop, Claude Code, Cursor, VS Code (GitHub Copilot), Windsurf, and Cline.

Is Web Scraping free to use?

Yes, Web Scraping is open source and available under the MIT license. You can use it freely in both personal and commercial projects.

Web Scraping Alternatives — Similar Search & Data Extraction Servers

Looking for alternatives to Web Scraping? Here are other popular search & data extraction servers you can use with Claude, Cursor, and VS Code.

TrendRadar

★ 58.0k

A real-time hotspot monitoring and news aggregation assistant that provides AI-powered analysis of trending topics across multiple platforms via the Model Context Protocol. It enables users to track news and receive automated notifications through va

Scrapling

★ 52.7k

🕷️ An adaptive Web Scraping framework that handles everything from a single request to a full-scale crawl!

PDF Math Translate

★ 33.9k

[EMNLP 2025 Demo] PDF scientific paper translation with preserved formats - AI-based bilingual full-text translation of PDF documents that fully preserves the layout; supports services such as Google/DeepL/Ollama/OpenAI and provides CLI/GUI/MCP/Docker/Zotero

GPT Researcher

★ 27.2k

An autonomous agent that conducts deep research on any data using any LLM providers

Agent Reach

★ 20.1k

Give your AI agent eyes to see the entire internet. Read & search Twitter, Reddit, YouTube, GitHub, Bilibili, XiaoHongShu — one CLI, zero API fees.

Xiaohongshu

★ 13.7k

MCP for xiaohongshu.com

Browse More Search & Data Extraction MCP Servers

Explore all search & data extraction servers available in the MCPgee directory. Each server includes setup guides for Claude, Cursor, and VS Code.

Set Up Web Scraping in Your Editor

Choose your AI client for step-by-step setup instructions.

- Claude Desktop: macOS & Windows app
- Claude Code: CLI & terminal
- Cursor: AI-first code editor
- VS Code: GitHub Copilot MCP
- Windsurf: Codeium AI editor
- Cline: VS Code extension

Quick Config Preview

{
  "mcpServers": {
    "web-scraping-mcp": {
      "url": "http://127.0.0.1:8002/sse"
    }
  }
}

Add this to your claude_desktop_config.json or .cursor/mcp.json
