Is the Puppeteer MCP server free to use?

Yes, the Puppeteer MCP server is completely free and open source. It runs a local Chrome browser on your machine, so there are no API costs. The only cost is the local compute resources (CPU, RAM) that Chrome uses.

Can I use Puppeteer MCP to scrape any website?

Technically yes, but you should always respect the target website's terms of service and robots.txt file. Some websites explicitly prohibit automated scraping. Use Puppeteer responsibly and only for legitimate purposes like testing your own sites, authorized data collection, or publicly available information.

Does Puppeteer MCP work on Windows?

Yes, the Puppeteer MCP server works on Windows, macOS, and Linux. On Windows, make sure Node.js is installed and npx is available in your PATH. See our Windows setup guide for MCP-specific Windows configuration tips.

How much memory does the Puppeteer MCP server use?

The Puppeteer MCP server uses 100-300 MB of RAM depending on the pages you visit. Complex pages with many images and JavaScript use more memory. The Chrome process stays running between requests, so memory usage persists until you restart the server.

Can I use Puppeteer MCP with a logged-in browser session?

Yes. Start Chrome with remote debugging enabled (--remote-debugging-port=9222), log in to the sites you need, then configure the MCP server to connect to that port. The server will use your existing session with all cookies and login state intact.

Should I use Puppeteer or Brave Search MCP for web research?

Use Brave Search for general web searches where you need text results from multiple sites. Use Puppeteer when you need to visit a specific URL and interact with it: extract data from a specific page, take screenshots, fill forms, or handle JavaScript-heavy content. Many users run both servers simultaneously.

Web Scraping with Puppeteer MCP Server - Setup & Use Cases

Set up the Puppeteer MCP server for AI-powered web scraping. Covers data extraction, screenshots, PDF generation, and comparison with Firecrawl and Brave Search.

What Is the Puppeteer MCP Server?

The Puppeteer MCP server gives AI assistants direct control over a headless Chrome browser. Instead of just fetching HTML like a simple HTTP client, Puppeteer can render JavaScript, interact with dynamic pages, fill out forms, take screenshots, generate PDFs, and extract data from single-page applications that don't work with traditional scraping tools.

This makes it one of the most powerful MCP servers for web automation. You can ask your AI assistant to "go to this website, screenshot it, and extract all product prices" and it will use Puppeteer to do exactly that: navigate to the page, wait for dynamic content to load, and pull out structured data.

This guide covers setup, common use cases, anti-detection strategies, and how Puppeteer compares to other web-related MCP servers like Brave Search and Firecrawl.

Setup and Configuration

Basic Setup

The Puppeteer MCP server runs via npx and automatically downloads a compatible Chromium binary. Add it to your client configuration:

{
  "mcpServers": {
    "puppeteer": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-puppeteer"]
    }
  }
}

Advanced Configuration

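For more control, configure Chrome launch options via environment variables. Variable names differ between server versions (the reference implementation's README documents a single JSON-valued PUPPETEER_LAUNCH_OPTIONS variable), so confirm the names below against the README of the version you install: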

{
  "mcpServers": {
    "puppeteer": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-puppeteer"],
      "env": {
        "PUPPETEER_CHROME_PORT": "9222",
        "PUPPETEER_HEADLESS": "true",
        "PUPPETEER_LAUNCH_ARGS": "--no-sandbox --disable-setuid-sandbox --disable-dev-shm-usage"
      }
    }
  }
}

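The --no-sandbox flag disables Chrome's sandbox, so add it only where the sandbox cannot start: typically when Chrome runs as root (the default in many Docker containers) or on Linux hosts that restrict unprivileged user namespaces. It is not required on every Linux system. The --disable-dev-shm-usage flag makes Chrome write shared-memory files to /tmp instead of /dev/shm, which prevents crashes in containers where /dev/shm is small (Docker defaults to 64 MB).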
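
Because these flags are situational, one option is to assemble the argument string conditionally instead of hard-coding all three. The following is a minimal sketch, assuming a hypothetical helper named buildLaunchArgs and the PUPPETEER_LAUNCH_ARGS variable from this guide's config. The two booleans are inputs you set yourself, not something the server detects.

```typescript
// Hypothetical helper (not part of the MCP server): builds the value for the
// PUPPETEER_LAUNCH_ARGS variable shown in this guide's config.
interface LaunchEnvironment {
  sandboxUnavailable: boolean; // e.g. Chrome runs as root inside a container
  smallSharedMemory: boolean;  // e.g. a container with a small /dev/shm
}

function buildLaunchArgs(env: LaunchEnvironment): string {
  const args: string[] = [];
  if (env.sandboxUnavailable) {
    // Disables Chrome's sandbox; only appropriate for pages you trust.
    args.push("--no-sandbox", "--disable-setuid-sandbox");
  }
  if (env.smallSharedMemory) {
    // Chrome writes shared-memory files to /tmp instead of /dev/shm.
    args.push("--disable-dev-shm-usage");
  }
  return args.join(" ");
}

// A root-run container needs all three flags; a desktop machine needs none.
const dockerArgs = buildLaunchArgs({ sandboxUnavailable: true, smallSharedMemory: true });
const desktopArgs = buildLaunchArgs({ sandboxUnavailable: false, smallSharedMemory: false });
```

Keeping the desktop case empty leaves the sandbox enabled wherever it works.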

Using an Existing Chrome Instance

If you want Puppeteer to connect to an already-running Chrome browser (useful for maintaining login sessions), start Chrome with remote debugging enabled:

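# Start Chrome with remote debugging. Recent Chrome versions only honor the
# debugging port when a non-default --user-data-dir is given; the directory
# below is an example path, so log in to your sites inside this profile.
google-chrome --remote-debugging-port=9222 --user-data-dir="$HOME/chrome-debug-profile"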

Then configure the MCP server to connect to that port:

{
  "mcpServers": {
    "puppeteer": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-puppeteer"],
      "env": {
        "PUPPETEER_CHROME_PORT": "9222"
      }
    }
  }
}
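
Before pointing the server at the port, it helps to confirm that Chrome is actually listening. Chrome serves a small JSON document at http://127.0.0.1:9222/json/version when remote debugging is on. The sketch below only validates such a response body; the function name parseDevToolsVersion and the sample values are illustrative assumptions.

```typescript
// The two fields this sketch reads from Chrome's /json/version response.
interface DevToolsVersionInfo {
  browser: string;
  webSocketDebuggerUrl: string;
}

// Hypothetical helper: checks that a response body looks like a DevTools
// version document and returns the fields of interest.
function parseDevToolsVersion(body: string): DevToolsVersionInfo {
  const data: unknown = JSON.parse(body);
  if (typeof data !== "object" || data === null) {
    throw new Error("Response is not a JSON object");
  }
  const record = data as { [key: string]: unknown };
  const browser = record["Browser"];
  const wsUrl = record["webSocketDebuggerUrl"];
  if (typeof browser !== "string" || typeof wsUrl !== "string") {
    throw new Error("Missing Browser or webSocketDebuggerUrl; is remote debugging enabled?");
  }
  return { browser, webSocketDebuggerUrl: wsUrl };
}

// Made-up sample body in the shape Chrome returns.
const sampleBody = JSON.stringify({
  Browser: "Chrome/120.0.0.0",
  "Protocol-Version": "1.3",
  webSocketDebuggerUrl: "ws://127.0.0.1:9222/devtools/browser/example-id",
});
const versionInfo = parseDevToolsVersion(sampleBody);
```

On a live machine you would fetch the URL above and pass the response text to the function. An error here usually means Chrome was started without the debugging flag.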

Common Use Cases

1. Data Extraction

The most common use case is extracting structured data from websites. Unlike simple HTTP requests, Puppeteer renders JavaScript and waits for dynamic content, making it ideal for modern SPAs.

Example prompts you can use with your AI assistant:

"Go to [URL] and extract all product names, prices, and ratings into a table"
"Navigate to [URL], click the 'Load More' button 3 times, then extract all article titles and dates"
"Visit [URL] and extract the data from the chart/table on the page"

2. Screenshots

Puppeteer can take full-page screenshots or capture specific elements. This is useful for documentation, bug reporting, visual regression testing, and design review.

"Take a screenshot of [URL]"
"Screenshot the hero section of [URL] at mobile width (375px)"
"Take screenshots of [URL] at 3 different viewport sizes: mobile, tablet, desktop"
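
Prompts like these are more reproducible when you name exact viewport sizes and output files. As a rough sketch, the presets below pair the 375px mobile width and the 1920x1080 desktop size mentioned in this guide with assumed heights and an assumed tablet size; screenshotPlan is a hypothetical helper for generating consistent file names.

```typescript
interface ViewportPreset {
  name: string;
  width: number;
  height: number;
}

// Assumed presets: only the 375 width and 1920x1080 come from this guide.
const PRESETS: ViewportPreset[] = [
  { name: "mobile", width: 375, height: 667 },
  { name: "tablet", width: 768, height: 1024 },
  { name: "desktop", width: 1920, height: 1080 },
];

// Builds one output path per preset so repeated runs overwrite the same files.
function screenshotPlan(pageSlug: string, outputDir: string): { preset: ViewportPreset; file: string }[] {
  return PRESETS.map((preset) => ({
    preset,
    file: `${outputDir}/${pageSlug}-${preset.name}-${preset.width}x${preset.height}.png`,
  }));
}

const plan = screenshotPlan("pricing", "test/screenshots");
```

You can paste the resulting sizes and paths into the prompt so the assistant uses the same names on every run.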

3. PDF Generation

Convert web pages to PDF documents. Useful for archiving content, generating reports, or creating printable versions of web pages.

"Convert [URL] to a PDF"
"Generate a PDF of this invoice page with A4 paper size"

4. Form Interaction and Testing

Puppeteer can fill out forms, click buttons, and interact with page elements. This makes it useful for testing web applications or automating repetitive web tasks.

"Go to [URL], fill in the search form with 'MCP servers', and extract the results"
"Navigate to the login page, enter test credentials, and verify the dashboard loads"

5. Monitoring and Auditing

Use Puppeteer to check website status, verify content, or audit pages for issues:

"Check if [URL] loads correctly and report any console errors"
"Visit [URL] and tell me if the pricing has changed from [old price]"

Anti-Bot Detection and Evasion

Many websites detect and block headless browsers using JavaScript fingerprinting, behavioral analysis, and header inspection. If you are using Puppeteer for legitimate scraping (testing your own sites, authorized data collection, or publicly available information), these techniques help avoid false positive bot detection:

Detection Method	What It Checks	Countermeasure
User agent string	"HeadlessChrome" in the UA	Set a standard Chrome UA string
Viewport size	Default 800x600 headless viewport	Use 1920x1080 or 1440x900
navigator.webdriver	JS property set to true in headless	Override with evaluateOnNewDocument
Click timing	Instantaneous 0ms clicks	Add random 100-500ms delays
Missing plugins	Headless reports 0 browser plugins	Inject fake plugin array via JS

Handle cookies and headers: Accept cookie banners and send standard HTTP headers that real browsers send, including Accept-Language and Accept-Encoding.
Respect robots.txt: Always check and honor the site's robots.txt. Automated scraping that ignores robots.txt may violate terms of service and applicable laws.
Rate limit requests: Do not scrape hundreds of pages per minute from a single domain. Add delays between page navigations to avoid triggering rate limits.

Important: Always ensure your web scraping activities comply with the target website's terms of service and applicable laws. Use Puppeteer responsibly and ethically.
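
Two of the countermeasures above, random click delays and spacing between navigations, reduce to small pure functions. The sketch below is illustrative only: humanDelayMs and throttleWaitMs are hypothetical names, and the script string is one common way to override navigator.webdriver when passed to Puppeteer's page.evaluateOnNewDocument.

```typescript
// Random delay in [minMs, maxMs]. The random source is injectable so the
// function can be tested deterministically.
function humanDelayMs(minMs: number = 100, maxMs: number = 500, rng: () => number = Math.random): number {
  if (minMs > maxMs) throw new Error("minMs must not exceed maxMs");
  return Math.floor(minMs + rng() * (maxMs - minMs + 1));
}

// Milliseconds to wait before the next navigation to the same domain, given
// the time of the previous one and a minimum spacing between requests.
function throttleWaitMs(lastRequestAt: number | undefined, now: number, minSpacingMs: number): number {
  if (lastRequestAt === undefined) return 0; // first request to this domain
  return Math.max(0, lastRequestAt + minSpacingMs - now);
}

// Script text intended for page.evaluateOnNewDocument; makes the property
// report false, as it does in a browser that is not automated.
const HIDE_WEBDRIVER_SCRIPT =
  "Object.defineProperty(navigator, 'webdriver', { get: () => false });";

const clickDelay = humanDelayMs();
const waitBeforeNextPage = throttleWaitMs(1000, 3000, 5000);
```

A spacing of several seconds per domain keeps the request rate well under the hundreds-of-pages-per-minute range the list warns against.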

CSS Selector Strategies

When asking the AI to extract data from web pages, providing good selector hints improves extraction reliability. Prefer stable selectors over fragile ones:

Data attributes first: Elements with data-testid, data-id, or data-name are the most stable because they survive design changes.
ARIA roles: Selectors like [role="navigation"], [role="main"], [aria-label="Search"] are semantic and stable.
Avoid positional selectors: div:nth-child(3) > span breaks when layout changes. Prefer class names or semantic attributes.
Text content matching: For buttons and links, matching by visible text is often more stable than class-based selectors.
Wait for dynamic content: SPAs load content after the initial page load. Ask the AI to wait for elements to appear before extracting data.
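
The ordering above can be expressed as a rough classifier for screening the selectors you plan to put in a prompt. This is a sketch under simple assumptions: selectorStability is a hypothetical helper, and the regular expressions only recognize the patterns named in this section.

```typescript
type Stability = "high" | "medium" | "low";

// Classifies a CSS selector by how likely it is to survive a redesign.
// The positional check runs first, so a selector that mixes a data attribute
// with :nth-child still counts as fragile.
function selectorStability(selector: string): Stability {
  if (/:nth-(last-)?(child|of-type)\(/.test(selector)) return "low";
  if (/\[(data-[\w-]+|role|aria-[\w-]+)[\]=~|^$*]/.test(selector)) return "high";
  return "medium"; // class names, ids, and tag names
}

const examples = [
  '[data-testid="price"]',
  '[role="navigation"]',
  ".product-card .price",
  "div:nth-child(3) > span",
].map((selector) => ({ selector, stability: selectorStability(selector) }));
```

Selectors rated "low" are the ones worth replacing with a data attribute, an ARIA role, or visible text before you hand them to the assistant.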

PDF Generation

Puppeteer excels at converting web pages to high-quality PDFs. The AI can generate PDFs with custom page sizes, margins, headers, and footers. Common use cases include archiving web content, generating reports from dashboards, and creating printable invoices. When asking the AI to generate PDFs, be specific: "Generate a PDF of this page with A4 paper size and 1-inch margins" or "Create a landscape PDF of this dashboard with the navigation bar hidden." Generated PDFs typically range from 100 KB for text-heavy pages to 5 MB for image-heavy pages.
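
The two example requests above map onto a handful of options in Puppeteer's page.pdf call. The sketch below declares a local type that mirrors a subset of those options so it compiles without the puppeteer package. The CSS selector used to hide navigation is an assumption about the target page.

```typescript
// Local mirror of a subset of Puppeteer's PDF options for page.pdf.
interface PdfOptionsSubset {
  format?: "A4" | "Letter";
  landscape?: boolean;
  printBackground?: boolean;
  margin?: { top: string; right: string; bottom: string; left: string };
}

// Same margin on all four sides; accepts CSS-style units such as "1in".
function uniformMargin(size: string): { top: string; right: string; bottom: string; left: string } {
  return { top: size, right: size, bottom: size, left: size };
}

// "A4 paper size and 1-inch margins"
const a4WithInchMargins: PdfOptionsSubset = {
  format: "A4",
  margin: uniformMargin("1in"),
  printBackground: true,
};

// "Landscape PDF of this dashboard"
const landscapeDashboard: PdfOptionsSubset = {
  format: "A4",
  landscape: true,
  printBackground: true,
};

// CSS that could be injected before printing to hide navigation; the
// selector is a guess and depends on the page's markup.
const HIDE_NAV_CSS = "nav, [role='navigation'] { display: none !important; }";
```

Naming these options in the prompt gives the assistant less to guess about.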

Scheduled and Recurring Scraping

While the Puppeteer MCP server does not include a built-in scheduler, you can combine it with other tools for recurring scraping workflows. Use the Memory MCP server to store previous values and compare on each check. Ask the AI "Visit [URL] and check if the pricing has changed from what we stored last week." For large-scale or very frequent scraping, consider Firecrawl MCP instead, which handles rate limiting and proxy rotation at the infrastructure level.
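
The compare-on-each-check step is easy to make explicit. Below is a minimal sketch that assumes prices are stored as a plain name-to-number map and that scraped text uses US-style formatting such as "$1,299.00". The names parsePrice and diffSnapshots and the plan names are hypothetical.

```typescript
interface PriceSnapshot {
  [plan: string]: number;
}

interface PriceChange {
  plan: string;
  before: number | null; // null when the plan is new
  after: number | null;  // null when the plan disappeared
}

// Strips currency symbols and thousands separators, e.g. "$1,299.00" becomes 1299.
function parsePrice(text: string): number {
  const value = parseFloat(text.replace(/[^0-9.]/g, ""));
  if (isNaN(value)) throw new Error(`No numeric price in "${text}"`);
  return value;
}

// Lists every plan whose price changed, appeared, or disappeared.
function diffSnapshots(previous: PriceSnapshot, current: PriceSnapshot): PriceChange[] {
  const changes: PriceChange[] = [];
  for (const plan of Object.keys(previous)) {
    const after = plan in current ? current[plan] : null;
    if (after !== previous[plan]) changes.push({ plan, before: previous[plan], after });
  }
  for (const plan of Object.keys(current)) {
    if (!(plan in previous)) changes.push({ plan, before: null, after: current[plan] });
  }
  return changes;
}

const lastWeek: PriceSnapshot = { Starter: 19, Pro: 49 };
const today: PriceSnapshot = {
  Starter: parsePrice("$19.00"),
  Pro: parsePrice("$59.00"),
  Team: parsePrice("$1,299.00"),
};
const changes = diffSnapshots(lastWeek, today);
```

Storing the parsed numbers rather than the raw text avoids false alarms when a site only changes its formatting.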

Comparison: Puppeteer vs Brave Search vs Firecrawl

There are several MCP servers for web-related tasks. Here's how they compare:

Feature	Puppeteer MCP	Brave Search MCP	Firecrawl MCP
Type	Browser automation	Search engine API	Cloud scraping API
JavaScript rendering	Yes (full Chrome)	No	Yes (cloud)
Screenshots	Yes	No	Yes
Form interaction	Yes	No	No
Runs locally	Yes	API calls	Cloud only
Cost	Free (local resources)	API key (free tier available)	Paid subscription
Best for	Full page interaction	Web search queries	Large-scale scraping

When to use Puppeteer: When you need to interact with pages (click, type, scroll), take screenshots, handle JavaScript-heavy sites, or work without an API key. It's the most versatile option but uses the most local resources.

When to use Brave Search: When you need to search the web and get text results. It's fast, lightweight, and doesn't require a browser. Great for research and fact-checking.

When to use Firecrawl: When you need to scrape many pages at scale with built-in anti-detection. It runs in the cloud so your local machine isn't impacted. Best for bulk data extraction.

You can also use Exa Search for semantic search capabilities that complement Puppeteer's scraping. For database integration with scraped data, see our database MCP servers guide.

Performance Considerations

Puppeteer launches a full Chrome browser, which means significant resource usage:

Memory: Each Chrome instance uses 100-300 MB of RAM. If you're running other MCP servers too, monitor total memory usage.
CPU: Page rendering is CPU-intensive. Complex pages with heavy JavaScript may spike CPU usage temporarily.
Disk: Chromium binary is ~170 MB. Screenshots and PDFs consume additional disk space.
Startup time: Chrome takes 2-5 seconds to launch, so the first request is slower than later ones. The browser then stays running between requests.

On systems with limited resources (8 GB RAM or less), consider closing the Puppeteer server when not actively using it, or use Brave Search for simple web queries that don't require a full browser.

Troubleshooting

Chrome fails to launch on Linux: Install the required system dependencies with apt install -y libnss3 libatk-bridge2.0-0 libdrm2 libxkbcommon0 libgbm1. If Chrome then fails with a sandbox error (common when it runs as root), add the --no-sandbox launch argument to your env config.
Timeout errors: Puppeteer operations can be slow, especially on first page load. See our timeout troubleshooting guide for configuration options.
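ENOENT errors: A spawn ENOENT error usually means the client could not find the npx executable on its PATH; it can also mean the Chromium binary did not download correctly. Confirm that npx runs from a terminal, then delete the node_modules cache and the _npx cache directory and retry. See spawn ENOENT fix.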
Pages not rendering correctly: Some sites require specific viewport sizes or user agents. Ask the AI to set a realistic viewport (1920x1080) and user agent before navigating to the target URL.
Docker issues: Running Puppeteer in Docker requires additional flags: --no-sandbox --disable-setuid-sandbox --disable-dev-shm-usage. Add these to PUPPETEER_LAUNCH_ARGS in your server env config.
Windows-specific issues: On Windows, Puppeteer may fail if the Chromium download path contains spaces. See our Windows MCP setup guide for path configuration.

Combining Puppeteer with Other MCP Servers

Puppeteer becomes even more powerful when combined with other MCP servers in your configuration:

Puppeteer + Memory: Scrape a page, extract key data, and store it in the knowledge graph for cross-session access. "Visit our competitor's pricing page and save the current prices to memory for future comparison."
Puppeteer + Brave Search: Search the web for relevant pages, then use Puppeteer to visit and scrape specific results. "Search for React component libraries, visit the top 3 results, and compare their feature lists."
Puppeteer + Filesystem: Take screenshots or generate PDFs and save them to your local project directory. "Screenshot our staging site at 3 viewport sizes and save the images to the test/screenshots folder."
Puppeteer + Database: Scrape structured data from web pages and insert it into your database. "Extract the product catalog from this page and insert each product into the products table."

For configuring multiple servers together, see our multiple server configuration guide. For memory considerations when running Puppeteer alongside other servers, see our how many servers guide.

Ethical Web Scraping Guidelines

When using Puppeteer MCP for web scraping, follow these ethical guidelines:

Respect robots.txt: Check the target site's robots.txt before scraping. Most sites document which paths and user agents are allowed or disallowed.
Honor rate limits: Do not send more than a few requests per minute to any single domain. Excessive requests can overload servers and may result in IP bans.
Check terms of service: Many websites explicitly prohibit automated scraping in their terms of service. Review these before scraping commercial sites.
Identify yourself: When possible, use a user agent string that includes your contact information or project name so site operators can reach you if needed.
Cache results: Avoid re-scraping the same page repeatedly. Cache results locally and only re-scrape when data freshness requires it.
Avoid scraping personal data: Be especially careful with pages that contain personal information. Comply with GDPR, CCPA, and other data protection regulations applicable to your jurisdiction.
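
The robots.txt check can be automated before any navigation. The sketch below is deliberately simplified and is not a complete implementation of the Robots Exclusion Protocol: it matches user agents exactly, treats rule paths as plain prefixes without wildcard support, and lets the longest matching rule win. The names rulesFor and isAllowed and the sample file are illustrative.

```typescript
interface RobotsRule {
  allow: boolean;
  path: string;
}

// Collects Allow/Disallow rules for `agent`, falling back to the "*" group
// when no group names the agent explicitly.
function rulesFor(robotsTxt: string, agent: string): RobotsRule[] {
  const agentLower = agent.toLowerCase();
  const specific: RobotsRule[] = [];
  const wildcard: RobotsRule[] = [];
  let groupAgents: string[] = [];
  let readingAgents = false;
  let matchedSpecific = false;
  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    const line = rawLine.replace(/#.*$/, "").trim(); // drop comments
    const colon = line.indexOf(":");
    if (colon < 0) continue;
    const field = line.slice(0, colon).trim().toLowerCase();
    const value = line.slice(colon + 1).trim();
    if (field === "user-agent") {
      if (!readingAgents) groupAgents = []; // a new group starts here
      groupAgents.push(value.toLowerCase());
      if (value.toLowerCase() === agentLower) matchedSpecific = true;
      readingAgents = true;
    } else if (field === "allow" || field === "disallow") {
      readingAgents = false;
      if (value === "") continue; // an empty Disallow permits everything
      const rule: RobotsRule = { allow: field === "allow", path: value };
      if (groupAgents.indexOf(agentLower) >= 0) specific.push(rule);
      else if (groupAgents.indexOf("*") >= 0) wildcard.push(rule);
    }
  }
  return matchedSpecific ? specific : wildcard;
}

// Longest matching prefix wins; on a tie, Allow wins. No match means allowed.
function isAllowed(robotsTxt: string, agent: string, path: string): boolean {
  let best: RobotsRule | undefined;
  for (const rule of rulesFor(robotsTxt, agent)) {
    if (path.indexOf(rule.path) !== 0) continue;
    if (!best || rule.path.length > best.path.length ||
        (rule.path.length === best.path.length && rule.allow)) {
      best = rule;
    }
  }
  return best ? best.allow : true;
}

// Made-up robots.txt for illustration.
const sampleRobots = [
  "User-agent: *",
  "Disallow: /admin/",
  "Allow: /admin/public/",
  "",
  "User-agent: examplebot",
  "Disallow: /",
].join("\n");

const canVisitProducts = isAllowed(sampleRobots, "my-scraper", "/products");
const canVisitAdmin = isAllowed(sampleRobots, "my-scraper", "/admin/settings");
```

For production use, a maintained robots.txt parser is a safer choice than this prefix-only version.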
