How do I install PDF Extraction MCP Server?

Follow the setup instructions on the PDF Extraction GitHub repository, then add the server configuration to your AI client.

What category is PDF Extraction MCP Server?

PDF Extraction is categorized under Search & Data Extraction. Browse more servers in these categories on MCPgee.

PDF Extraction

Name: Mcp Pdf Extraction MCP Server
Author: xraywu

v1.0.0 • Search & Data Extraction • stable

MCP server to extract contents from a PDF file

mcp-pdf-extraction, mcp, ai-integration

What is PDF Extraction?

PDF Extraction is a Model Context Protocol (MCP) server that lets AI assistants such as Claude, Cursor, and VS Code extract the contents of a PDF file.

This server falls under the Search & Data Extraction category on MCPgee, the world's largest MCP server directory with 33,000+ servers.

Features

Extracts the contents of local PDF files through a single tool, extract-pdf-contents.

Use Cases

Extract content from PDF files for processing by Claude.

Automate document parsing in AI workflows.

Maintainer: xraywu

License: MIT

Language: Python

Version: v1.0.0

Updated: May 19, 2026

Status: healthy

Maintenance: active

Works with

Claude, OpenAI, Windows, macOS, Linux

Installation

Manual Installation

git clone https://github.com/xraywu/mcp-pdf-extraction-server.git
cd mcp-pdf-extraction-server
pip install -e .

Configuration

Configuration Details

Config File

claude_desktop_config.json

Performance

Response Metrics

Response Time: < 200ms

Throughput: Medium

Resource Usage

Memory Usage: Low

CPU Usage: Low

How to Set Up and Use PDF Extraction

The PDF Extraction MCP Server lets Claude and other AI clients read content from local PDF files, both text-based PDFs and scanned documents (via OCR). It exposes a single tool, extract-pdf-contents, which accepts a local file path and an optional page selector, so you can pull specific pages or an entire document into an AI conversation for summarization, analysis, or data extraction. The server is a Python package built with the MCP SDK; it depends on PyMuPDF, pytesseract, and PyPDF2 for text extraction and OCR.
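To make the tool's interface concrete, here is a minimal sketch of the JSON-RPC message an MCP client would send to call extract-pdf-contents. The tools/call method comes from the MCP specification. The argument names pdf_path and pages and the helper build_extract_request are assumptions for illustration; check them against the tool schema the server actually reports.

```python
import json
from typing import Optional


def build_extract_request(pdf_path: str, pages: Optional[str] = None, request_id: int = 1) -> str:
    """Build a JSON-RPC 2.0 tools/call message for extract-pdf-contents.

    The argument names "pdf_path" and "pages" are assumptions, not confirmed
    by this page; verify them against the schema returned by tools/list.
    """
    arguments = {"pdf_path": pdf_path}
    if pages is not None:
        arguments["pages"] = pages  # optional page selector
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "extract-pdf-contents", "arguments": arguments},
    }
    return json.dumps(message)
```

In normal use the AI client builds this message for you; the sketch only shows what travels over the stdio transport.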

Prerequisites

Python 3.11 or higher
pip package manager
Tesseract OCR installed on the system for scanned PDF support (brew install tesseract on macOS)
Claude Desktop or another MCP client that supports stdio transport

Clone the repository

Clone the mcp-pdf-extraction-server repository to your machine.

git clone https://github.com/xraywu/mcp-pdf-extraction-server.git
cd mcp-pdf-extraction-server

Create a virtual environment and install

Set up an isolated Python environment and install the package in editable mode.

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -e .

Find the installed command path

Locate the pdf-extraction binary that was installed. You will need the full path for the MCP client configuration.

which pdf-extraction
# Example output: /opt/homebrew/Caskroom/miniconda/base/bin/pdf-extraction
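If you prefer to look the path up from Python (for example on Windows, where which is not available), the same lookup can be sketched with the standard library. The helper name find_command is hypothetical; run it with the virtual environment activated so the right PATH is searched.

```python
import os
import shutil
from typing import Optional


def find_command(name: str = "pdf-extraction") -> Optional[str]:
    """Return the absolute path of an executable found on PATH, or None."""
    found = shutil.which(name)
    return os.path.abspath(found) if found else None
```

Calling print(find_command()) prints the path to paste into the client configuration, or None if the command is not on PATH in the current environment.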

Add to Claude Code CLI (optional)

If using Claude Code CLI, register the server with the full path found above.

claude mcp add pdf-extraction /full/path/to/pdf-extraction
claude mcp list

Configure Claude Desktop

Add the pdf-extraction server to claude_desktop_config.json using the full absolute path to the installed command.

Restart Claude Desktop and verify

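Restart Claude Desktop and open a new session, then check that the pdf-extraction server and its extract-pdf-contents tool appear in the client's tool list. In Claude Code, type /mcp to confirm the pdf-extraction server shows as connected.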

PDF Extraction Examples

Client configuration

Claude Desktop configuration for the PDF Extraction server. Replace the path with the actual output of 'which pdf-extraction' on your system.

{
  "mcpServers": {
    "pdf-extraction": {
      "command": "/opt/homebrew/Caskroom/miniconda/base/bin/pdf-extraction"
    }
  }
}
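Editing the JSON by hand is fine, but if the file already lists other servers it is easy to overwrite them. The following is a minimal sketch of a merge that keeps existing entries. The helper names add_server and update_config_file are hypothetical, and the location of claude_desktop_config.json is passed in rather than assumed.

```python
import json
from pathlib import Path


def add_server(config: dict, name: str, command: str) -> dict:
    """Return a copy of config with one stdio server entry added or replaced."""
    updated = dict(config)
    servers = dict(updated.get("mcpServers", {}))  # keep existing servers
    servers[name] = {"command": command}
    updated["mcpServers"] = servers
    return updated


def update_config_file(path: Path, name: str, command: str) -> None:
    """Read the config file if it exists, merge the entry, and write it back."""
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    merged = add_server(config, name, command)
    path.write_text(json.dumps(merged, indent=2) + "\n", encoding="utf-8")
```

For example, update_config_file(Path("claude_desktop_config.json"), "pdf-extraction", "/full/path/to/pdf-extraction") adds the entry shown above without touching other keys.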

Prompts to try

Example prompts for extracting and working with PDF content through Claude.

- "Extract the content from the PDF at /Users/me/documents/report.pdf"
- "Read pages 1-3 from /home/user/contracts/agreement.pdf and summarize the key terms"
- "Extract the last page of /tmp/invoice.pdf using page selector -1"
- "Extract all text from /data/scanned-document.pdf (it's a scanned image PDF)"
- "Read /Users/me/research/paper.pdf and list the references section"
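The page selector used in these prompts can be illustrated with a small sketch. It assumes a selector made of comma-separated, 1-based page numbers in which negative numbers count from the end (so -1 is the last page). That format and the helper resolve_pages are assumptions for illustration, not the server's confirmed behavior.

```python
from typing import List, Optional


def resolve_pages(selector: Optional[str], page_count: int) -> List[int]:
    """Turn a selector such as "1,3,-1" into zero-based page indices.

    Assumed format: comma-separated 1-based page numbers; negative numbers
    count back from the last page. An empty selector means every page.
    """
    if selector is None or not selector.strip():
        return list(range(page_count))
    indices = []
    for token in selector.split(","):
        number = int(token.strip())
        if number == 0 or abs(number) > page_count:
            raise ValueError(f"page {number} is out of range for {page_count} pages")
        # Positive numbers are 1-based; negative numbers index from the end.
        indices.append(number - 1 if number > 0 else page_count + number)
    return indices
```

Under these assumptions, a request for the first three pages and the last page of a ten-page file would be written as "1,2,3,-1".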

Troubleshooting PDF Extraction

Server not connecting after being added to Claude Desktop

Make sure you started a completely new Claude session. The path must point to the binary in the same Python environment where you ran pip install. Test the path directly in a terminal: running the binary should hang waiting for input (that is correct behavior for stdio MCP servers).
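Rather than watching the binary hang, you can send it one message and see whether it answers. The sketch below writes an MCP initialize request to the process over stdin and parses the first line of the reply. It assumes the server answers before it exits when stdin is closed; the helper names and the clientInfo values are made up for illustration.

```python
import json
import subprocess
from typing import List


def build_initialize_request(request_id: int = 1) -> str:
    """Return a newline-terminated JSON-RPC initialize request."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": "2024-11-05",
            "capabilities": {},
            "clientInfo": {"name": "probe", "version": "0.0.1"},  # illustrative values
        },
    }
    return json.dumps(message) + "\n"


def probe(command: List[str], timeout: float = 10.0) -> dict:
    """Start the server, send initialize, and return its first JSON reply.

    Raises subprocess.TimeoutExpired if the process does not exit in time.
    """
    completed = subprocess.run(
        command,
        input=build_initialize_request(),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    lines = [line for line in completed.stdout.splitlines() if line.strip()]
    if not lines:
        raise RuntimeError("no response on stdout; stderr was: " + completed.stderr.strip())
    return json.loads(lines[0])
```

A call such as probe(["/full/path/to/pdf-extraction"]) that returns a dictionary with a "result" key suggests the binary starts and speaks MCP; an import error in the stderr text points to an environment problem instead.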

OCR fails or returns empty text for scanned PDFs

Install Tesseract OCR on your system: brew install tesseract on macOS, or sudo apt-get install tesseract-ocr on Linux. Verify with: tesseract --version. The pytesseract Python package must also be installed (included in requirements.txt).
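The same check can be scripted, which is useful when the server runs under a different PATH than your interactive shell. This is a small sketch using only the standard library; the helper name tool_version is hypothetical.

```python
import shutil
import subprocess
from typing import Optional


def tool_version(binary: str = "tesseract") -> Optional[str]:
    """Return the first line printed by `<binary> --version`, or None if not on PATH."""
    path = shutil.which(binary)
    if path is None:
        return None
    result = subprocess.run([path, "--version"], capture_output=True, text=True, timeout=10)
    # Some builds print the version banner to stderr instead of stdout.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else None
```

If tool_version() returns None when run with the server's interpreter and environment, Tesseract is not visible to the server even if it works in your terminal.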

ModuleNotFoundError when the server starts

The binary must run in the same Python environment where you installed the package. If you used a venv, the binary inside the venv (venv/bin/pdf-extraction) uses that environment automatically. Alternatively, use the Python module form, with -- separating the claude options from the server command: claude mcp add pdf-extraction -- /path/to/python -m pdf_extraction.
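To see which import is failing, you can ask a given interpreter which modules it can locate. The sketch below relies on importlib from the standard library. The module names in the usage note are assumptions: pdf_extraction is taken from the command above, fitz is the import name traditionally used by PyMuPDF, and the helper missing_modules is hypothetical.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names this interpreter cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

Run it with the exact interpreter named in your client configuration, for example by calling missing_modules(["pdf_extraction", "fitz", "pytesseract", "PyPDF2"]). An empty list means that interpreter can locate every module; any name returned needs to be installed into that environment.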

Frequently Asked Questions about PDF Extraction

What is PDF Extraction?

PDF Extraction is a Model Context Protocol (MCP) server that extracts the contents of a PDF file. It connects AI assistants to external tools and data sources through a standardized interface.

How do I install PDF Extraction?

Follow the installation instructions on the PDF Extraction GitHub repository. Clone the repo, install dependencies, and add the server config to your AI client.

Which AI clients work with PDF Extraction?

PDF Extraction works with all major MCP-compatible AI clients including Claude Desktop, Claude Code, Cursor, VS Code (GitHub Copilot), Windsurf, and Cline.

Is PDF Extraction free to use?

Yes, PDF Extraction is open source and available under the MIT license. You can use it freely in both personal and commercial projects.

PDF Extraction Alternatives — Similar Search & Data Extraction Servers

Looking for alternatives to PDF Extraction? Here are other popular search & data extraction servers you can use with Claude, Cursor, and VS Code.

TrendRadar

★ 58.0k

A real-time hotspot monitoring and news aggregation assistant that provides AI-powered analysis of trending topics across multiple platforms via the Model Context Protocol. It enables users to track news and receive automated notifications through va

Scrapling

★ 52.7k

🕷️ An adaptive Web Scraping framework that handles everything from a single request to a full-scale crawl!

PDF Math Translate

★ 33.9k

[EMNLP 2025 Demo] PDF scientific paper translation with preserved formats - AI-based full-text bilingual translation of PDF documents that fully preserves the layout; supports services such as Google/DeepL/Ollama/OpenAI; provides CLI/GUI/MCP/Docker/Zotero

GPT Researcher

★ 27.2k

An autonomous agent that conducts deep research on any data using any LLM providers

Agent Reach

★ 20.1k

Give your AI agent eyes to see the entire internet. Read & search Twitter, Reddit, YouTube, GitHub, Bilibili, XiaoHongShu — one CLI, zero API fees.

Xiaohongshu

★ 13.7k

MCP for xiaohongshu.com

Browse More Search & Data Extraction MCP Servers

Explore all search & data extraction servers available in the MCPgee directory. Each server includes setup guides for Claude, Cursor, and VS Code.

Set Up PDF Extraction in Your Editor

Choose your AI client for step-by-step setup instructions.

🖥️

Claude Desktop

macOS & Windows app

⌨️

Claude Code

CLI & terminal

📝

Cursor

AI-first code editor

💻

VS Code

GitHub Copilot MCP

🏄

Windsurf

Codeium AI editor

🔌

Cline

VS Code extension

Quick Config Preview

{
  "mcpServers": {
    "pdf-extraction": {
      "command": "/full/path/to/pdf-extraction"
    }
  }
}

Add this to your claude_desktop_config.json or .cursor/mcp.json
