Mineru Tianshu

v1.0.0 | Search & Data Extraction | stable

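Tianshu (天枢): an enterprise-grade, all-in-one AI data preprocessing platform | PDF/Office to Markdown | MCP integration for AI assistants | Vue3 + FastAPI full-stack solution | document parsing | multimodal information extraction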

deepseek-ocr, markitdown, mcp-server, mineru, paddleocr-vl
Stars: 647 | Downloads: 0 | Weekly: 0 | Rating: 0/5

What is Mineru Tianshu?

Mineru Tianshu is a Model Context Protocol (MCP) server that lets AI assistants such as Claude, Cursor, and VS Code hand documents to Tianshu (天枢), an enterprise-grade, all-in-one AI data preprocessing platform. The platform converts PDF and Office files to Markdown, parses documents, extracts multimodal information, and ships as a Vue3 + FastAPI full-stack solution.

This server falls under the Search & Data Extraction and Cloud Services categories on MCPgee, the world's largest MCP server directory with 33,000+ servers.

Features

  • Tianshu (天枢): enterprise-grade, all-in-one AI data preprocessing platform | PDF/Office to Markdown | MCP integration for AI assistants

Use Cases

Convert PDF and Office documents to clean Markdown format at scale. Extract multi-modal information and structured data from documents. Deploy as a full-stack solution for document processing.

Maintainer: magicyuan876
License: Apache-2.0
Language: Python
Version: v1.0.0
Updated: May 22, 2026
Status: healthy
Maintenance: active

Works with

Claude, OpenAI, Windows, macOS, Linux

Installation

Manual Installation

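npx mineru-tianshu

This command appears to be the directory's generic install snippet. The setup guide further down this page does not use it: Tianshu is deployed from the cloned repository (Docker Compose or a local Python install), and clients connect to it over SSE.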

Configuration

Configuration Details

Config file: claude_desktop_config.json

Performance

Response Metrics

Response Time: < 200 ms
Throughput: Medium

Resource Usage

Memory Usage: Low
CPU Usage: Low

How to Set Up and Use Mineru Tianshu

Mineru Tianshu (天枢) is an enterprise-grade AI data preprocessing platform built on MinerU, PaddleOCR-VL, and FastAPI. It converts PDFs, Office documents, images, audio, and video into structured Markdown and JSON suitable for AI ingestion. It supports GPU acceleration, parallel processing of large PDFs (automatically split above 500 pages), and OCR across 109+ languages. It also exposes an MCP endpoint, so AI assistants like Claude can submit documents for processing and retrieve clean, structured output.

Prerequisites

  • Docker 20.10+ and Docker Compose 2.0+ (recommended deployment method)
  • Node.js 18+ and Python 3.8+ (for local development deployment)
  • NVIDIA Container Toolkit and CUDA-compatible GPU (optional, for GPU-accelerated OCR)
  • An MCP client such as Claude Desktop that supports SSE transport

1. Clone the repository

Clone the mineru-tianshu repository to your server or local machine.

git clone https://github.com/magicyuan876/mineru-tianshu.git
cd mineru-tianshu

2. Deploy using Docker Compose (recommended)

Use the provided Makefile or deployment script for a one-command setup. This starts the frontend (port 80), backend API (port 8000), worker (port 8001), and MCP server (port 8002).

make setup
# Or on Linux/macOS:
./scripts/docker-setup.sh
# Or on Windows:
scripts\docker-setup.bat
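
Once the containers are up, a quick way to confirm that the four services are listening is a plain TCP connect to each port. The following is a minimal sketch, not part of the project: the port numbers come from the step above, while the host `localhost`, the name `check_ports`, and the service labels are assumptions made for illustration.

```python
import socket

# Ports as listed in the deployment step; the labels are illustrative only.
SERVICE_PORTS = {
    "frontend": 80,
    "backend API": 8000,
    "worker": 8001,
    "MCP server": 8002,
}


def check_ports(host, ports, timeout=2.0):
    """Return {label: bool}, True when a TCP connection to the port succeeds."""
    results = {}
    for label, port in ports.items():
        try:
            # create_connection raises OSError on refusal, timeout, or DNS failure.
            with socket.create_connection((host, port), timeout=timeout):
                results[label] = True
        except OSError:
            results[label] = False
    return results


if __name__ == "__main__":
    # Assumes the stack runs on the same machine as this script.
    for label, is_open in check_ports("localhost", SERVICE_PORTS).items():
        print(f"{label}: {'reachable' if is_open else 'not reachable'}")
```

An open port only shows that something is listening; it does not prove the service behind it is healthy.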

3. Alternatively, deploy locally without Docker

For local development, install backend dependencies and start all services. Use the --enable-mcp flag to activate the MCP endpoint.

cd backend
bash install.sh
python start_all.py --enable-mcp

4. Start the frontend

In a separate terminal, install frontend dependencies and start the Vue 3 development server.

cd frontend
npm install
npm run dev

5. Configure Claude Desktop to connect via SSE

Add the MCP server to your Claude Desktop configuration using SSE transport. The MCP server runs on port 8002 by default.

{
  "mcpServers": {
    "mineru-tianshu": {
      "url": "http://localhost:8002/sse",
      "transport": "sse"
    }
  }
}
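
If the config file already lists other MCP servers, pasting the snippet by hand risks overwriting them. The sketch below merges the entry into an existing file instead. It is an illustration under assumptions: the helper name `add_mcp_server` is hypothetical, and the location of `claude_desktop_config.json` varies by platform, so the path must be supplied by the caller.

```python
import json
from pathlib import Path

# Entry copied from the configuration shown above.
TIANSHU_ENTRY = {"url": "http://localhost:8002/sse", "transport": "sse"}


def add_mcp_server(config_path, name="mineru-tianshu", entry=TIANSHU_ENTRY):
    """Merge one server entry into the config file, keeping existing entries."""
    path = Path(config_path)
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    # A missing or empty file is treated as an empty configuration.
    config = json.loads(text) if text.strip() else {}
    config.setdefault("mcpServers", {})[name] = dict(entry)
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Usage (the path is a placeholder; point it at your own config file):
# add_mcp_server("claude_desktop_config.json")
```

Restart Claude Desktop after editing the file so that it reloads the server list.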

6. Submit documents for processing through Claude

With the MCP server connected, ask Claude to convert a document. The platform returns Markdown and JSON output; if RustFS is configured, images are uploaded to object storage.

Mineru Tianshu Examples

Client configuration

Claude Desktop configuration for Mineru Tianshu using SSE transport. The MCP server must be running at localhost:8002.

{
  "mcpServers": {
    "mineru-tianshu": {
      "url": "http://localhost:8002/sse",
      "transport": "sse"
    }
  }
}

Prompts to try

Example prompts for converting documents and extracting information through the MCP interface.

- "Convert this PDF to Markdown: /path/to/document.pdf"
- "Extract all tables from this Word document and return them as JSON"
- "Process this scanned PDF and identify all figures and their captions"
- "Convert the uploaded PowerPoint file to Markdown preserving the slide structure"
- "Transcribe the audio from this MP4 file and identify different speakers"

Troubleshooting Mineru Tianshu

Docker deployment fails with GPU-related errors

GPU support requires the NVIDIA Container Toolkit. Install it following the NVIDIA documentation, then run 'docker run --gpus all nvidia/cuda:12.6.0-base-ubuntu22.04 nvidia-smi' to verify. If you do not have a GPU, the platform falls back to CPU processing, which is slower but fully functional.

MCP server at port 8002 is not reachable from Claude Desktop

Confirm the MCP service is running by checking 'docker-compose logs mcp' or 'make logs'. If you are running Claude Desktop on a different machine, replace 'localhost' with the server's IP address. Ensure port 8002 is open in any firewall rules.
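
To separate a network problem from a client configuration problem, it can help to request the SSE URL directly and look at the response headers. The sketch below assumes the endpoint answers a plain GET with a `text/event-stream` content type, which is how SSE endpoints normally behave; the helper name `probe_sse` is hypothetical, and only the standard library is used.

```python
import urllib.error
import urllib.request


def probe_sse(url, timeout=5.0):
    """Return (ok, detail); ok is True when the server replies with an event stream."""
    request = urllib.request.Request(url, headers={"Accept": "text/event-stream"})
    try:
        # urlopen returns as soon as the headers arrive, so an open stream is fine.
        with urllib.request.urlopen(request, timeout=timeout) as response:
            content_type = response.headers.get("Content-Type", "")
            ok = content_type.startswith("text/event-stream")
            return ok, f"HTTP {response.status}, Content-Type: {content_type or 'missing'}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status.
        return False, f"HTTP {exc.code}"
    except (urllib.error.URLError, OSError) as exc:
        # Refused connection, timeout, DNS failure, and similar.
        return False, f"connection failed: {exc}"


if __name__ == "__main__":
    # Replace localhost with the server's IP address when probing a remote machine.
    print(probe_sse("http://localhost:8002/sse"))
```

A "connection failed" result points at the service or the firewall, while an HTTP response with a different content type suggests the URL path is wrong.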

Large PDF processing times out or fails

PDFs over 500 pages are automatically split into parallel sub-tasks. Adjust PDF_SPLIT_THRESHOLD_PAGES and PDF_SPLIT_CHUNK_SIZE in your .env file. For memory issues, adjust WORKER_MEMORY_LIMIT (default 16G) to match your available RAM.
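
To get a feel for how the two split settings interact, the sketch below computes the page ranges a document would be divided into. It is not the project's implementation, only an illustration of threshold-based chunking: the function name `plan_chunks` is hypothetical, the threshold of 500 comes from the text above, and the chunk size of 200 is a made-up value, since the default for PDF_SPLIT_CHUNK_SIZE is not stated here.

```python
def plan_chunks(total_pages, threshold=500, chunk_size=200):
    """Return inclusive, 1-based (first_page, last_page) ranges for sub-tasks.

    Documents at or below the threshold stay in one piece; larger ones are
    cut into consecutive ranges of at most chunk_size pages.
    """
    if total_pages <= 0 or chunk_size <= 0:
        raise ValueError("total_pages and chunk_size must be positive")
    if total_pages <= threshold:
        return [(1, total_pages)]
    return [
        (start, min(start + chunk_size - 1, total_pages))
        for start in range(1, total_pages + 1, chunk_size)
    ]


if __name__ == "__main__":
    # A 1,200-page PDF with 500-page chunks yields three sub-tasks.
    print(plan_chunks(1200, threshold=500, chunk_size=500))
```

Smaller chunks mean more parallel sub-tasks, each needing less memory, at the cost of more scheduling overhead.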

Frequently Asked Questions about Mineru Tianshu

What is Mineru Tianshu?

Mineru Tianshu is a Model Context Protocol (MCP) server for Tianshu (天枢), an enterprise-grade, all-in-one AI data preprocessing platform that converts PDF and Office files to Markdown, parses documents, and extracts multimodal information. It connects AI assistants to the platform's document processing through a standardized interface.

How do I install Mineru Tianshu?

Follow the installation instructions on the Mineru Tianshu GitHub repository. Clone the repo, install dependencies, and add the server config to your AI client.

Which AI clients work with Mineru Tianshu?

Mineru Tianshu works with MCP-compatible AI clients that support SSE transport. The directory lists Claude Desktop, Claude Code, Cursor, VS Code (GitHub Copilot), Windsurf, and Cline.

Is Mineru Tianshu free to use?

Yes, Mineru Tianshu is open source and available under the Apache-2.0 license. You can use it freely in both personal and commercial projects.


