MCPBench
MCPBench is the open standard for compatibility testing of various Model Context Protocol (MCP https://github.com/modelcontextprotocol) client and server SDK implementations.
What is MCPBench?
MCPBench is an open standard for compatibility testing of Model Context Protocol (MCP, https://github.com/modelcontextprotocol) client and server SDK implementations. It is a benchmarking framework that evaluates MCP servers, not a server that AI assistants connect to.
This server falls under the Developer Tools category on MCPgee, the world's largest MCP server directory with 33,000+ servers.
How to Set Up and Use MCPBench
MCPBench is an open-source evaluation framework from ModelScope that tests the compatibility and task performance of MCP server and client SDK implementations. It runs standardized benchmarks across three categories — web search, database query, and general-purpose (GAIA) tasks — automatically discovering each server's tools and parameters without manual configuration. MCP server developers and SDK authors use MCPBench to validate that their implementations conform to the Model Context Protocol specification and to compare the real-world task completion rates of different servers before integrating them into production agents.
Prerequisites
- Python 3.11 or newer
- Conda (Miniconda or Anaconda) for environment management
- Node.js (for testing npm-based MCP servers)
- jq command-line tool (for processing evaluation results)
- API keys for any MCP servers under test (e.g., FIRECRAWL_API_KEY for the Firecrawl web search server)
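The prerequisites above can be checked before cloning anything. The following is a minimal sketch and not part of MCPBench: the helper names missing_tools and python_is_supported are hypothetical, and the tool list assumes the executables are called conda, node, npx, and jq on your system.

```python
import shutil
import sys

# Executable names are an assumption; npx normally ships with Node.js.
REQUIRED_TOOLS = ("conda", "node", "npx", "jq")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def python_is_supported(version_info=sys.version_info, minimum=(3, 11)):
    """True if the interpreter is at least the minimum (major, minor) version."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    if not python_is_supported():
        print("Python 3.11 or newer is required")
    for tool in missing_tools():
        print(f"missing from PATH: {tool}")
```

Run it inside the Conda environment you plan to use, since PATH can differ between environments.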
Clone the MCPBench repository
Clone the official MCPBench repository from ModelScope's GitHub organization.
git clone https://github.com/modelscope/MCPBench.git
cd MCPBench
Create and activate a Conda environment
MCPBench requires Python 3.11. Create a dedicated Conda environment to avoid dependency conflicts.
conda create -n mcpbench python=3.11 -y
conda activate mcpbench
Install Python dependencies
Install all required Python packages from the requirements file.
pip install -r requirements.txt
Create a benchmark configuration file
Create a JSON config file in the configs/ directory that specifies which MCP servers to test. You can mix remote SSE servers and local stdio servers. API keys for tested servers go in the run_config args.
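Before a run, a config with the structure described above can be sanity-checked. This is a sketch under stated assumptions, not MCPBench's own validation: the function name check_pool is hypothetical, and the rules it applies (an entry needs either url or run_config, each run_config item needs a command and a unique port) are inferred from the example configs in this guide.

```python
def check_pool(config):
    """Return a list of human-readable problems found in a benchmark config."""
    problems = []
    seen_ports = set()
    for entry in config.get("mcp_pool", []):
        name = entry.get("name", "<unnamed>")
        if "url" in entry:
            continue  # remote server: nothing to launch locally
        launches = entry.get("run_config") or []
        if not launches:
            problems.append(f"{name}: needs either 'url' or 'run_config'")
        for launch in launches:
            if not launch.get("command"):
                problems.append(f"{name}: run_config entry has no 'command'")
            port = launch.get("port")
            if port is None:
                problems.append(f"{name}: run_config entry has no 'port'")
            elif port in seen_ports:
                problems.append(f"{name}: port {port} is already used")
            else:
                seen_ports.add(port)
    return problems


# Sample data mirroring the shape used in this guide (placeholder values).
SAMPLE = {
    "mcp_pool": [
        {"name": "Remote MCP example",
         "url": "https://your-mcp-server.example.com/mcp"},
        {"name": "firecrawl",
         "run_config": [{"command": "npx -y firecrawl-mcp",
                         "args": "FIRECRAWL_API_KEY=your-key-here",
                         "port": 8005}]},
    ]
}

print(check_pool(SAMPLE) or "no problems found")
```

To check a file on disk, load it with json.load and pass the resulting dictionary to check_pool.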
File: configs/my_benchmark.json
{
  "mcp_pool": [
    {
      "name": "Remote MCP example",
      "url": "https://your-mcp-server.example.com/mcp"
    },
    {
      "name": "firecrawl",
      "run_config": [{
        "command": "npx -y firecrawl-mcp",
        "args": "FIRECRAWL_API_KEY=your-key-here",
        "port": 8005
      }]
    }
  ]
}
Launch local MCP servers and run the evaluation
For local stdio-based servers, first launch them as SSE endpoints. Then run the evaluation script for your benchmark category (websearch, db, or gaia).
# Step 1: launch local servers as SSE (if needed)
sh launch_mcps_as_sse.sh configs/my_benchmark.json
# Step 2: run the evaluation
sh evaluation_websearch.sh configs/my_benchmark.json
# or:
sh evaluation_db.sh configs/my_benchmark.json
# or:
sh evaluation_gaia.sh configs/my_benchmark.json
MCPBench Examples
Client configuration
MCPBench is a benchmarking framework, not a traditional MCP server, so there is no client configuration to add to an AI assistant. The config below instead defines a benchmark pool with two local search servers, each launched on its own port, for evaluation.
{
  "mcp_pool": [
    {
      "name": "brave-search",
      "run_config": [{
        "command": "npx -y @modelcontextprotocol/server-brave-search",
        "args": "BRAVE_API_KEY=your-brave-api-key",
        "port": 8001
      }]
    },
    {
      "name": "duckduckgo-search",
      "run_config": [{
        "command": "npx -y @modelcontextprotocol/server-duckduckgo",
        "args": "",
        "port": 8002
      }]
    }
  ]
}
Prompts to try
MCPBench is run from the command line; these are the key commands for running different benchmark suites.
- Run web search benchmark: sh evaluation_websearch.sh configs/my_benchmark.json
- Run database query benchmark: sh evaluation_db.sh configs/my_benchmark.json
- Run GAIA general benchmark: sh evaluation_gaia.sh configs/my_benchmark.json
- Launch local servers as SSE: sh launch_mcps_as_sse.sh configs/my_benchmark.json
Troubleshooting MCPBench
'jq: command not found' when running evaluation scripts
Install jq via your system package manager: 'brew install jq' on macOS, 'sudo apt install jq' on Ubuntu/Debian, or 'conda install -c conda-forge jq' in the Conda environment.
Local MCP server fails to start in launch_mcps_as_sse.sh
Verify Node.js is installed and the npx command is on PATH in the Conda environment. For servers requiring API keys, confirm the args field in run_config has the correct KEY=value format. Check that the specified port is not already in use.
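Two of the checks above, the KEY=value format and the port availability, can be automated. This is a minimal sketch rather than anything MCPBench ships: args_look_valid and port_is_free are hypothetical helpers, and the code assumes that args holds zero or more space-separated KEY=value pairs whose values contain no spaces.

```python
import re
import socket

# An environment-style assignment: identifier, '=', then a non-empty value.
ENV_ASSIGNMENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*=.+$")


def args_look_valid(args):
    """True if args is empty or every space-separated part is KEY=value."""
    return all(ENV_ASSIGNMENT.match(part) for part in args.split())


def port_is_free(port, host="127.0.0.1"):
    """True if this process can bind the port on the given local address."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


if __name__ == "__main__":
    print(args_look_valid("FIRECRAWL_API_KEY=your-key-here"))
    print(port_is_free(8005))
```

A port reported as free can still be taken by another process before the server starts, so treat the result as a hint.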
Evaluation script exits with 'No tools found for server'
MCPBench auto-discovers tools by asking the running server to list them, so if the server is not running or the URL is wrong, no tools are found. Confirm that the server started successfully and is reachable on its configured port before running the evaluation, for example with 'curl http://localhost:<port>/health' if the server exposes a health endpoint.
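Reachability can also be checked without depending on any particular HTTP path. The sketch below only tests whether something accepts TCP connections on the port; it does not confirm that the process is an MCP server or that it lists any tools. The helper name server_is_listening is hypothetical, and port 8005 is the example value from the config in this guide.

```python
import socket


def server_is_listening(host, port, timeout=2.0):
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    print(server_is_listening("127.0.0.1", 8005))
```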
Frequently Asked Questions about MCPBench
What is MCPBench?
MCPBench is an open standard and evaluation framework for compatibility testing of Model Context Protocol (MCP, https://github.com/modelcontextprotocol) client and server SDK implementations. Rather than connecting an AI assistant to external tools itself, it benchmarks the MCP servers that do.
How do I install MCPBench?
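Follow the installation instructions in the MCPBench GitHub repository: clone the repo, create and activate a Conda environment, and install the dependencies with pip install -r requirements.txt. Then create a benchmark config in configs/ and run the evaluation scripts.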
Which AI clients work with MCPBench?
MCPBench is run from the command line rather than installed into an AI client. It evaluates MCP servers, including the kind used with MCP-compatible clients such as Claude Desktop, Claude Code, Cursor, VS Code (GitHub Copilot), Windsurf, and Cline.
Is MCPBench free to use?
Yes, MCPBench is open source and available under the Apache-2.0 license. You can use it freely in both personal and commercial projects.
MCPBench Alternatives: Similar Developer Tools Servers
Looking for alternatives to MCPBench? Here are other popular developer tools servers you can use with Claude, Cursor, and VS Code.
Ecc
★ 188.2k
The agent harness performance optimization system. Skills, instincts, memory, security, and research-first development for Claude Code, Codex, Opencode, Cursor and beyond.
Javaguide
★ 155.8k
A guide to Java interviews and general backend interviews, covering computer science fundamentals, databases, distributed systems, high concurrency, system design, and AI application development.
Gemini CLI
★ 104.5k
A secure MCP server that wraps the Google Gemini CLI, allowing clients to query Gemini models using local OAuth sessions without requiring an API key. It provides tools for model interaction and diagnostics with built-in protection against command in
Awesome MCP Servers
★ 87.3k
⭐ Curated list of Model Context Protocol (MCP) servers - tools that extend Claude Desktop, Cursor, Windsurf, and other MCP clients with custom capabilities.
MCP Servers
★ 86.0k
Model Context Protocol Servers
CC Switch
★ 77.5k
A cross-platform desktop All-in-One assistant for Claude Code, Codex, OpenCode, OpenClaw, Gemini CLI & Hermes Agent. Only official website: ccswitch.io