BM25 Turbo
The fastest BM25 scoring engine: 2,300x faster than BM25S. 28K QPS on 8.8M docs. 5 BM25 variants (Robertson, Lucene, ATIRE, BM25L, BM25+). Memory-mapped persistence, BMW pruning, streaming indexing. Built-in HTTP server, MCP tool, HuggingFace Hub int
What is BM25 Turbo?
BM25 Turbo is a BM25 scoring engine with a built-in Model Context Protocol (MCP) server that lets AI assistants such as Claude, Cursor, and VS Code search a locally indexed corpus. Its listing claims it is 2,300x faster than BM25S and reaches 28K QPS on 8.8M docs. It supports five BM25 variants (Robertson, Lucene, ATIRE, BM25L, BM25+), memory-mapped persistence, BMW pruning, and streaming indexing.
This server falls under the Search & Data Extraction category on MCPgee, the world's largest MCP server directory with 33,000+ servers.
Installation
Manual Installation
npx bm25-turbo-rust-python-wasm-cli
How to Set Up and Use BM25 Turbo
BM25 Turbo is an ultra-high-performance text search engine built in Rust that supports five BM25 scoring variants (Robertson, Lucene, ATIRE, BM25L, and BM25+) and achieves 28,000 queries per second on corpora of 8.8 million documents. It is available as a CLI tool, a Python library, a WASM module, and a built-in MCP server, making it easy to add fast full-text search to AI workflows without external search infrastructure. Developers can index large document collections locally, expose the index as an MCP tool, and let AI agents issue structured search queries against it with sub-10ms median latency.
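To make the scoring side concrete, here is a minimal sketch of one textbook BM25 formulation, using the smoothed, non-negative IDF often associated with the Lucene variant. It is not BM25 Turbo's implementation or API. The function name `bm25_scores`, the parameter defaults `k1=1.2` and `b=0.75`, and the pre-tokenized input are assumptions made for illustration, and the five variants named above differ in exactly these IDF and term-frequency details.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document against a tokenized query.

    Illustrative textbook BM25, not BM25 Turbo's code. `docs` must be a
    non-empty list of token lists.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF: the "1 +" inside the log keeps it non-negative.
            idf = math.log(1.0 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length-normalized term-frequency saturation.
            norm = tf[term] + k1 * (1.0 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1.0) / norm
        scores.append(score)
    return scores
```

A document that does not contain any query term scores zero, and among documents of equal length, more occurrences of a query term yield a higher score. A production engine avoids this full scan by using an inverted index and pruning.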
Prerequisites
- Rust toolchain (cargo) for installing the CLI, or Python 3.8+ for the Python package, or Node.js for the WASM module
- A corpus of documents in JSONL format to index
- Sufficient RAM or disk for memory-mapped index files (index size depends on corpus)
- An MCP-compatible AI client such as Claude Desktop or Cursor
Install BM25 Turbo
Choose the installation method that matches your preferred runtime. The CLI via Cargo provides the full feature set including the MCP server.
# CLI (full features, recommended)
cargo install bm25-turbo-cli
# Python library
pip install bm25-turbo
# JavaScript / WASM
npm install bm25-turbo-wasm
Prepare your corpus
Format your documents as a JSONL file where each line is a JSON object containing at least one text field to index. The --field flag tells the indexer which key to use.
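If your documents live in memory or in another format, a short script can produce the JSONL file. The sketch below uses only the Python standard library. The helper name `write_jsonl` and the choice to skip records whose text field is empty are assumptions for illustration, not part of BM25 Turbo.

```python
import json


def write_jsonl(records, path, field="text"):
    """Write one JSON object per line, skipping records with no usable text.

    Returns the number of records written. `field` is the key that will
    later be passed to the indexer's --field flag.
    """
    written = 0
    with open(path, "w", encoding="utf-8") as f:
        for rec in records:
            if not str(rec.get(field, "")).strip():
                continue  # nothing to index in this record
            # ensure_ascii=False keeps non-ASCII text readable in the file.
            f.write(json.dumps(rec, ensure_ascii=False) + "\n")
            written += 1
    return written
```

Calling `write_jsonl(records, "corpus.jsonl")` with a list of dictionaries such as `{"id": "1", "text": "..."}` yields a file in the shape the indexer expects.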
# Example corpus.jsonl
{"id": "1", "text": "Introduction to information retrieval"}
{"id": "2", "text": "BM25 ranking function explained"}
Index your corpus
Build a binary index file from your JSONL corpus. This step only needs to be done once; the index can be reused across sessions.
bm25-turbo index --input corpus.jsonl --output index.bm25 --field text
Launch the MCP server
Start the built-in MCP server, pointing it at your pre-built index file.
bm25-turbo mcp --index index.bm25
Configure your MCP client
Add the server to your client configuration (claude_desktop_config.json for Claude Desktop, .cursor/mcp.json for Cursor) so the client can launch the MCP server automatically.
Verify with a search query
Test the index directly from the CLI before connecting your AI client.
bm25-turbo search --index index.bm25 --query "information retrieval" -k 10
BM25 Turbo Examples
Client configuration
Claude Desktop config that launches the BM25 Turbo MCP server with a pre-built index file.
{
"mcpServers": {
"bm25-turbo": {
"command": "bm25-turbo",
"args": [
"mcp",
"--index",
"/absolute/path/to/index.bm25"
]
}
}
}
Prompts to try
After connecting, ask the AI to search your indexed corpus using the bm25_search and bm25_index_stats tools.
- "Search the index for 'machine learning optimization' and return the top 5 results with their scores."
- "How many documents are in the index and what is the total vocabulary size?"
- "Search for 'neural network architecture' using k=20 and summarize the themes across the top results."
Troubleshooting BM25 Turbo
cargo install fails with linker errors
Building BM25 Turbo with cargo requires a system C toolchain and linker. On macOS run 'xcode-select --install'; on Ubuntu/Debian run 'sudo apt install build-essential'. Then retry 'cargo install bm25-turbo-cli'.
MCP server reports 'index file not found'
The --index path must be an absolute path to the .bm25 file produced by 'bm25-turbo index'. Run the index command first, then verify the file exists before starting the MCP server.
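One way to catch a bad path before the client ever launches the server is to resolve and check it when generating the config. The sketch below builds the same mcpServers entry shown in the client configuration example. The helper name `mcp_server_entry` is hypothetical, and the script only checks that the file exists, not that it is a valid index.

```python
import json
from pathlib import Path


def mcp_server_entry(index_path):
    """Return an mcpServers config dict with an absolute --index path.

    Raises FileNotFoundError if the index file does not exist, so the
    problem surfaces here rather than inside the MCP client.
    """
    path = Path(index_path).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"index file not found: {path}")
    return {
        "mcpServers": {
            "bm25-turbo": {
                "command": "bm25-turbo",
                "args": ["mcp", "--index", str(path)],
            }
        }
    }
```

Printing `json.dumps(mcp_server_entry("index.bm25"), indent=2)` gives a block that can be merged into the client's configuration file by hand.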
Search returns no results for queries that should match
Confirm the --field value used during indexing matches the JSON key in your corpus. If you indexed on 'text' but your documents use 'content', re-index with --field content. Also try a simpler single-word query to rule out stopword filtering.
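A quick way to confirm which key actually holds the text is to count, per key, how many records carry a non-empty string value. This is a standard-library sketch. The function name `field_coverage` is an assumption for illustration and is not a BM25 Turbo command.

```python
import json
from collections import Counter


def field_coverage(path):
    """Return (total_records, {key: records_with_nonempty_string_value})."""
    counts = Counter()
    total = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            total += 1
            record = json.loads(line)
            for key, value in record.items():
                if isinstance(value, str) and value.strip():
                    counts[key] += 1
    return total, dict(counts)
```

If the key you passed to --field is missing from the result, or its count is far below the total, the index was probably built on the wrong field.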
Frequently Asked Questions about BM25 Turbo
What is BM25 Turbo?
BM25 Turbo is a BM25 scoring engine that includes a Model Context Protocol (MCP) server. Its listing claims it is 2,300x faster than BM25S and reaches 28K QPS on 8.8M docs. It offers five BM25 variants (Robertson, Lucene, ATIRE, BM25L, BM25+), memory-mapped persistence, BMW pruning, streaming indexing, and a built-in HTTP server and MCP tool. Through the MCP server, it connects AI assistants to an indexed corpus via a standardized interface.
How do I install BM25 Turbo?
Install the CLI with 'cargo install bm25-turbo-cli' (or use the Python or WASM package), build an index from your corpus, and add the server config to your AI client. See the BM25 Turbo GitHub repository for full instructions.
Which AI clients work with BM25 Turbo?
BM25 Turbo works with all major MCP-compatible AI clients including Claude Desktop, Claude Code, Cursor, VS Code (GitHub Copilot), Windsurf, and Cline.
Is BM25 Turbo free to use?
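Yes, BM25 Turbo is open source and available under the AGPL-3.0 license. Personal and commercial use is permitted, but the AGPL's copyleft terms apply, including the requirement to offer source code to users who interact with a modified version over a network.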
BM25 Turbo Alternatives — Similar Search & Data Extraction Servers
Looking for alternatives to BM25 Turbo? Here are other popular search & data extraction servers you can use with Claude, Cursor, and VS Code.
TrendRadar
★ 58.0k A real-time hotspot monitoring and news aggregation assistant that provides AI-powered analysis of trending topics across multiple platforms via the Model Context Protocol. It enables users to track news and receive automated notifications through va
Scrapling
★ 52.7k 🕷️ An adaptive Web Scraping framework that handles everything from a single request to a full-scale crawl!
PDF Math Translate
★ 33.9k [EMNLP 2025 Demo] PDF scientific paper translation with preserved formats. AI-based full-text bilingual translation of PDF documents that fully preserves layout; supports services such as Google/DeepL/Ollama/OpenAI and provides CLI/GUI/MCP/Docker/Zotero
GPT Researcher
★ 27.2k An autonomous agent that conducts deep research on any data using any LLM providers
Agent Reach
★ 20.1k Give your AI agent eyes to see the entire internet. Read & search Twitter, Reddit, YouTube, GitHub, Bilibili, XiaoHongShu with one CLI and zero API fees.
Xiaohongshu
★ 13.7k MCP for xiaohongshu.com