LLM Search

Name: LLM Search MCP Server
Author: snexus

v1.0.0 • Search & Data Extraction • stable

Querying local documents, powered by LLM

Tags: chatbot, chroma, hyde, langchain-python, large-language-models

Stars: 657

What is LLM Search?

LLM Search is a Model Context Protocol (MCP) server that lets AI assistants and MCP clients such as Claude, Cursor, and VS Code query local documents using an LLM.

This server falls under the Search & Data Extraction category on MCPgee, the world's largest MCP server directory with 33,000+ servers.

Features

Querying local documents, powered by LLM

Use Cases

Query local documents with LLM

Chroma vector store support

Hybrid search with HyDE

Maintainer: snexus

License: MIT

Language: Jupyter Notebook

Version: v1.0.0

Updated: May 19, 2026

Status: healthy

Maintenance: active

Works with

Claude, OpenAI, Windows, macOS, Linux

Installation

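Manual Installation

LLM Search is distributed as a Python package, so it is installed with pip rather than npx. See the "Install pyLLMSearch" step under "How to Set Up and Use LLM Search".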

Configuration

Configuration Details

Config File: claude_desktop_config.json

Performance

Response Metrics

Response Time: < 200ms

Throughput: Medium

Resource Usage

Memory Usage: Low

CPU Usage: Low

How to Set Up and Use LLM Search

pyLLMSearch (llm-search) is a local RAG (Retrieval-Augmented Generation) system that exposes an MCP server, allowing AI assistants in Cursor, Windsurf, VS Code, or Claude to query your private document collections using natural language. It supports PDF, Markdown, and DOCX files. To retrieve relevant passages it combines hybrid dense/sparse search, HyDE (Hypothetical Document Embeddings), cross-encoder re-ranking, and optional multi-querying, and your documents are indexed on your own machine rather than uploaded to a cloud service. Developers and researchers use it to build a private knowledge base that their AI coding assistants can query directly.
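
To make "hybrid dense/sparse search" more concrete: one common way to merge a dense (embedding-based) ranking with a sparse (keyword-based) ranking is reciprocal rank fusion (RRF). The sketch below is a generic illustration of that technique, not pyLLMSearch's actual fusion logic, which this page does not describe. The function name, the chunk IDs, and the constant k=60 are assumptions made for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document receives 1 / (k + rank) from every list it appears in;
    documents that rank well in several lists end up on top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a dense retriever and a sparse retriever.
dense = ["chunk-a", "chunk-b", "chunk-c"]
sparse = ["chunk-b", "chunk-d", "chunk-a"]
print(reciprocal_rank_fusion([dense, sparse]))
```

A chunk that appears in both lists (here chunk-a and chunk-b) outranks one that appears in only one, which is the behavior hybrid search is meant to provide.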

Prerequisites

- Python 3.10 or higher
- pip or uv package manager
- An LLM API key (OpenAI API key, or a local model via Ollama/LiteLLM)
- A local folder of documents to index (PDF, Markdown, or DOCX files)
- An MCP-compatible client such as Cursor, Windsurf, or VS Code with Copilot

Install pyLLMSearch

Install the package from PyPI using pip. A virtual environment is recommended to avoid dependency conflicts.

python -m venv .venv
source .venv/bin/activate
pip install pyllmsearch

Create a documents configuration YAML

Create a YAML configuration file that tells pyLLMSearch where your documents are, which embedding model to use, and how to configure search. Use the sample templates from the repository as a starting point.

# docs_config.yaml
documents:
  - path: /path/to/your/documents
    extensions: [".pdf", ".md", ".docx"]

embeddings:
  model_name: sentence-transformers/all-MiniLM-L6-v2
  persist_dir: /path/to/embeddings-store

search:
  hyde_enabled: true
  reranking_enabled: true
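
Before indexing, it can help to check which files the extension filter would pick up. The following is a minimal sketch using only the Python standard library. `preview_documents` is a hypothetical helper, not part of pyLLMSearch, and it assumes the folder is scanned recursively and that extensions are matched case-insensitively.

```python
from collections import Counter
from pathlib import Path


def preview_documents(root, extensions=(".pdf", ".md", ".docx")):
    """Count files under `root` (recursively) whose suffix is in `extensions`."""
    wanted = {ext.lower() for ext in extensions}
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in wanted:
            counts[path.suffix.lower()] += 1
    return counts


# Example (hypothetical path):
# for suffix, n in preview_documents("/path/to/your/documents").items():
#     print(f"{suffix}: {n} file(s)")
```

An empty result usually means the path or the extension list in the configuration needs another look.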

Create an LLM model configuration YAML

Create a second YAML file that specifies which LLM to use for answering questions. You can use OpenAI, Azure OpenAI, or a local model via LiteLLM and Ollama.

# llm_config.yaml
model:
  type: openai
  model_name: gpt-4o
  api_key: sk-your-openai-api-key
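
Hard-coding an API key in a file makes it easy to commit the key by accident. One option is to generate the file from an environment variable. The sketch below mirrors the key layout of the example above; the helper name `render_llm_config` and the `OPENAI_API_KEY` variable name are assumptions for illustration, and the real configuration schema may differ.

```python
import os

# Same layout as the llm_config.yaml example; the schema is assumed, not verified.
TEMPLATE = """\
# llm_config.yaml (generated)
model:
  type: openai
  model_name: {model_name}
  api_key: {api_key}
"""


def render_llm_config(model_name="gpt-4o", env_var="OPENAI_API_KEY", environ=None):
    """Return the YAML text, reading the API key from the environment."""
    environ = os.environ if environ is None else environ
    api_key = environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set {env_var} before generating the config")
    return TEMPLATE.format(model_name=model_name, api_key=api_key)


# Example: write the file next to docs_config.yaml.
# with open("llm_config.yaml", "w", encoding="utf-8") as fh:
#     fh.write(render_llm_config())
```

With this approach the generated llm_config.yaml can be listed in .gitignore while the template stays under version control.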

Generate embeddings from your documents

Run the embedding generation step. This processes your documents and stores vector embeddings in the persist directory you configured. Re-run this step when you add new documents.

llmsearch index create --config docs_config.yaml

Start the MCP server

Launch pyLLMSearch as an SSE-based MCP server. Clients like Cursor, Windsurf, and VS Code can connect to it for RAG-powered document queries.

llmsearch app mcp --docs-config docs_config.yaml --llm-config llm_config.yaml

Configure your MCP client

Add the MCP server endpoint to your editor's MCP configuration. The server listens on localhost:8080 by default over SSE.

{
  "mcpServers": {
    "llm-search": {
      "url": "http://localhost:8080/sse"
    }
  }
}
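
If your client already has an MCP configuration file with other servers, the new entry has to be merged into it rather than overwriting it. A minimal sketch, assuming the file uses the top-level `mcpServers` key shown above; `add_mcp_server` is a hypothetical helper and the file path in the usage comment is only an example.

```python
import json
from pathlib import Path


def add_mcp_server(config_path, name, url):
    """Add or update one URL-based server entry, keeping existing entries."""
    path = Path(config_path)
    # Start from the existing config if there is one, else from an empty dict.
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    config.setdefault("mcpServers", {})[name] = {"url": url}
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Example (hypothetical location of the client's config file):
# add_mcp_server(".cursor/mcp.json", "llm-search", "http://localhost:8080/sse")
```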

LLM Search Examples

Client configuration

MCP client configuration for connecting to a running pyLLMSearch SSE server. The server must be started separately before the client connects.

{
  "mcpServers": {
    "llm-search": {
      "url": "http://localhost:8080/sse"
    }
  }
}

Prompts to try

Example prompts to use once pyLLMSearch is connected as an MCP server in your editor.

- "Search my documents for information about the authentication flow"
- "What does the onboarding guide say about setting up a new developer account?"
- "Find all mentions of the deprecation policy in my internal documentation"
- "Summarize the key points from my architecture decision records about the database choice"
- "What are the troubleshooting steps documented for connection timeout errors?"

Troubleshooting LLM Search

Embedding generation is slow or runs out of memory

Use a smaller embedding model such as sentence-transformers/all-MiniLM-L6-v2 instead of larger models. You can also reduce the chunk size in the documents configuration YAML. For large document sets, run the indexing step on a machine with more RAM or use the incremental update command to index new files only.
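
To see how chunk size affects the amount of work, the sketch below splits text into fixed-size character windows with overlap. It is a generic illustration only: pyLLMSearch's own splitter may work on tokens or document structure instead, and the function name and default values are assumptions.

```python
def chunk_text(text, chunk_size=1024, overlap=128):
    """Split `text` into windows of `chunk_size` characters that overlap by `overlap`."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]


sample = "x" * 3000
print(len(chunk_text(sample, chunk_size=1024, overlap=128)))  # fewer, larger chunks
print(len(chunk_text(sample, chunk_size=512, overlap=64)))    # more, smaller chunks
```

Smaller chunks mean each embedding call handles less text, at the cost of more chunks to embed and store.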

MCP server starts but the client cannot connect

Verify the server is listening by running `curl http://localhost:8080/sse`. Check that no firewall or port conflict is blocking port 8080. Ensure you are using the SSE endpoint URL (ending in /sse) and not the base URL in your client configuration.
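
Because an SSE endpoint keeps the connection open, a curl call may appear to hang even when everything works. A plain TCP check answers the narrower question of whether anything is listening on the port at all. A minimal sketch using the standard library; the host and port defaults follow the values given above, and `port_is_open` is a hypothetical helper name.

```python
import socket


def port_is_open(host="localhost", port=8080, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example:
# print("listening" if port_is_open() else "nothing is listening on localhost:8080")
```

If this returns False, the server is not running or is bound to a different port; if it returns True but the client still fails, the endpoint URL in the client configuration is the next thing to check.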

Search results are irrelevant or the LLM gives wrong answers

Enable HyDE (hyde_enabled: true) and re-ranking (reranking_enabled: true) in your docs_config.yaml. After any configuration change, re-index your documents by running `llmsearch index create` again. Also check that the LLM specified in llm_config.yaml is reachable: a hosted model needs network access and a valid API key, while a local model must be running on your machine.

Frequently Asked Questions about LLM Search

What is LLM Search?

LLM Search is a Model Context Protocol (MCP) server for querying local documents with an LLM. Like other MCP servers, it connects AI assistants to external tools and data sources through a standardized interface.

How do I install LLM Search?

Follow the installation instructions on the LLM Search GitHub repository. Clone the repo, install dependencies, and add the server config to your AI client.

Which AI clients work with LLM Search?

LLM Search works with all major MCP-compatible AI clients including Claude Desktop, Claude Code, Cursor, VS Code (GitHub Copilot), Windsurf, and Cline.

Is LLM Search free to use?

Yes, LLM Search is open source and available under the MIT license. You can use it freely in both personal and commercial projects.


LLM Search Alternatives — Similar Search & Data Extraction Servers

Looking for alternatives to LLM Search? Here are other popular search & data extraction servers you can use with Claude, Cursor, and VS Code.

TrendRadar

★ 58.0k

A real-time hotspot monitoring and news aggregation assistant that provides AI-powered analysis of trending topics across multiple platforms via the Model Context Protocol. It enables users to track news and receive automated notifications through va

Scrapling

★ 52.7k

🕷️ An adaptive Web Scraping framework that handles everything from a single request to a full-scale crawl!

PDF Math Translate

★ 33.9k

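[EMNLP 2025 Demo] PDF scientific paper translation with preserved formats. AI-based full-text bilingual translation of PDF documents with the layout fully preserved; supports services such as Google, DeepL, Ollama, and OpenAI, and provides CLI/GUI/MCP/Docker/Zotero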

GPT Researcher

★ 27.2k

An autonomous agent that conducts deep research on any data using any LLM providers

Agent Reach

★ 20.1k

Give your AI agent eyes to see the entire internet. Read & search Twitter, Reddit, YouTube, GitHub, Bilibili, XiaoHongShu — one CLI, zero API fees.

Xiaohongshu

★ 13.7k

MCP for xiaohongshu.com


Set Up LLM Search in Your Editor

Choose your AI client for step-by-step setup instructions.

- Claude Desktop: macOS & Windows app
- Claude Code: CLI & terminal
- Cursor: AI-first code editor
- VS Code: GitHub Copilot MCP
- Windsurf: Codeium AI editor
- Cline: VS Code extension

Quick Config Preview

{
  "mcpServers": {
    "llm-search": {
      "url": "http://localhost:8080/sse"
    }
  }
}

Add this to your client's MCP configuration file, such as .cursor/mcp.json, after the pyLLMSearch server has been started. The package is installed with pip, so an npx-based launch entry does not apply.
